JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for a general audience.

Chapters 3-7 contain technical specifications.

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. RID consists of 25 alphanumeric characters that are separated by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RID. RIDs are also suitable to be communicated via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below the view-only wallet. One of the new wallet tiers, called "FindReceived", is intended for wallet scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

The Janus attack is a targeted attack that aims to determine if two addresses A, B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that the recipient owns both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serializes a 256-bit integer to a 32-byte little-endian output.

3.2 Hash function

The function Hb(k, x), with parameters b and k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)
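
As an illustration, the following Python sketch (in the style of Appendix A) shows one possible realization of Hb and SecretDerive using the hashlib module; the function name H and the byte-string argument handling are assumptions of this example, not part of the specification.

import hashlib

def H(b, k, x):
    # keyed Blake2b with a b-byte output and the "Monero" personalization
    # (hashlib zero-pads the personalization string internally)
    return hashlib.blake2b(x, digest_size=b, key=(k or b""), person=b"Monero").digest()

def SecretDerive(k, x):
    return H(32, k, x)

For example, the generate-address secret of section 4.4 would be computed as SecretDerive(kvb, b"jamtis_generate_address_secret").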

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2^252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

Point Derivation Serialized (hex)
B generator of 𝔾1 0900000000000000000000000000000000000000000000000000000000000000

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo ℓ is calculated as d^-1 = 8*(8*d)^-1 to preserve the aforementioned property.
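
A minimal Python sketch of KeyDerive1 and the cofactor-preserving inverse, assuming the H function from the sketch in section 3.2; the helper name x25519_invert and the use of Python's pow for the modular inverse are illustrative only.

ELL = 2**252 + 27742317777372353535851937790883648493  # subgroup order l

def KeyDerive1(k, x):
    d = bytearray(H(32, k, x))
    d[31] &= 0x7f   # clear the most significant bit
    d[0] &= 0xf8    # clear the least significant 3 bits
    return bytes(d) # little-endian integer that is a multiple of 8

def x25519_invert(d):
    # d^-1 = 8*(8*d)^-1 mod l; keeping the factor of 8 outside the modular
    # inverse keeps the result a multiple of the cofactor
    d_int = int.from_bytes(d, "little")
    return 8 * pow(8 * d_int, -1, ELL)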

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

Point Derivation Serialized (hex)
G generator of 𝔾2 5866666666666666666666666666666666666666666666666666666666666666
U Hp("seraphis U") 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730
X Hp("seraphis X") 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.
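
A corresponding sketch of KeyDerive2, assuming the 64-byte hash output is interpreted as a little-endian integer before reduction (the same convention as BytesToInt256):

def KeyDerive2(k, x):
    # wide reduction of a 64-byte hash modulo the subgroup order
    return int.from_bytes(H(64, k, x), "little") % ELL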

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification referes to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The ordering of the characters gives the alphabet a unique prefix ("xmrbase32") that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. The alphabet excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

Field Type Description
km private key wallet master key
kvb private key view-balance key
birthday timestamp date when the wallet was created

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

Field Derivation
km BytesToInt256(polyseed_key) mod ℓ
kvb kvb = KeyDerive2(km, "jamtis_view_balance_key")
birthday from Polyseed

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

Field Derivation
km setup ceremony
kvb setup ceremony
birthday setup ceremony

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = KeyDerive2(km, "jamtis_view_balance_key")
birthday entered manually

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = kv
birthday entered manually

4.4 Additional keys

There are additional keys derived from kvb:

Key Name Derivation Used to
dfr find-received key dfr = KeyDerive1(kvb, "jamtis_find_received_key") scan for received outputs
dua unlock-amounts key dua = KeyDerive1(kvb, "jamtis_unlock_amounts_key") decrypt output amounts
sga generate-address secret sga = SecretDerive(kvb, "jamtis_generate_address_secret") generate addresses
sct cipher-tag secret sct = SecretDerive(sga, "jamtis_cipher_tag_secret") encrypt address tags

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.
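
The derivations in the table above can be sketched as follows, reusing the helper functions from chapter 3 and assuming kvb is given as its 32-byte serialization (the wrapper function name is illustrative):

def derive_additional_keys(k_vb):
    d_fr = KeyDerive1(k_vb, b"jamtis_find_received_key")          # find-received key
    d_ua = KeyDerive1(k_vb, b"jamtis_unlock_amounts_key")         # unlock-amounts key
    s_ga = SecretDerive(k_vb, b"jamtis_generate_address_secret")  # generate-address secret
    s_ct = SecretDerive(s_ga, b"jamtis_cipher_tag_secret")        # cipher-tag secret
    return d_fr, d_ua, s_ga, s_ct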

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

[figure: key hierarchy]

4.6 Wallet access tiers

Tier Knowledge Off-chain capabilities On-chain capabilities
AddrGen sga generate public addresses none
FindReceived dfr recognize all public wallet addresses eliminate 99.6% of non-owned outputs (up to § 5.3.5), link outputs to addresses (except for change and self-spends)
ViewReceived dfr, dua, sga all view all received except for change and self-spends (up to § 5.3.14)
ViewAll kvb all view all
Master km all all

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.

4.6.3 Payment validator (ViewReceived)

This tier combines AddrGen and FindReceived and provides the ability to see all incoming payments to the wallet, but it cannot see any outgoing payments or change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet that can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

Key Name Value
Ks wallet spend key Ks = kvb X + km U
Dua unlock-amounts key Dua = dua B
Dfr find-received key Dfr = dfr Dua

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2^128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].
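
For example, a fresh index can be drawn without keeping any state (a random version-4 UUID contains 122 random bits):

import uuid

j = uuid.uuid4().bytes  # 16-byte (128-bit) address index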

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

Keys Name Derivation
kuj spend key extensions kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" || j)
kxj spend key extensions kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" || j)
kgj spend key extensions kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" || j)
daj address keys daj = KeyDerive1(sga, "jamtis_address_privkey" || j)

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')
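
A sketch of the address tag construction is shown below; twofish_encrypt_block is a placeholder for a single-block Twofish encryption (Twofish is not part of the Python standard library), H is the sketch from section 3.2, and j is the 16-byte index.

def make_address_tag(s_ct, j):
    j_enc = twofish_encrypt_block(s_ct, j)                 # j' = BlockEnc(s_ct, j)
    hint = H(2, s_ct, b"jamtis_address_tag_hint" + j_enc)  # 2-byte tag hint
    return j_enc + hint                                    # 18-byte tag t_j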

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.

6. Address encoding

6.1 Address structure

An address has the following overall structure:

Field Size (bits) Description
Header 30* human-readable address header (§ 6.2)
K1 256 address key 1
D2 255 address key 2
D3 255 address key 3
t 144 address tag
Checksum 40* (§ 6.3)

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

network char network type
"t" testnet
"s" stagenet
"m" mainnet

The software decoding an address shall abort if an invalid network character is encountered.
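
A sketch of the header checks from sections 6.2-6.2.2 (the function name and error handling are illustrative):

NETWORK_TYPES = {"t": "testnet", "s": "stagenet", "m": "mainnet"}

def parse_address_header(addr):
    if addr[0:4] != "xmra":
        raise ValueError("not a Jamtis address")
    if addr[4] != "1":
        raise ValueError("unsupported address version")
    if addr[5] not in NETWORK_TYPES:
        raise ValueError("invalid network character")
    return NETWORK_TYPES[addr[5]]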

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x^8 + 3x^7 + 11x^6 + 18x^5 + 5x^4 + 25x^3 + 21x^2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 10^12 of not being detected. Reference code for calculating the checksum is provided in Appendix A.

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t) encoded in 910 bits, and checksum is the 8-character checksum (already in base32). The total length of the encoded address is 196 characters (6 + 182 + 8).
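
The assembly of the final string can be sketched as follows, reusing CHARSET and jamtis_create_checksum from Appendix A. The data part is assumed here to be already split into 182 5-bit symbols; the exact bit order of that split is not fixed by this document.

def encode_address(header, data_symbols):
    # header: 6-character header string (already base32)
    # data_symbols: 182 values in 0..31 encoding the 910-bit address data
    payload = [CHARSET.find(c) for c in header] + data_symbols
    checksum = jamtis_create_checksum(payload)
    return header + "".join(CHARSET[s] for s in data_symbols + checksum)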

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))

UkoeHB commented Nov 30, 2022

Security takes precedence over speed.

You keep saying this is a matter of security. How is the current scheme insecure? Calling a cipher on a block of bytes is always secure if your cipher key has enough entropy. Ciphering doesn't work if you have a bad IV because you can get duplicate ciphered blocks between messages. In our case, the index is an IV, so the first cipher pass always produces a unique sequence of bytes, which means the second cipher pass (that is equivalent to ciphertext stealing, not something I invented) always produces a unique sequence of bytes.


tevador commented Dec 1, 2022

"Encrypt-then-MAC" is a standard construction that has a formal security proof.

The use of ciphertext stealing (which is a form of encryption) to provide a MAC in combination with custom padding is non-standard. In general, block ciphers cannot replace hashes. I guess it would have a higher chance of passing review if you could provide a formal security proof, but it's still problematic for conservative security because it's using a non-standard algorithm for performance reasons, while the performance gain is not really needed.


UkoeHB commented Dec 1, 2022

Looks like there is something called CBC-MAC. Although that construction is a bit different from what we are doing, it still sets a precedent for building a MAC with a cipher.

The only security requirements our scheme needs are: A) bits of the cipher-tag secret can't be leaked, B) all bits of the address tag are pseudo-randomly dependent on all bits of the address index and all bits of the cipher-tag secret (there is no need for an explicit dependency on constants like the MAC). (A) is trivially met by using an established block cipher and a 32-byte cipher-tag secret. (B) is trivially met by the current scheme's construction. Other requirements of a standard MAC don't apply because we are just using the MAC as a hint, so false positives are acceptable/expected unlike in a standard setting (i.e. a MAC-pass that produces an invalid address index). False negatives are acceptable as well because our system assumes any party can mangle data freely (although 'false negative' is kind of a misnomer when dealing with a mangled address or badly constructed enote).


tevador commented Dec 1, 2022

Your MAC is constructed by leaking 16 bits of the "plaintext", which is very different from CBC-MAC. This seems to be closer to "Encrypt-and-MAC", where the MAC is calculated from the plaintext, but in your case the MAC is the plaintext at the same time.

Even if the MAC construction was provably secure, your solution is not Pareto-optimal because you can get the same security and up to 10x higher speed by using AES instead of Twofish. Blake2 is the conservative choice, AES is the fast choice. Twofish is less conservative than Blake2 and less performant than AES. I would still choose Blake2.


UkoeHB commented Dec 1, 2022

Your MAC is constructed by leaking 16 bits of the "plaintext", which is very different from CBC-MAC.

I said it sets a precedent, that's all. We should stop calling it a MAC anyway, because it's just a 'hint' not a MAC. In fact, I will update the code today to change this so we can avoid future misunderstandings about the security properties of the address tag.

Even if the MAC construction was provably secure, your solution is not Pareto-optimal because you can get the same security and up to 10x higher speed by using AES instead of Twofish.

This seems like a disingenuous argument considering your own claim that twofish is more than fast enough. A fast C-only twofish implementation is superior to all the C-only AES implementations I tested, which means it will perform well regardless of hardware. Saying twofish is not 'Pareto-optimal' for our use-case is a textbook example of bikeshedding.


tevador commented Dec 1, 2022

I would be fine with this if blake2b wasn't 3.5x slower than twofish

Your argument against using Blake2b implies that performance is your primary concern. I'm simply pointing out that if performance was the primary goal, AES would be a better choice than Twofish. I still think Blake2 is fast enough for the use case. Can you point out a scenario when a Blake2-based MAC would be a bottleneck?

Also using Twofish adds more cryptographic code that will need to be reviewed. Is that not a valid concern? Both Blake2b and AES are already part of the Monero codebase.

blake2b ... 3.5x slower than twofish ... using a keyed hash; the unkeyed hash is 1.8x slower

I think your keyed hash benchmark is wrong. There should be no performance difference between keyed and unkeyed Blake2b unless the key changes for every hash. If the key doesn't change, you can reuse the hash state after the first compression function call. So the real performance difference would be 1.8x in this case.


UkoeHB commented Dec 1, 2022

Your argument against using Blake2b implies that performance is your primary concern.

It's not quite this simple. We need a cipher algorithm (in my opinion) for encrypting the address index. If we are already using a cipher algorithm, then it is not crazy to extend its use to cover the address tag hint as well as the address index. If it so happens that doing so is faster than pulling in a hash function for encrypting the hint, that's a win in my book.

Both Blake2b and AES are already part of the Monero codebase.

There is not a fast AES implementation in the Monero codebase. OAES sucks.

I think your keyed hash benchmark is wrong.

Possibly, but reusing the hash state doesn't seem to be helping more than ~7% for small hash data sizes. You can check the tests here and run the test cases with ./build/Linux/seraphis_lib/release/tests/performance_tests/performance_tests --filter=\*blake2b\* --stats


tevador commented Dec 1, 2022

It seems that blake2b_init_key doesn't actually call the compression function. You can force that by prepending a zero byte. This should result in a ~2x speed-up.

diff --git a/tests/performance_tests/blake2b.h b/tests/performance_tests/blake2b.h
index 3057dbc..bdc11ea 100644
--- a/tests/performance_tests/blake2b.h
+++ b/tests/performance_tests/blake2b.h
@@ -93,6 +93,10 @@ public:

       if (blake2b_init_key(&m_hash_state, hash_length, derivation_key.data, 32) < 0)
         return false;
+
+      char c = 0;
+      if (blake2b_update(&m_hash_state, &c, sizeof(c)) < 0)
+        return false;
     }
     else
     {

(Note that it is possible to achieve the same without prepending the zero. This is just to overcome the lazy invocation in the current implementation.)


UkoeHB commented Dec 1, 2022

It seems that blake2b_init_key doesn't actually call the compression function. You can force that by prepending a zero byte. This should result in a ~2x speed-up.

Ok that worked. In practice it would probably be fine to just paste the cipher secret into the transcript to avoid workarounds like this. Looks like the only practical difference is the key_length parameter gets set in keyed mode (the key bytes are consumed in a different order but that's less relevant).

I think you could avoid the lazy invocation by changing this S->buflen + inlen > BLAKE2B_BLOCKBYTES to S->buflen + inlen >= BLAKE2B_BLOCKBYTES in blake2b_update(). Not sure we want to be hacking on blake2b though.


tevador commented Dec 2, 2022

I think you could avoid the lazy invocation by changing this S->buflen + inlen > BLAKE2B_BLOCKBYTES to S->buflen + inlen >= BLAKE2B_BLOCKBYTES in blake2b_update(). Not sure we want to be hacking on blake2b though.

That would also require changes in blake2b_final to avoid calling the compression function twice when hashing a multiple of the block size. Using the key directly in the transcript should also be secure (Blake2 is not vulnerable to length-extension attacks).

In any case, since the performance difference between Blake2 and Twofish is relatively small, I think we should use Blake2 for the MAC.

  1. It's a standard construction, easier to reason about and more likely to be accepted by reviewers.
  2. The encryption would simplify to a single block in ECB mode, which is also provably secure.
  3. It's a similar construction as the view-tag. The address tag MAC is basically a "level 2 view tag".


UkoeHB commented Dec 5, 2022

In any case, since the performance difference between Blake2 and Twofish is relatively small, I think we should use Blake2 for the MAC.

Ok in the interest of resolving this conversation, I updated the library to use blake2b for the address tag hint. The cost in the current implementation seems to be a 2x increase in time to compute the hint (there may be minor perf improvements on the table). @tevador it would be helpful if you could review the updated implementation here.


j-berman commented Dec 5, 2022

Last comment on using less than 16 bytes for the address index...

Re-using an address is a privacy degradation, one that's even worse for light wallet users as it reveals received enotes to the light wallet server. Even though 12-14 bytes would offer a very comfortable margin for random address generation, it would be below the standard threshold for guaranteed unique random numbers. This is why I'm still most comfortable with an option that leaves 16 bytes for random address generation for all users (even for users who use accounts), though yes, one can argue I'm being irrationally paranoid given the way we expect users to use addresses. I don't have more to say on this front. I'm fine moving on from this line of reasoning, but figured it's worth expressing the view.


tevador commented Dec 5, 2022

@tevador it would be helpful if you could review the updated implementation here.

Thanks. Looks good to me. I find the code more elegant and easier to understand than the previous version.

@SChernykh

The base32 encoding uses the character set ybndrfg8ejkmcpqxot1uwis2a345h769.

What is this character set? Googling ybndrfg8ejkmcpqxot1uwis2a345h769 only brings up this very document.

I found http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt but it lists a different set ybndrfg8ejkmcpqxot1uwisza345h769 (z-base32). Either we go fully compatible with z-base32 and all code written for it, or just use something entirely different and better fit for this specific use case.


tevador commented Dec 12, 2022

It's z-base32 with the character z replaced by 2. We need all digits since the address version is explicitly included in the "human readable header". It still satisfies all of the design criteria of z-base32 as their only argument against 2 was:

'2' is potentially mistaken for 'z' (especially in handwriting).

Of course you can propose a different character set if you think it's more suitable.

@SChernykh

I already foresee many confused devs taking a glance at this set, plugging in their z-base32 code and wondering why it doesn't work. Maybe it's better to rearrange it back to sorted set abc...789? The only reason they reordered it in the original proposal was to have an "easy" character at the end, and in this case it's not even at the end because 3 wallet keys are in the middle.


tevador commented Dec 12, 2022

The only reason they reordered it in the original proposal was to have an "easy" character at the end, and in this case it's not even at the end because 3 wallet keys are in the middle.

It's true that the order is probably not important in our case.

Maybe it's better to rearrange it back to sorted set abc...789

This could also cause confusion as someone will see abcde... and will assume RFC 4648. It might be better to use a completely random permutation.

@SChernykh

Maybe use xmr...789 and the rest in alphabetic order :)


tevador commented Dec 12, 2022

We can try xmrbase32cdfghijknopqtuwy1456789

@SChernykh

That will do

@kayabaNerve

I'd personally prefer to use bech32m exactly. I see tevador wants a longer checksum, which I'm not here to dismiss entirely, but it's the easiest to integrate. Barring that, a standardized bech32 character set (bech32's or z-base32, which I'm only now hearing of) makes sense.

The HRP, as defined in bech32m, is free of the character set encoding. Accordingly, it's not an issue if it has chars not in the charset. I question if we can take the same approach here and accordingly use an already standardized base32 charset.


tevador commented Dec 12, 2022

I'd personally prefer to use bech32m exactly.

Bech32(m) uses a polynomial optimized for short addresses. The specs even prohibit addresses longer than 90 characters [1], so it's not useful for us. The bech32 charset order was also optimized specifically for their polynomial, which has some improved bit-error detection capability [2].

If we want to match or exceed the error detection capability of the 32-bit hash used in CryptoNote addresses, we need at least 7 checksum characters, which is an awkward size for a BCH code. So a degree-8 BCH code is the best option, in my opinion.

The HRP, as defined in bech32m, is free of the character set encoding

The HRP encoding was a necessity for bech32, so that an address can have a prefix with the name of your favorite dog-meme-based cryptocurrency, while still using the same character set as bitcoin. We don't need to support other cryptocurrencies, so having a charset tailored for us removes the complexity around having to encode the HRP in order to calculate the checksum.

In short, Monero is so distinct from Bitcoin and other cryptocurrencies that we need to design our own encoding and I think we have enough manpower to do so.

[1] https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#bech32
[2] https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#cite_note-4

@DangerousFreedom1984

I think the ideas in the 4th proposition 'Account index encoded in the output public key' are pretty nice and that we should go in this direction. I have some issues though with K1 depending on G now as you propose.

K1 = Ks + (i / kacct_g + kaddr_g) G + kaddr_x X + kaddr_u U

It means that for i != 0 there will be a term depending on G for the one-time address? If so it means that the enote-image creation will have to take into account this number when generating the blinding factor tk in Ko' = tk G + ka X + kb U. Also if you mean G to be an independent base from the ones we are using then the composition proofs wouldn't work I guess, since they are meant to be for exactly three bases and not four, right? So I would suggest letting K1 = Ks + xX where x contains the information of the addresses and accounts and we keep the structure as it is. Otherwise the linking tag would be impacted and the whole protocol would have to change I guess. Am I missing something and saying some bs?

I think a better proposition would be just letting x = kaddr^(kaddr*i) or something like that. I didn't think much about the function. What do you think?


hyc commented Dec 14, 2022

Sorry for jumping in late. I don't see that anyone has asked this yet - why not use base64 instead of base32? That would make the anonymous addresses about 30 characters shorter. Since the addresses are already so long we don't expect people to type them in manually or read them aloud, the reduced ambiguity of eliminating mixed-case doesn't seem to me to be a valid argument for base32. And base64 would still be a prime power, still viable for BCH code. The impact on QR codes is kind of a wash, alphanumeric QR at 5.5 bits per symbol vs binary base64 at 6 bits per symbol, but fewer symbols.

@SChernykh

QR will use 8 bits per symbol in case of base64.


hyc commented Dec 14, 2022

Right, ok. So for 181 symbols of base32 it will use 996 bits.
For 150 symbols of base64 it will use 1200 bits. Still easily manageable for a QR code.

Is there any reason we can't just use raw 8-bit bytes for the QR code?


hbs commented Dec 14, 2022

Should base64 be seriously considered, please stick to base64url otherwise the / and + will make it harder to generate URLs containing addresses.


tevador commented Dec 14, 2022

It means that for i != 0 there will be a term depending on G for the one-time address? If so it means that the enote-image creation will have to take into account this number when generating the blinding factor

There will be a G-component regardless of the value of i (check the formula carefully). This is done to provide perfect forward secrecy against quantum attackers. Explained in this comment.

why not use base64 instead of base32

With base64, we'd lose the ability to double-click-select an address, which is bad for UX. As for the length, there is not much qualitative difference between 164 and 196 characters, both are way too long to type but easy to copy-paste.

It would be possible to use something like base59, which would save roughly 28 characters, but that would be highly non-standard and require more complex checksum code (mod 59 math instead of simple XOR).

Is there any reason we can't just use raw 8-bit bytes for the QR code?

There are interoperability issues with binary QR codes. Most readers expect a string formatted as an URI. For example, Android apps can define custom handlers for specific URI schemes: https://developer.android.com/training/app-links/deep-linking

@DangerousFreedom1984

Oh, thanks. It is a bit confusing looking at the specs in the main post, the seraphis pdf and the comments at the same time. Maybe we could have a place where we find the updated notations then? Or could you update the specs?


tevador commented Dec 14, 2022

Yes, the specs here are somewhat outdated. I'm planning to update it.


hyc commented Dec 14, 2022

With base64, we'd lose the ability to double-click-select an address,

Why is that?

@SChernykh

Because of / and + characters, they act as delimiters when you double click on a text.



tevador commented Dec 14, 2022

Base64url also gets delimited on the - character.


hbs commented Dec 14, 2022

Unfortunately I don't think this is the case: base64url replaces / and + with - and _, but most devices will consider - as a separator and will not extend past it when double clicking text.


hyc commented Dec 15, 2022

Not to sidetrack this too much, but you can easily use _ and ~ for this with no problem.


tevador commented Dec 15, 2022

I tested a few web browsers and text editors and all of them end selection on ~.

I think most software uses the definition of the regex word character placeholder \w, which expands to [a-zA-Z_0-9], to determine the selection span. There are just 63 characters in that set, so it seems to be impossible to have a base64 alphabet that has the double-click select feature.

@SChernykh

I don't want base64 because of i1lI (these are 4 different characters), and probably a few other ambiguous sets of symbols.


tevador commented Dec 30, 2022

FYI, certified addresses have been removed from this specification. Instead, I added similar functionality to the new payment URI proposal.


UkoeHB commented Dec 31, 2022

Comments on latest updates:

  • 2.2: "If provided with a list of wallet addresses, this tier can also link [non-self-send] outputs to those addresses."
  • 2.2.1: Address tags mean the total speedup is ~2^24 for unowned enotes.
  • 3.2: I updated the implementation to include your 'personalization string'. The transcript prefix is coded as monero. I put domain-separation stuff in mostly lower-case for consistency, and only use upper case for external concepts like CLSAG. The KDF transcript builder is supposed to be as efficient as possible, so prefixes and domain separators don't have padding (that way all current uses of the KDF builder can fit in one blake2b block).
  • 3.2: For simplicity I mandate 32-byte keys for key derivation. There shouldn't be a case where we need 64-byte keys (even though blake2b permits it).
  • 3.2: For clarity I really recommend the notation H_x[k](data) for secret derivation. The comma is easily confused with || (I use them interchangeably), and it is less obvious when you are doing secret derivation vs a plain hash.
  • 3.3: I have been using xk for X25519 privkeys, xK for pubkeys, and xG for the generator. It is slightly verbose, but adds conceptual clarity and memorability that d, D, B do not have. 'Use whatever letter is unused' isn't always the best approach IMO - and note that B overloads the cryptonote spend key notation while D is confusingly closely related to {A, B} cryptonote address notation.
  • 3.3: I guarantee that I (and probably everyone else too) will continuously forget which is which between KeyDerive1() and KeyDerive2(). In code these are sp_derive_x25519_key() and sp_derive_key() respectively.
  • 3.3.1: You may want to clarify that x25519 scalars are not x*8 mod l, but instead are x mod l << 3. This way the mul8 automatically 'travels with' scalars.
  • 3.3.2: I decided to eliminate spaces from all domain separators. The U and X are now hashed from seraphis_U and seraphis_X. U = 10948b00d2de50b576998c11e83c59a79684d25c9f8a0dc6864570d797b9c16e, X = a4fb43ca695e12998802a20a158f12ea79474fb9012116956a69767c4d41110f
  • 4.2.1: KeyDerive1() should be KeyDerive2() here (good example how the names are confusing).
  • 4.3.1: Make a note that k_v is not migrated to k_vb because k_vb has more authority than k_v. In multisig it's migrated because otherwise you need a new setup ceremony (you do need a migration ceremony in order to get the new base spend key k_m U, but it doesn't require all group members to participate in case one of them became unavailable after initial account setup - you'd need all group members to make a new k_vb key).
  • 4.4: There are several typos in the key names in the Derivation column.
  • 4.6: The table here is inconsistent/incomplete compared to section 2.2 in your URI proposal (that proposal's table is correct/complete). Also, the master tier should include k_vb, since it isn't always derived from k_m.
  • 4.6.2: A remote scanner doesn't need to compute the nominal spend key, only check view tags and decipher address tags. The downstream client then checks the address tags and recomputes the sender-receiver secret if needed. Owned enotes will have some duplicate work done (just the DH exchange), but otherwise the work done and data transmitted by a remote scanner are minimized.
  • 4.7: FYI I have been calling K_s = k_vb X + k_m U the 'jamtis spend key', and k_m U the 'seraphis core spend key'.
  • 4.7: The Name column collides with section 4.4 naming. In general I try to use 'privkey' and 'pubkey' to disambiguate them.
  • 5.1.2: The hash string for encrypting address tags is "monero" || "jamtis_address_tag_hint" || k || cipher[k](j), since we don't want to use blake2b's keyed hash mode which adds an extra compression round, and we don't want to use the seraphis transcript builders which have some allocation overhead.
  • 5.5.2: Coinbase transactions don't have this 2-output optimization.
  • 6.3: I recommend citing your research here too.
  • 7: It may be a while before test vectors can be fully locked down. The call stack for addresses needs to be fully reviewed, including transcript and config stuff - e.g. I just added transcript prefixing based on your comments. Vectors will all break on a single character change in any of the transcripts.


tevador commented Dec 31, 2022

  • 3.2: I updated the implementation to include your 'personalization string'. The transcript prefix is coded as monero. I put domain-separation stuff in mostly lower-case for consistency, and only use upper case for external concepts like CLSAG. The KDF transcript builder is supposed to be as efficient as possible, so prefixes and domain separators don't have padding (that way all current uses of the KDF builder can fit in one blake2b block).

The "personalization string" is not a prefix in the transcript. It is a field in the Blake2b parameter block. Refer to section 2.8 of the Blake2 paper. It essentially domain separates the whole hash function. It might not be strictly needed, but it has zero cost.


UkoeHB commented Jan 1, 2023

I see, the personal parameter. I'd rather not bake a customization like this so deep into the hash function wrappers, better to keep it simple and just use a domain separator. Adding a customization means all downstream projects have to implement that customization correctly.


tevador commented Jan 1, 2023

I'd rather not bake a customization like this so deep into the hash function wrappers

It's supported by the C API of the blake2 library. It's not like we'd be changing the hash function internals.

better to keep it simple and just use a domain separator

Using the personalization is actually the simpler solution because you wouldn't need to have a separate implementation for view tags. It could also prevent future inconsistencies when someone could use the unkeyed hash and forget to add the "monero" prefix.

3.3: I have been using xk for X25519 privkeys, xK for pubkeys, and xG for the generator. It is slightly verbose, but adds conceptual clarity and memorability that d, D, B do not have.

That makes sense when writing code, but I've never seen a math notation that uses multi-letter variables. Someone could mistake xK for scalar multiplication of x and K. Both subscript and superscript indices are already in use in some places, so something like xk / xK can't be used (and are very hard to distinguish). Also the letter X is already used as one of the generators.

note that B overloads the cryptonote spend key notation while D is confusingly closely related to {A, B} cryptonote address notation.

Notations don't need to be globally unique, just unambiguous in the scope of this document (which doesn't use the CryptoNote address notation).

I'll try to implement your remaining comments in the next revision.


UkoeHB commented Jan 1, 2023

It's supported by the C API of the blake2 library. It's not like we'd be changing the hash function internals.

What I mean is you can no longer just call blake2b() to hash, you need a custom sequence of API calls.

Using the personalization is actually the simpler solution because you wouldn't need to have a separate implementation for view tags. It could also prevent future inconsistencies when someone could use the unkeyed hash and forget to add the "monero" prefix.

I like having it be more explicit, which makes it more visible. Also, I updated the transcripts so the prefix is a constructor parameter that just defaults to the config "monero". This way the parameter is injected and not mandated. You could inject it to the hash functions too, but that would be more messy I think.

I've never seen a math notation that uses multi-letter variables

I was thinking of a left superscript: xk, xK, xG

Notations don't need to be globally unique, just unambiguous in the scope of this document

Sure, but this is implemented in the Monero codebase, and cryptonote will always be with us.


UkoeHB commented Jan 10, 2023

I am working on seraphis knowledge/audit proofs with @DangerousFreedom1984 and ran into some issues with enote ownership proofs and address index proofs.

  • enote ownership proof: prove that an enote is owned by a specific user address (transitively, the owner of that address owns the enote)

Any proof method you come up with (A. subtract K_1 from Ko and make a composition proof on the remainder; B. expose the sender-receiver secret q and allow the verifier to recompute Ko from K_1 [only works for non-selfsends]) can be spoofed by the prover if they know the private keys of the real address, since the K_1 used in the proof can be freely defined. Spoofing means making a proof that an enote was sent to a particular address when the original sender sent it to a different address.

To get around that problem, I propose updating the sender extension to include the key K_1 that is being extended (e.g. k_{g, sender} = H_n("..g..", K_1, q, C)). Then you can expose q and K_1 and the verifier can recreate Ko and be confident that K_1 owns the enote. Note that this proof doesn't provide a way for you to prove an address doesn't own an enote; all it says is 'if you make a valid proof, then the K_1 in that proof is accurate'.

Another issue is you can't use that approach to make selfsend enote ownership proofs, because q is used without a secondary secret (the baked key) when constructing amount commitments and encoded amounts (meaning you can't make a selfsend enote ownership proof without exposing the amount). Moreover, any such proof would have to reveal that an enote is a selfsend type (no type-agnostic proof).

To solve that I propose updating selfsend enote construction so it mimics normal enotes more closely. The only changes needed are adding a selfsend baked key to amounts (baked_key_selfsend = H_32[k_vb](q); for consistency, update the normal one to baked_key_plain = H_32(xr xG) so that both baked keys will have the same serialization pattern [random 32 bytes]), and encrypting address tags the same way as normal enotes (instead of encrypting the raw index). Changing those things actually simplifies the protocol a little by isolating per-type customization to just the construction of secrets q and the baked key.

  • address index proof: prove that an address was generated from a particular index

There is currently no way to prove an address was constructed from a particular index without exposing s_ga. I propose changing the address extensions to H_n(K_s, j, H_32[s_ga](j)) where K_s = k_vb X + k_m U (and H_n_x25519(K_s, j, H_32[s_ga](j)) for the xK_2 and xK_3 modifiers). Then an address index proof for {K}_j will expose K_s, j, and secret H_32[s_ga](j). The user can then do another proof on K_s to show the private keys are known, or do a composition proof with the address {K}_j.

EDIT: These changes have been implemented.

@jeffro256

I have a concern with mixing the "find-received" tier (k_fr) and "generate-address" tier (s_ga & K_s). Having access to both these tiers allows more than the sum of these tiers, namely the ability to 100% recognize owned incoming enotes (basically the "payment validator" tier w/o knowing the amounts). If the shared secret used to encrypt the address tag is a function of s_sr2, then the nominal one-time address K'_o would only be calculable on the "payment validator" tier.

I suggest using the following method for creating the encrypted address tag in an enote: addr_tag_enc = addr_tag XOR H_ate(s_sr1 || s_sr2 || Ko). Under this scheme, the "generate-address" tier can still generate any public address with the same information, but can't decrypt the encrypted address tags.

There are two real-life issues that I can imagine this change fixing. Let's say that you wanted to create a social payment app, like Venmo, in which the backend both calculates and filters view tags to speed up scanning, as well as generates new receive addresses for people who want to send money to their users. Without changing the address tiers, this service would be able to identify all owned enotes of their users w/ ease. Another scenario in which this change would increase security is a merchant server system where the find-receive keys and generate-address keys are spread across user-facing servers for quick & responsive invoice generation. If a malicious actor gains access to both key tiers, then they can generate addresses and see all incoming transactions, whereas under the modified scheme they can only generate addresses and calculate view tags.


UkoeHB commented Aug 14, 2023

@jeffro256

  1. If decrypting the address tag requires s_sr2, that invalidates the performance benefit of k_fr scanning, because clients of a remote scanner now have to compute the baked key 1/(k^j_a ∗ k_ua) ∗ K_e.
  2. The baked key actually depends on the address index j, so what you describe is logically impossible.


jeffro256 commented Aug 19, 2023

Okay, I've looked deeper into the 3 main privacy issues I've had with Jamtis and have a proposal. Thanks to @UkoeHB for the guidance thus far! I modified the Jamtis section of Ukoe's "Implementing Seraphis" paper with the details and uploaded it to Ufile since it's a little more fleshed out than this doc. See below for a high-level view of the proposal.

Jamtis Change: Fix F-R Privacy Issues and New View Tag Tier

Pros

  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes to known public addresses.
  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes sent to a public address that is used more than once.
  • Third-parties can now compute view tags and generate public addresses on behalf of users without the ability to learn any additional balance recovery information.
  • There are now two tiers of view tag wallets that users can pick between depending on their desire of balance/privacy: dense (1 byte) and sparse (2 bytes).

Cons

  • Public address raw size is increased by 30 bytes (48 characters if encoded using base32) (Additional +32 bytes for new public key, -2 bytes to remove decipher hint). Transactions remain the same size.
  • Light wallet scanning is slower on the client side (each deciphering op is replaced with DH op)
  • Additional spec complexity
  • Some other things I'm not currently seeing

Description of changes

Account secrets

Instead of one find-receive key k_fr, there are now two keys: the dense view key k_dv and the sparse view key k_sv. Instead of a base pubkey K_fr, there is now the dense view pubkey K_dv = k_dv * K_ua and sparse view pubkey K_sv = k_sv * K_ua.

Public address

The new public address now contains 4 pubkeys instead of 3 and does away with the decipher hint. The four pubkeys are labeled Kj_s, Dj_ua, Dj_dv, and Dj_sv. Kj_s is the same as Kj_1 in the old address scheme, while Dj_ua, Dj_dv, and Dj_sv are equal to their respective base pubkeys multiplied by the address private key. The ciphered address index c^j stays the same and Dj_ua is the same as the old Kj_3. To summarize, the new address tuple is [Kj_s, Dj_ua, Dj_dv, Dj_sv, c^j].

DH exchanges

The ephemeral pubkey K_e is calculated K_e = r * Dj_ua by the sender like normal (remember that Dj_ua is functionally identical to the old Kj_3), but now there are two DH keys: the dense DH key Kdv_d = r * Dj_dv = k_dv * K_e and the sparse DH key Ksv_d = r * Dj_sv = k_sv * K_e.

View tags

There are now two view tags (dense_view_tag and sparse_view_tag) per enote, which are functions of their respective DH keys. In keeping with the old scheme, the dense view tag replaces the old view tag at 1 byte in size, and the sparse view tag replaces the decipher hint at 2 bytes in size. These view tags are completely independent of each other, so combining the checks multiplies the filtering power to a combined ratio of 1:16777216. A user can choose to reveal either k_dv or k_sv (but not both, explained later) to a light wallet server to pre-scan the enotes for them. Unlike the old k_fr, knowing only one of k_dv or k_sv does not allow a third-party to perform any step of the balance recovery process other than recomputing view tags, which is good for privacy (also explained later).

Sender-Receiver Secret s_sr1

The sender-receiver secret s_sr1 is now computed as s_sr1 = Hsr1n(Kdv_d || Ksv_d || K_e || input_context). Notice that both DH keys are needed to compute s_sr1. This point is crucial to the privacy properties of the new scheme.

Wallet Tiers

There are now two tiers for view tag computation: dense and sparse. They can do nothing else besides compute their respective view tags. The find-receive tier can identify all incoming enotes, but not view amounts. The payment-validator tier remains the same in terms of capabilities. There are 2 new "compound tiers", which combine the dense or sparse view tag tier with the generate-address tier and do exactly what you would expect, without additional privacy drawbacks.

How the New Changes Address Privacy Issues

The core of the first two privacy issues mentioned in the "pros" section stems from the fact that the ability to decrypt address tags was tied to the ability to perform view tag computation. Since address tags are 1) public and 2) constant for a given address, third-parties with knowledge of k_fr can make extremely strong guesses about users' ownership of enotes under loose conditions. This new scheme decouples those two things so that a third-party can compute view tags for a user but not learn any additional information about enotes. To decrypt "address tags" (it's now just the ciphered address index) under this new scheme, a third-party must know both k_dv and k_sv, since those are needed to compute s_sr1.

The third privacy issue (mixing the find-receive and generate-address tiers) is fixed for the same reason: a third-party must now know both k_dv and k_sv to compute s_sr1. However, there was a deeper issue here with the old scheme, since third-parties who knew k_fr, k_ga, and K_s (the combination of the find-receive and generate-address tiers) could decipher the address indices and recompute the onetime address Ko, proving to themselves that a user owns this enote with 100% certainty (assuming the user did not lose their keys). Since this issue is addressed, this opens up the possibility for Venmo-like applications where s_ga, k_dv, and K_s are given to a single third-party so that the third-party can reduce users' refresh times by ~99.6% using view tags and generate receive addresses for their users while they are offline, without compromising privacy.

How the New Changes Affect Scanning Speed

For regular full wallets where both k_dv and k_sv are known, the first view tag check can be against the 2-byte sparse view tag to initially filter out all but 1:65536 enotes with just one DH exchange, as compared to 1:256. After that, the dense view tag can be checked to further refine the enotes in a 1:256 ratio. For owned enotes, the balance recovery process is actually slower since 3 DH operations are needed instead of 2. As for 1-byte-view-tag light wallets (the old "find-receive" tier and the new "dense-view" tier), the server does the exact same amount of work (1 DH + view tag check), but the client will need to expend more CPU cycles, assuming that a DH exchange is more expensive than symmetrically deciphering the 16-byte address index.

Below, I have provided a quick and dirty comparison of the operations that must be done to scan enotes under different wallet types. I use the terms "Sparse-View Light Wallet" and "Dense-View Light Wallet" to refer to wallet schemes in which the sparse (2 byte) view tag key k_sv and dense (1 byte) view tag key k_dv, respectively, are provided to third parties to initially filter out enotes. The new "Dense-View" wallet tier is the most similar to the old "Find-Received" wallet tier in that they can both calculate 1-byte view tags on behalf of users.

Normal Enote Operation Density

| Amortized Period | Old Full/Light Wallet | New Full/Sparse-View Light Wallet | New Dense-View Light Wallet |
| --- | --- | --- | --- |
| 1:1 enotes | DH + view tag check* | DH + view tag check* | DH + view tag check* |
| 1:2^8 enotes | 16 byte decipher + decipher hint check | - | DH + view tag check |
| 1:2^16 enotes | - | DH + view tag check | - |
| 1:2^24 enotes | Ko recompute | 16 byte decipher + Ko recompute | 16 byte decipher + Ko recompute |

* = If applicable, a light wallet server would perform this operation on behalf of a user in the background. This is important when considering trade-offs: if you value client scanning time above all else, you can disregard the operations marked with an asterisk when evaluating light wallet schemes.

Total Notable Operations for an Owned Normal Enote

| Scheme | Total Notable Operations for an Owned Normal Enote |
| --- | --- |
| Old | DH + view tag check + 16 byte decipher + decipher hint check + Ko recompute |
| New | 2 * (DH + view tag check) + 16 byte decipher + Ko recompute |

There are obviously more operations in balance recovery than are mentioned here, but these are likely the most expensive. The main performance difference between full wallets is that every 256 enotes, the old scheme has to decrypt the address tag to decipher the address index j. The new scheme only does this once every 16777216 enotes, but must perform an extra Diffie-Hellman key exchange and view tag check once every 65536 enotes. According to @tevador, DH exchanges are ~100x more expensive than deciphering, so the scanning performance will likely remain more or less the same for full wallets.

On the other hand, the performance for light wallet clients is worse. The work for the server is exactly the same: 1 DH + view tag check, but the client must do 1 DH + view tag check instead of a 16 byte decipher for 1 in 256 enotes on-chain (every enote the client receives from the server). At any rate, in both the new "dense-view" light wallet and the old "find-receive" wallet tier, the client must download ~65536x more information (less if the user owns a large fraction of on-chain enotes) than is actually needed for balance recovery past the view tag/decipher hint checks, so the performance difference here is hard to quantify without real-world testing. It should be noted that the new "sparse-view" wallet tier follows the same recovery path as the full wallet, so it gets the performance benefit on both the server and the client of first being able to check against the sparse view tag, filtering out all but 1:65536 enotes for a user as compared to the normal 1:256. This means that a "sparse-view" light wallet client has to download 256x less information than current Jamtis light wallets, obviously at the privacy cost of narrowing down owned enotes probabilistically to 1:65536.

Additional Opinions on Why I Think the Trade-off is Worth it

Gathering from years of forum discussions and IRC/Matrix chats, one of the biggest UX complaints (arguably only beaten by the 10-block-lock) against Monero is the frustratingly long refresh times. This is such an issue that light wallet ecosystems evolved in the very early days of Monero to tackle this problem. The users of these light wallets were willing to completely sacrifice their incoming enote privacy (by revealing private view keys) just to bring refresh times down. There are innumerable posts online about potential users who left Monero completely because of refresh bugs and the corresponding wait times. This is why privacy-preserving light wallet servers are the future for most casual users, and will capture many on-the-fence people who want a better privacy/UX balance. Creating accessible, un-foot-gun-able digital cash is the core value proposition of Monero for me.

The new light wallet scheme under Jamtis is exciting and brings a lot of possibility. However, the privacy issues inherent to it would make it hard for me to recommend to anyone except the least privacy-minded people. There are simply too many ways to footgun, the main concerns being the passive address tag decryption issues, which mean you can't receive to the same address more than once or let your light wallet server know your public addresses.

Addressing the main downside of this change, the address size, I say: it's not that big of a deal to me. Jamtis addresses are already >3x the size of BTC addresses, so increasing the size by ~25% doesn't matter much. The new addresses would still be easily copy-and-paste-able and fit in a medium QR code. I don't know anyone who is typing addresses out by hand or reading them aloud, even with legacy CryptoNote addresses, which are >2x the length of a BTC address, so I don't believe that use case is affected. For those that have read this far, thank you for your time and consideration. ;)

@tevador
Copy link
Author

tevador commented Aug 19, 2023

Public address size is increased by 30 bytes

The actual address length would increase from 196 to 244 characters.

As you can see, unless deciphering is more than 256x faster than a DH operation

DH is about 100x slower. The performance impact of this change is likely negligible (slightly slower overall).

Nevertheless, the privacy benefits might still be worth it.

@One-horse-wagon
Copy link

Enhancing security by accommodating your new protocol in a 244-character address is a no-brainer to me. Address length would become an issue only if it limited what you can do, such as making QR codes unusable.

@jeffro256
Copy link

DH is about 100x slower.

I assume you're talking about X25519 and Twofish here, is that correct? If we move to a curve cycle to prepare for FCMPs, how fast can DH/variable-base-scalar-multiplication be made using your curve cycle? I would assume that it would be slower, so the full wallet scanning performance changes would likely wash out completely.

@tevador
Copy link
Author

tevador commented Aug 20, 2023

I assume you're talking about X25519 and Twofish here, is that correct?

Correct.

how fast can DH/variable-base-scalar-multiplication be made using your curve cycle?

X25519 has many optimized implementations that would be very hard to beat with a custom curve.

If we switch to the curve cycle, this only affects the "proof" keys (denoted with the letter K in this specification). I strongly recommend keeping X25519 for the key exchange keys (denoted with the letter D in this specification). This should be easy to do because Jamtis never needs any interop between the key exchange keys and the "proof" keys, so these can be completely unrelated elliptic curve groups.

@j-berman
Copy link

I lean yes on the idea to add an additional pub key to the address for the privacy gain to light wallet users, however, I'm not a hard yes and I think it's an acceptable decision to proceed without it. I'm going to steel man an opposing argument: even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis. As such, the addition offers a benefit that light wallet users using 3rd party servers shouldn't consider in their threat model, and therefore is not worth the added UX and complexity burden.

I lean yes (and do not agree with the steel man) because the address length would still fall within an "acceptable" size, and the proposal offers a tangible privacy benefit to light wallet users (and therefore benefits the anonymity set): the server cannot definitively identify a user's received enotes even if the user receives to the same address twice or if the server knows the user's address, which is a strict improvement to a light wallet user's privacy even if there are still potential statistical leaks under certain conditions.

I'll explain why I think a 3rd party server may still be able to trace transactions using statistical analysis under certain conditions.

Assume this proposal is accepted alongside full chain membership proofs. After some discussion with @kayabaNerve, here is what I understand the theoretically optimal privacy profile for light wallets could look like when constructing a tx:

  1. The user opens their light wallet client and requests their wallet's view-tag-matched enotes from the server.
  2. In order to construct a tx, a light wallet client fetches paths in the merkle tree to a set of enotes (1 real path + N decoy paths, so whoever is serving the paths does not know which enote the user is spending)1
    • Each single path would be on the order of kilobytes, thus the light wallet client would fetch a subset of paths similar to fetching decoys today (using a decoy selection algo).
    • The light wallet client should request these paths from a 3rd party daemon whose operator is ideally not colluding with the light wallet server. This way the user avoids revealing to the light wallet server that the user is trying to construct a tx.
      • The light wallet client could request paths to view-tag-matched enotes only, just in case the 3rd party daemon is colluding with the light wallet server.
    • The light wallet client should also request fees from and submit the final tx to 1 or more 3rd party daemons ideally not colluding with the light wallet server to avoid revealing the tx was constructed by the user to the server.
  3. Finally, the user's tx will include a view tag match on chain.

If you assume the 3rd party daemon is not colluding with the light wallet server, then the statistical footprint is: user opened their light wallet, shortly thereafter there's a view tag match on chain. This footprint's impact on a user's privacy depends entirely on tx volume. With low volume, the server is able to tell the user likely spent an enote in the tx since the view tag match is likely change. If the server collects these footprints for every tx the user constructs, with low volume, the server can perhaps start to build a user's plausible tx graph.

If you assume the 3rd party daemon is colluding with the light wallet server, which I think should be every user's default assumption (trusted 3rd parties are security holes), then the statistical footprint naturally can have a worse impact on a user's privacy. The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends. Example:

  • The user receives an enoteA in txA, then spends that enoteA in txB and has a change enoteB in txB. The light wallet server knows the user constructed txB and therefore knows the view tag matched enoteB in txB is likely the user's.
  • When spending enoteB in txC, the user requests a set of merkle paths where enoteB is 1 of N path requests.
  • The light wallet server knows the user constructed txC and can make an educated guess that enoteB was spent in txC.

The light wallet server has thus built up evidence the user received enoteB in txB and spent enoteB in txC.

This statistical leak should be considered unavoidable for the light wallet tier imo; this leak can only be mitigated in some capacity. Which is why I would hope that light wallets don't replace full wallets for privacy-conscious users unless they're running their own light wallet servers. I would still argue the single additional key "find-received" tier as currently spec'd is valuable and worth implementing because 1) amounts are unknown to the server (significant privacy benefit), and 2) it offers a tangible privacy benefit since the light wallet server cannot definitively identify all of a user's received enotes under all conditions. But I can understand the argument why two additional keys for the tier is excessive considering the above argument.

Reiterating: I'm still for the proposal to add an additional pub key to the address. I think the tangible privacy benefit the additional key brings to light wallet users is worth ~25% larger addresses and more complexity. But I don't hold a strong yes considering I think the argument against is a strong argument.


I haven't dug deeper into the sparse/dense view side of the proposal yet and will comment on that later.


1: requesting paths in the merkle tree would be unnecessary if the client downloads the entire merkle tree when scanning. However, this download could be on the order of GBs, which would then defeat the core benefit of a light wallet: instant wallet open.

@kayabaNerve
Copy link

kayabaNerve commented Aug 22, 2023

The Merkle tree leaves would be 32 bytes per output, or a few GB @ 100m outputs. If we have branches with no view-tag-matched outputs, they can be dropped for one 32 byte value. If the view tag hit rate is 1/256, I believe more than half of the branches will have at least one leaf. If the view tag hit rate is 1/65536, most won't.

(branch length is currently configured to 167)

If we have a 1:65536 hit rate, only ~1/400 branches will be hit? That means the 3.2 GB leaf set at 100m outputs becomes 10 MB? It seems much saner to just download the tree in this case.
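Rough back-of-the-envelope check of those numbers (assuming 100M outputs, 167 leaves per branch and a 1:65536 hit rate):

    branches ≈ 100,000,000 / 167 ≈ 600,000
    P(branch has ≥ 1 match) = 1 - (1 - 1/65536)^167 ≈ 167/65536 ≈ 1/392 (≈ 1/400)
    hit branches ≈ 600,000 / 392 ≈ 1,500
    downloaded leaves ≈ 1,500 * 167 * 32 bytes ≈ 8 MB

which is in the same ballpark as the ~10 MB figure above.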

@j-berman
Copy link

j-berman commented Aug 22, 2023

Tx volume is hovering around ~20k txs per day these days, which is a floor of ~40k outputs per day. Let's assume ~65k outputs per day, which is an expected ~1 view tag match per day at a 1:65,536 hit rate. At that rate, any view-tag-matched enotes the server identifies around the time a user opens their wallet would almost certainly be the user's enotes. Further, any clusters of enotes the user spends/receives in a single day would stick out like a sore thumb to the server.

Seems at that hit rate and today's volume, the privacy gain of view tags is close to nil.

@kayabaNerve
Copy link

kayabaNerve commented Aug 22, 2023 via email

@DangerousFreedom1984
Copy link

  • Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?
  • What is roughly the rate of people that use third-party servers to filter enotes for them?
  • If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the recovery speed of 99% of users so that 1% can benefit from more private recovery? (I'm unsure of the numbers, just a thought)
  • Giving away your sparse and dense priv keys is the same as giving away the priv find_receive key in the original Seraphis, right?
  • (Just a thought) Unlike Bloom filters in Bitcoin, which in reality don't really enhance privacy, I believe that these changes would enhance privacy here due to the different layers of privacy that Monero already has. But what would be nice to see would be less information being communicated to the wallets, though I can't see any improvements here (today I guess we have the public ephemeral key, view tag and onetime address, right?). It would be nice to somehow get less info to improve recovery speed and privacy. No idea how.
  • It would have been really hard to make these changes if Seraphis were already in use, as they are huge. I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it. But I also like the idea of increasing the address size for those who want that feature with more privacy. Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that? Maybe a good exercise to think about :p
  • I am willing to make the necessary changes in the knowledge proofs if these changes pass.
  • I'm still in the process of understanding and trying to answer these questions that I have, so I don't have an opinion yet, but the efforts are very much appreciated. Thank you!

@UkoeHB
Copy link

UkoeHB commented Aug 28, 2023

@jeffro256, here is my review of the proposed changes to the document. I will follow-up with an assessment of pros/cons in a later comment.

To summarize the proposal: Do two key derivations instead of one during the 'view tag filtering' piece of balance recovery. If one derivation is offloaded to a third party, then the second derivation gates access to the nominal address tag (and nominal address spend key).

  • deriving s_fr from k_ua

    • It would be better to derive s_fr from k_vb. That way k_dv and k_sv will have the same entropy as k_ua.
  • section 8.2.4 'Optimized Design'

    • Normal enotes: It should be 'three ECDH exchanges'. Also, adding an additional 32 bytes to s^sr_1 means you'll need two blake2b blocks instead of one (a block is 128 bytes, and iirc we only need one block for s^sr_1 currently), so it is technically four hash operations for normal enote secrets.
  • section 8.3.3

    • Formatting is messed up.
  • section 8.3.4 (needs proof-reading)

    • "we include a MAC-like hashes" -> "we include MAC-like hashes"
    • "and check it against" -> "and check them against"
    • "the ECDH exchange" -> "the ECDH exchanges"
    • start a quote with back ticks so they curl properly: ``
    • "ensuring the view tag derivation" -> "ensuring the view tag derivations"
    • Tentative rewrite: "We highlight the advantage of using two view tags, rather than one, in Section 8.5.1".
  • section 8.4.3

    • K_1 -> K_s
  • section 8.5.1

    • Revert section title changes.
    • "by checking view tags" -> "by checking its view tags"
    • "since it tends to be larger, and thus filters out more computation" -> This is introduced with no prior discussion about the recommended size of view tags (other than vaguely implied by the view tag names).
  • section 8.5.2

    • Self-send tau checks are no longer cheap, because there is no longer an address tag hint.
  • Comments

    • I am not entirely in agreement with rolling back the 'address tag' term. I think it is easier to handle than 'ciphered address index'.
    • Considering the self-send tau check issue, it would be better to just retain the address tag hint instead of adding in a separate view tag. (EDIT: the perf diff here is probably non-existent, so I retract this comment)

@jeffro256
Copy link

@DangerousFreedom1984

Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?

Honestly, this is really hard to say. I wanted to say that normal full wallet scanning was not going to be any slower than before, but @UkoeHB brought up an issue with the self-send tau checks (I haven't looked into it yet).

What is roughly the rate of people that use third-party servers to filter enotes for them?

I think the rate of people using light wallet servers now is very low because of the terrible privacy trade-offs (giving away your private view key). Fixing some of the privacy issues with light wallet servers and advertising those changes amongst the greater community would surely affect the usage rate.

If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the recovery speed of 99% of users so that 1% can benefit from more private recovery? (I'm unsure of the numbers, just a thought)

If it really was this low then I don't know if the trade-off would be worth it. I suspect it won't be this low, though. Just look at (e.g.) MyMonero downloads vs other apps.

But what would be nice to see would be less information being communicated to the wallets, though I can't see any improvements here (today I guess we have the public ephemeral key, view tag and onetime address, right?). It would be nice to somehow get less info to improve recovery speed and privacy. No idea how.

Unfortunately, unless some other technique is used to transmit chain data, the fewer enotes/txs the light wallet server associates with you, the smaller your anonymity set is, which means less bandwidth = less private. Idk how to solve that yet either.

I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it

It doesn't really affect any part of Seraphis proper, just the Jamtis addressing layer, and just normal enote balance recovery at that. So it does require basically a complete rewrite of balance recovery code, but shouldn't actually expand it too much hopefully (working on that right now).

Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that?

Two problems with making these changes an optional add-on are that 1) you partition yourself by revealing information about your wallet type to senders and 2) ecosystem developers now have to support 2 types of addresses, and you can see how well that normally ends up (e.g. current light wallets still don't support subaddresses). I (and I'm sure others) would prefer if there were just one type of address, but it certainly could be done.

I am willing to make the necessary changes in the knowledge proofs if these changes pass

Thank you, I really appreciate it ;)

@jeffro256
Copy link

@j-berman Thank you for your deep analysis and counter-arguments

To confront the initial steelman:

even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis.

Since Monero's conception, the network has never provided perfect privacy, only plausible deniability. There are over 9 years of on-chain data to perform statistical analysis upon, but the protocol has never been designed to allow deterministic de-anonymization. The implications of a light wallet server being able to more-or-less deterministically (~100%) conclude that a user owns an incoming payment, under conditions outside of the user's control (public address sharing, multiple receives), are massive, especially in the western legal domain. Downgrading these attacks to statistical ones, especially where the risks decrease with greater transaction flow, may save people from legal battles in the future.

I agree with almost everything else, although I don't think we should consider the following statement true in all cases:

The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends

This is a current design choice made for light wallet servers because of the convenience, but nothing about the CryptoNote protocol or the Seraphis/Jamtis protocol requires this to be true. It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address. In this case, the user doesn't have to assume that the 3rd party daemon isn't colluding with the light wallet server; they know it to be true, aside from a Sybil/Eclipse attack.

@j-berman
Copy link

j-berman commented Aug 28, 2023

I'd say my commentary is most relevant toward understanding why a 2 byte view tag would offer basically no privacy advantage at today's tx volume due to its statistical surface, even with Tor and with connecting to 3rd party daemons to submit txs: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024?permalink_comment_id=4668705#gistcomment-4668705

I generally agree the idea to add an additional pub key does provide a stronger level of privacy though, which is the primary reason why I'm a proponent of the idea. I agree that when compared to the Jamtis light wallet tier without the additional pub key, this proposal downgrades the statistical attack surface (and the surface could become virtually non-existent with extremely high tx volume).

Still, it's worth keeping in mind that the statistical analysis surface the light wallet tier brings is more significant than Monero's current full wallets.

A while back someone proposed that full wallets only download data necessary to determine which outputs belong to a user, and then once identified, request the transactions of those outputs along with "chaff" (decoy) transactions, in order to minimize data needed to download when scanning. There was pushback on this idea because of the widened statistical surface enabling a node to potentially pinpoint a user's txs: https://www.reddit.com/r/Monero/comments/5wc2th/a_proposal_to_speed_up_wallet_sync_around_5x/de940mj/

It's worth keeping in mind the light wallet tier introduces a similar surface.

It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address.

This is what I was getting at in explaining how the optimal privacy profile of a light wallet client would communicate with a 3rd party daemon ideally not colluding with the server. Even with Tor though, if a 3rd party daemon combines logs with a light wallet server, the logs would show e.g. Bob just opened his light wallet client, then 1 person just requested paths in a merkle tree (1 path included one of Bob's view tag matched enotes)/fees/submitted a tx to the network, and Bob has a view tag match in that tx.

Unless there exists significant cover volume where tons of people are trying to construct txs at a specific point in time, then it's fairly trivial to guess Bob's tx, his spent enote, and his change enote.

However, yes, it's still a "guess" which I agree is stronger privacy than the current Jamtis light wallet tier's "100% certainty in some cases" and would improve with higher tx volume.

@jeffro256
Copy link

@UkoeHB I've been thinking about the slowness of the self-send tau checks under the new addressing scheme, and yes, you are right, they are slower since there are no address tag hints. However, since you can now do 3 bytes of view tag checks BEFORE doing the self-send tau checks vs 1 byte of view tag checks, under the new scheme the process of self-send tau checks will be done ~65536 times less often (more often if one's self-sends are a larger portion of total on-chain enote volume). Hopefully, this amortizes out to be slightly faster overall for most users.

@kayabaNerve
Copy link

Too many view tag bytes hurt privacy AFAIUI; I'll leave it to @j-berman to properly state what I'm thinking of so we're all on the same page.

@jeffro256
Copy link

To be clear, I say 3 bytes of view tags, but it is split into two view tags, a 1-byte and a 2-byte tag, which are computed from two independent DH secrets. You can give access to compute just one view tag (presumably the 1-byte view tag) to a light wallet server. However, if you are the client with the whole view-balance key, you can compute both view tags and check against both before trying self-send tau checks.

@jeffro256
Copy link

@j-berman was making the point that, without huge increases to transaction volume and without the assumption that the third-party daemon and light wallet server are not colluding, the privacy of giving a light wallet server the ability to compute 2-byte view tags is very bad.

@kayabaNerve
Copy link

Ah, sorry. Thanks for clarifying.

@jeffro256
Copy link

jeffro256 commented Sep 10, 2023

For the base32 encoding, instead of using a custom alphabet, why not use an existing standard that meets our requirements like Crockford base32? Spec here: https://www.crockford.com/base32.html. There's an existing C++ implementation here: https://github.com/tplgy/cppcodec/blob/master/cppcodec/base32_crockford.hpp.

@UkoeHB
Copy link

UkoeHB commented Sep 10, 2023

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send). The combined tier would be an ultra-efficient scan tier with high visibility into user transaction graphs. I expect that in the long run, someone will implement that tier to the detriment of user privacy.

So the trade-off is: A) improving privacy for the recommended remote scanning tier, at the cost of B) exposing an unrecommended remote scanning tier that is materially superior to the recommended tier and greatly weakens user privacy.

@jeffro256
Copy link

Tbf, this was already possible by combining Find-Received + Cipher-Tag. You could give a third-party s_ct and k_fr, and then they could decrypt and decipher address tags, whittling down the probability that a matched enote is a false positive to 1:16777216.

@UkoeHB
Copy link

UkoeHB commented Sep 11, 2023

Tbf, this was already possible by combining Find-Received + Cipher-Tag.

Not quite. With k_fr and s_ct you can only identify normal enotes. You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal. This is actually much better overall, because now it is feasible for someone to offload both k_fr and k_rs to a remote scanner in order to offload computation of the second key derivation to that scanner (in your proposal it would not be feasible due to the self-send identification issue), which may be a beneficial trade-off if tx volume becomes very large (e.g. if tx volume increases 256x, then your proposal would leave light wallet clients with the same scanning perf normal clients have today).

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

@tevador
Copy link
Author

tevador commented Sep 12, 2023

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal.

I like this solution. The cost would be slightly longer addresses (247 vs 244 characters), but there would be much stronger protection of self-sends from the remote scanning services. See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

If tx volume increases 256x, we'd be at ~40 MB blocks with a blockchain growth of >10 TB/year. If the network can handle that, I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags.

@jeffro256
Copy link

Not quite. With k_fr and s_ct you can only identify normal enotes.

Fair enough

You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

If the user is this hell-bent on revealing their transaction graph for the sake of efficiency, why doesn't the user also send their self-send TXIDs to the light wallet server? IIRC, current light wallet servers already know which users are tied to which outgoing transactions by virtue of helping them construct those transactions. Heck, all of these changes still don't keep the user from sending their view-balance key, which would constitute the most efficient light wallet server. If they wanted to dance around the fact that this isn't private, they could even add some ad-hoc tech to randomly request other data so they can claim it's private, or do any number of other things that degrade privacy but make it more efficient. To me, this argument falls under the same category as the criticism at the announcement of view-balance keys: that someone else could force users to reveal their view-balance keys. It isn't cryptographically possible to prevent people from revealing secret keys willy-nilly, so I don't know how productive it is to talk about potential future scenarios in which the tier system is willingly abused. What we should design are the tiers that we want to see, because users will use them and gain certain trade-offs, while minimizing risk to the planned tiers.

but there would be much stronger protection of self-sends from the remote scanning services

Same point here: it isn't stronger unless we assume the user won't abuse the wallet tiers, which is what brought this discussion on.

See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

I agree that hiding self-sends is important, but unless you have a protocol that forces users' self-send privacy, I think that point is moot here.

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send).

One thing that prevents this through actual incentives is the existence of the 2-byte view tag "sparse" tier in the original proposal. 2 bytes of view tag, for people like us, is complete overkill in the efficiency/privacy balance at current tx volume. But potentially in the future, if there are users who don't want to scan even 1/256 of the enotes on the chain, because they value convenience over privacy 10-fold, they can scan 256x less than that: 1/65536 (about ~1 enote every day on mainnet today). I think it's not unreasonable that tx volume could 256x sometime in the distant future, which would mean an enote hit every 10 minutes or so for people using the 2-byte view tag tier. (@j-berman did a great analysis of timing attacks against 2-byte view tags at current tx volume in this thread)

But here's the big thing: this tier doesn't have the deterministic drawbacks of a third-party wallet knowing your nominal address tags: identifying incoming normal enotes to known addresses and incoming normal enotes sent to addresses more than once with ~100% certainty. The privacy of the 2-byte view tag tier scales up with volume, and it is much more detrimental to privacy than the proposed "dense" view tag tier, but if we're planning for very desperate users like we're doing here, we need a bigger jump for light wallet scanning than replacing DH ops with Twofish ops; we need to have the option to cut bandwidth without deterministic attacks.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags

If it's safe to assume this, then why have the modifications in the first place? If it's so easy to keep up with bandwidth and computation, why would users feel the need to jump ship to worse privacy trade-offs en masse?

@tevador
Copy link
Author

tevador commented Sep 12, 2023

why doesn't the user also send his self-send TXIDs to the light wallet server

Rational users have exactly zero incentive to do this.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

Do we really need two view tags for this from the start? Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.

@UkoeHB
Copy link

UkoeHB commented Sep 12, 2023

Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.

This can be abused by a malicious remote scanning service to reduce the anonymity of users by spamming the chain.

@tevador
Copy link
Author

tevador commented Sep 12, 2023

Malicious actors can reduce the anonymity of users by spamming the chain right now.

@UkoeHB
Copy link

UkoeHB commented Sep 12, 2023

Yes but a dynamic view tag would make spam more damaging.

@tevador
Copy link
Author

tevador commented Sep 12, 2023

The options are:

  1. Fixed-size view tag: Either good plausible deniability now and possibly inadequate filtering later, or vice versa (or somewhere in between).
  2. Multiple view tags of various sizes: coarse tuning of the false positive rate; susceptible to spam attacks (an attacker can spam for a while to make users subscribe with the larger tag); retroactive privacy loss when switching to a larger tag.
  3. Dynamic-size view tag: fine tuning of the false positive rate, no retroactive privacy loss, susceptible to spam attacks.

Choose your poison.

@jeffro256
Copy link

The fourth option, which is what @UkoeHB was proposing, is a fixed-size view tag, but optionally enabling third parties to compute nominal address tags, which reduces the light wallet client's single-core compute time by about 100x but increases the bandwidth by ~10%.

@tevador
Copy link
Author

tevador commented Sep 12, 2023

optionally enable third parties to compute nominal address tags

This is orthogonal, can be added to any of the above 3 options. The important point is that it does not reduce the bandwidth requirements for light clients.

@jeffro256
Copy link

jeffro256 commented Sep 13, 2023

I don't know if this idea has ever been floated before, and I'm making this up right now, but we could do dynamic view tags that 1) aren't susceptible to spam attacks, 2) scale as the receiver wishes and 3) all look uniform on-chain while keeping transaction size the same. They would have an absolute maximum size set by consensus (say 2 bytes, or 3 bytes if we're pushing it, but 2 bytes is probably fine for a maximum). All tags on-chain would show up as this constant length. Let's call this number of bits b_max. The actual length of the view tag, i.e. the amount of filtering that a receiver desires, is encoded in the address. Let's call this value b_addr. We shouldn't give users too many options, else they will partition themselves too finely and their addresses could be correlated. Let's say that we give users 8 choices (could be any number, and 4 might be better), which means the size of the integer b_addr is fixed at 3 bits (which we could actually fit into an address in my proposal without expanding the address size, since there are 4 unused bits). Let's say that our 8 choices for b_addr (the utilized bit length of the view-assist tag) are 1, 2, 4, 6, 8, 10, 12, or 16 bits wide. The higher b_addr is, the more efficient scanning is, but the smaller the anonymity pool is. Full wallets would likely set this value as high as it will go (since they lose no privacy either way, as they are not giving up that private key). Light wallets would select a good value for them, then send k_va (view-assist) and b_addr to their light wallet server.

Senders, when sending to an address, will extract b_addr from the address and encode b_addr bits into the view-assist tag, and whatever bits are left in the view tag space (b_max - b_addr) they will fill with random noise (this part is important not to miss, otherwise we might accidentally filter using more bits than intended).

Light wallet servers, who know b_addr for each user, when doing DH exchanges against k_va, will match b_addr bits and send those records to the light wallet client, who scans them as usual.
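A small sketch of what the sender-side tag construction and the server-side matching could look like. The function names, the treatment of b_max/b_addr as plain integers, and the RNG are all illustrative, not from any spec:

```cpp
#include <cstdint>
#include <random>

// Illustrative sketch of the variable-width "view-assist tag" idea: only the
// b_addr low bits requested by the receiver's address are derived from the DH
// secret; the remaining (b_max - b_addr) bits are filled with random noise so
// every on-chain tag looks like a uniform b_max-bit value.
std::uint16_t make_view_assist_tag(std::uint16_t derived_bits, unsigned b_addr, unsigned b_max)
{
    // Non-cryptographic RNG for illustration only; a real wallet would use a CSPRNG.
    static std::mt19937 rng{std::random_device{}()};

    const std::uint32_t addr_mask = (1u << b_addr) - 1; // bits the sender must derive
    const std::uint32_t full_mask = (1u << b_max) - 1;  // all tag bits
    const std::uint32_t noise = rng() & full_mask & ~addr_mask;

    return static_cast<std::uint16_t>((derived_bits & addr_mask) | noise);
}

// A light wallet server that knows b_addr for a user only compares the low b_addr bits.
bool server_tag_match(std::uint16_t on_chain_tag, std::uint16_t derived_bits, unsigned b_addr)
{
    const std::uint32_t addr_mask = (1u << b_addr) - 1;
    return (on_chain_tag & addr_mask) == (derived_bits & addr_mask);
}
```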

Cons: 1) Partitioning on receive addresses can happen. 2) Malicious senders can use more bits than requested (b_addr) to signal probabilistically to someone's light wallet server that an incoming normal enote belongs to said receiver. 3) If a receiver creates a receive address with b_addr1, then gives that address to a sender, then wants to increase b_addr1 to b_addr2 > b_addr1, the receiver might not properly scan a transaction sent with the old b_addr1, and will need the sender to tell the receiver the transaction ID. Not too worried about point 2 since it's already possible to construct a transaction and then blab about it. Point 1 is a little trickier.

Pros: 1) Many options for users, so they are incentivized to not completely bomb their privacy even with incredibly high transaction volume and bad connectivity, 2) on-chain uniformity, 3) view tags not susceptible to spam attacks, 4) no retroactive loss of privacy when increasing b_addr (moving to a less private, more efficient tier), 5) there are options to do less than 1:256 view filtering, e.g. 1:64 filtering.

What do you think?

@jeffro256
Copy link

In the context of a Jamtis protocol where we do 2 DH operations for normal enotes anyways (to guard the sender-receiver secret from light wallet servers), all txs could also still include the independent fixed-size view tag for the second key (like the "sparse view tag" in my original proposal). Full wallets could use this fixed-size view tag as the first tag they scan against, allowing them to generate random values of b_addr for their addresses to help mitigate partitioning, while not affecting their scan time by more than fractions of a percent.

@tevador
Copy link
Author

tevador commented Sep 13, 2023

Interesting idea, but:

  1. Users of remote scanning services might be coerced or tricked into using the longest possible tag, getting zero privacy, while costing the malicious service nothing.
  2. Any address with fewer than the maximum number of bits would immediately leak the fact that the user is using a remote scanner.
  3. Horrible UX when changing the tag size. Payments sent to old addresses would not be recognized and asking the sender for the TXID is not always possible (e.g. donations).

Compare that with a dynamic tag size calculated from the running average tx volume over the last 100 000 blocks so that the mean false positive rate is about 256 tag matches per day:

  1. Malicious services would have to spam constantly at least 50% of the transaction volume to add 1 bit to the tag size (reducing the effective false positive rate to 128 matches per day). This attack is not free, the attacker is paying transaction fees.
  2. Neither addresses nor transactions leak anything.
  3. No UX problems because there is always agreement about the tag size that was used in a transaction.

@tevador
Copy link
Author

tevador commented Sep 13, 2023

Full wallets could use this fixed-size view tag as they first tag they scan against, allowing them to generate random values of b_addr for their addresses to help mitigate partitioning, while not affecting their scan time by more than fractions of a percent.

You have to look at it from a game-theoretic perspective. If users can make a choice that improves their experience regardless of what other users do, you have to assume they will make that choice (see prisoner's dilemma). Full wallet users will use the full tag size if they can speed up their scan time by a fraction of a percent.

@jeffro256
Copy link

Yeah the downsides are kinda weird and hard to reason about/plan OPSEC for. If there was a way to allow someone to encode a certain level of entropy to be received by someone else without the sender knowing what the level of entropy is, I think that'd be the way to go, but I don't know if that's possible.

@jeffro256
Copy link

I'm much less sure than before, but I still think that a mix of a 1-byte and a 2-byte fixed-size view tag is the best option (assuming we're doing 2 DH ops to get q). I think we should put a ton of effort into plan A: making sure the scanning compute process is as optimized as possible and uses as many processor cores as are available on any given machine. The performance tests show that even doing 1 DH op (instead of 1 Twofish op) for every received record on the light wallet client side keeps up with modern, middle-of-the-road bandwidth speeds, even using just a single core.

The fact that the view tag options are so coarse might hopefully incentivize us developers in the future to push hard for plan A for light wallet users so that they don't switch to option B, the 2-byte view tags, but if they do, at least they won't have deterministic downsides and will more or less know what they're getting into: a 256x smaller anonymity set.

@tevador
Copy link
Author

tevador commented Sep 14, 2023

I don't think we should allow users to select the view tag size. There should be only one view tag and I'm more in favor of the dynamic size as it's a more future-proof solution and I'm not convinced that spam attacks are a real problem compared to the alternatives.

I'm proposing the following:

Jamtis with dynamic view tags

Pros

  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes to known public addresses. (same as the proposal by @jeffro256)
  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes sent to a public address that is used more than once. (same as the proposal by @jeffro256)
  • Third-parties can now compute view tags and generate public addresses on behalf of users without the ability to learn any additional balance recovery information. (same as the proposal by @jeffro256)
  • Light wallets have a fixed bandwidth (about 200KB/day) and CPU (about 100 ms/day) cost regardless of the transaction volume. These costs are so low that no third party provider should be able to successfully argue for users to hand over higher tier private keys.
  • Users cannot shoot themselves in the foot by selecting a view tag size that doesn't have enough false-positive matches.

Cons

  • Public address length is increased from 196 to 244 characters. (same as the proposal by @jeffro256)
  • Third-parties who compute view tags on behalf of users can spam the network to reduce the effective number of false-positive matches of their users.
  • Additional ~40 ms of CPU time per day for users who scan the blockchain locally, but this is negligible.
  • Additional complexity in the specs

Changes

Private keys and wallet tiers

The number of private keys stays the same, but some keys have a different function and have been renamed:

  • d_ua "unlock-amounts" -> d_vr "view-received"
  • d_fr "find-received" -> "filter-received"
k_m (master key)
 |
 |
 |
 +- k_vb (view-balance key)
     |
     |
     |
     +- d_vr (view-received key)
         |
         |
         |
         +- d_fr (filter-received key)
         |
         |
         |
         +- s_ga (generate-address secret)
             |
             |
             |
             +- s_ct (cipher-tag secret)

This cleanly maps to the supported wallet tiers:

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
| --- | --- | --- | --- |
| Master | k_m | all | all |
| ViewBalance | k_vb | all | view all |
| ViewReceived | d_vr | all | view all received except change and self-spends |
| FilterReceived | d_fr | recognize all public wallet addresses | calculate view tags |
| GenAddr | s_ga | generate public addresses | none |

GenAddr + FilterReceived can be safely combined. The key hierarchy ensures that no additional tiers can be constructed.

Addresses

Addresses consist of 4 public keys:

  1. K^j_1 = K_s + k^j_u U + k^j_x X + k^j_g G (unchanged)
  2. D^j_2 = (1 / d^j_a) * d_fr * B
  3. D^j_3 = (1 / d^j_a) * d_vr * B
  4. D^j_4 = (1 / d^j_a) * B

B is the Curve25519 base point. Note the inverted usage of d^j_a, which simplifies enote recovery.

There is no tag hint, so only j' = BlockEnc(s_ct, j) is part of the address. The total address length in base32 is 244 characters including the prefix and checksum.

Key exchange

The sender generates an ephemeral private key d_e and calculates D_e = d_e * D^j_4.

Shared secrets

There are 3 DH shared secrets:

  1. DH_1 = d_e * D^j_2 = d_fr * D_e
  2. DH_2 = d_e * D^j_3 = d_vr * D_e
  3. DH_3 = d_e * B = d^j_a * D_e
  • DH_1 is used to calculate the view tag.
  • DH_2 is used to derive the first high-level shared secret: s^sr_1 = H(DH_2 || D_e || input_context)
  • DH_3 is used to derive the second high-level shared secret: s^sr_2 = H(DH_3)
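To make the note about the inverted usage of d^j_a explicit, both sides arrive at the same points, and the receiver never needs the address index j to compute DH_1 and DH_2:

    D_e = d_e * D^j_4 = d_e * (1/d^j_a) * B

    DH_1: d_e * D^j_2 = d_e * (1/d^j_a) * d_fr * B = d_fr * D_e
    DH_2: d_e * D^j_3 = d_e * (1/d^j_a) * d_vr * B = d_vr * D_e
    DH_3: d_e * B     = d^j_a * d_e * (1/d^j_a) * B = d^j_a * D_e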

Self-send enotes use a different construction for the high-level secrets (unchanged).

View tags

The view tag is calculated by hashing DH_1 together with K_o (both for normal and self-send enotes).

View tag filter target

The view tag size is dynamic and is automatically adjusted based on the transaction volume so that the false positive rate (the number of view tag matches) is 480 enotes/day. Because the view tag filter rate must be a power of 2, this will actually result in a range from 480 to 960 enotes per day depending on the tx volume. If we "average the averages" over all possible values of tx volume, this will give a mean of 720 enote matches per day, or roughly 1 match per block, which is what was suggested by @jeffro256. I think this is close to the upper limit of what is acceptable for light wallet clients (~200 KB/day) and should provide a good number of false positives even if there was a short term drop in tx volume.

The formula to calculate the view tag size in bits is:

tag_size = trunc(log2(3 * num_outputs_100k / 200000))

where num_outputs_100k is the total number of outputs in the last 100 000 blocks. The trunc(log2(x)) function can be easily calculated using only integer operations (it's basically the position of the most significant bit).
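A minimal sketch of that integer-only computation (illustrative code; function and variable names are mine, not from any spec), clamped to the 1-16 bit range proposed below:

```cpp
#include <cstdint>

// trunc(log2(x)) for x >= 1 is the position of the most significant set bit.
static unsigned msb_position(std::uint64_t x)
{
    unsigned pos = 0;
    while (x >>= 1)
        ++pos;
    return pos;
}

// tag_size = trunc(log2(3 * num_outputs_100k / 200000)), clamped to [1, 16]
unsigned view_tag_size_bits(std::uint64_t num_outputs_100k)
{
    const std::uint64_t x = 3 * num_outputs_100k / 200000;
    if (x < 2)
        return 1;                     // lower bound of 1 bit
    const unsigned bits = msb_position(x);
    return bits > 16 ? 16 : bits;     // upper bound of 16 bits
}

// e.g. view_tag_size_bits(7900000) == 6, view_tag_size_bits(8610000) == 7
```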

As an example, the value of num_outputs_100k is currently about 7.9 million, which results in a view tag size of 6 bits when plugged into the formula. With around 56000 daily outputs, there will be about 880 matches per day. If the long-term daily volume increases to about 62000 outputs, the view tag size will be increased to 7 bits and the number of matches will drop to about 480 per day.
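As a quick cross-check on these numbers, assuming ~720 blocks per day (so 100 000 blocks ≈ 139 days):

    3 * num_outputs_100k / 200000 ≈ 3 * 139 * daily_outputs / 200000 ≈ daily_outputs / 480

so tag_size = trunc(log2(daily_outputs / 480)), and the expected number of matches, daily_outputs / 2^tag_size, falls in the 480-960 range: e.g. 56000 / 2^6 ≈ 875 and 62000 / 2^7 ≈ 484.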

View tag size encoding

The view tag size must be encoded explicitly to avoid UX issues with missed transactions at times when the view tag size changes. This can be done with a 1-byte field per transaction (all outputs will use the same tag size).

I'm proposing a range of valid values for the tag size between 1 and 16 bits.

A 1-bit view tag requires num_outputs_100k > 133333. Since there are always at least 100k coinbase outputs, the 1-bit view tag would be "too large" only if there were fewer than 120 transactions per day, which hasn't happened on mainnet except for a few weeks shortly after launch in 2014.

A 17-bit view tag that would overflow the supported range would require num_outputs_100k > 8738133333, an increase of more than 1000x over the current tx volume. If this somehow happened, the number of false positives would exceed 960 per day, which would only have performance implications for light wallets, but would not cause any privacy problems.

So the proposed range of 1-16 bits is sufficient.

Complementary view tag

Regardless of the tag_size, the view tag is always encoded in 2 bytes as a 16-bit integer per enote. The remaining bits are filled with a "complementary" view tag calculated from s^sr_1, which needs a different private key.

For example, with tag_size = 6, the 16 bits would be CCCCCCCCCCTTTTTT, where T is a view tag bit and C is a complementary view tag bit.
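A minimal sketch of the bit packing (helper name and example values are mine):

```python
# Pack a tag_size-bit primary view tag T and a (16 - tag_size)-bit
# complementary view tag C into one 16-bit field: C bits above T bits.
def pack_view_tag(t: int, c: int, tag_size: int) -> int:
    assert 0 <= t < (1 << tag_size)
    assert 0 <= c < (1 << (16 - tag_size))
    return (c << tag_size) | t

# tag_size = 6: the low 6 bits hold T, the high 10 bits hold C (CCCCCCCCCCTTTTTT).
assert pack_view_tag(t=0b101010, c=0b1100110011, tag_size=6) == 0b1100110011_101010
```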

Third-party scanning

The intended use is to provide d_fr to a third party, who can then calculate the "T" bits of the view tag and filter out non-matching enotes. There will always be a sufficient number of false positives so that the third party cannot learn with certainty which enotes are owned by the user. The light wallet can then calculate the "C" bits and further filter out enotes. On average, the light wallet will need to recompute K_o for 1 enote out of 65536.

Users might be tempted to provide the view-received key d_vr to the third party to speed up scanning. However, this does not save any bandwidth in practice because the server can't calculate s^sr_1 for self-send enotes. It only saves a minuscule amount of CPU time (~100 ms/day at best) in exchange for a loss of privacy for all incoming payments (including amounts).

Similarly, users might be tempted to provide the view-balance key k_vb to the third party to speed up scanning. This would save about 200 KB/day in exchange for a complete loss of privacy.

These unintended use cases are sufficiently unfavorable that third-party scanners can be restricted to the FilterReceived wallet tier.

Scanning speed

The following table shows the cryptographic operations needed to recognize owned enotes for different types of wallets (assuming the wallet does not receive more than a few payments per day). I'm ignoring symmetric crypto operations for simplicity (they are negligible).

| Wallet type | For each enote | For ~720 enotes/day | For 1/65536 enotes |
| --- | --- | --- | --- |
| Full wallet (ViewBalance) | 1x DH | 1x DH | 3x recompute K_o |
| Full wallet (ViewReceived) | 1x DH | 1x DH | 1x recompute K_o |
| Light wallet (ViewBalance) | - | 1x DH | 3x recompute K_o |
| Light wallet (ViewReceived) | - | 1x DH | 1x recompute K_o |

Here "Light wallet" refers to a wallet that downloads data from a FilterReceived wallet service. The ViewBalance tiers need to recompute each K_o three time to detect self-send enotes.

To get an idea about the required bandwidth and CPU time, I'm estimating 256 bytes of data per view tag match, 50 μs of CPU time for DH and 50 μs of CPU time to recompute K_o (recomputing K_o needs 3 fixed-base scmults, which are about 3-4x faster than variable-base scmults for DH).

| Wallet type | Bandwidth/day | CPU/day |
| --- | --- | --- |
| Full wallet (ViewBalance) | depends on tx volume | depends on tx volume |
| Full wallet (ViewReceived) | depends on tx volume | depends on tx volume |
| Light wallet (ViewBalance) | 180 KB | <144 ms |
| Light wallet (ViewReceived) | 180 KB | <72 ms |
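A back-of-the-envelope check of the table (constants from the estimate above; the CPU figures treat every view tag match as also needing the K_o recomputations, which is why they are upper bounds):

```python
matches_per_day = 720
bytes_per_match = 256
dh_us = 50              # one variable-base scalar multiplication
recompute_ko_us = 50    # three fixed-base scalar multiplications

bandwidth_kb = matches_per_day * bytes_per_match / 1024                          # 180 KB/day
cpu_view_received_ms = matches_per_day * (dh_us + 1 * recompute_ko_us) / 1000    # < 72 ms/day
cpu_view_balance_ms = matches_per_day * (dh_us + 3 * recompute_ko_us) / 1000     # < 144 ms/day
```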

So even when opening a light wallet after 1 month, sync times should be on the order of a few seconds regardless of future transaction volumes.

Practical issues

How does the sending wallet figure out what view tag size to use?

Current Monero wallets already have that information. Wallets call the RPC function get_output_distribution when constructing a tx to pick decoys. This distribution contains enough information (the number of outputs in each block) to calculate the number of bits the view tag should have.

With full-chain membership proofs, wallets will still have to make an RPC call to get the current fee estimate, so that could also be used to get the current view tag size. A rough estimate could be made from the knowledge of the number of leaf nodes in the output tree.

What if a malicious sender purposely selects a shorter view tag (to cause more computation for all wallets) or a longer view tag (to reduce the recipient's light wallet privacy)?

There could be a relay rule that rejects transactions that use a view tag size other than the current or the previous one (i.e. 1 bit shorter if tx volume is growing or 1 bit longer if tx volume is dropping). It could also be enforced by consensus, but that seems like overkill.

@jeffro256 commented Sep 14, 2023

Shouldn't the calculation be DH_3 = d_e * G = 1/(d^j_a * b) * D_e?

I personally think we should set the target to 1 enote false positive per blocktime (2 minutes) to confound timing attacks. If there's a view tag hit almost every single time that a block is submitted, I imagine this would mitigate a lot of timing attacks for low wallet usage. That's about 2.8x what you're proposing, but that's still very doable today, and since it's a constant throughput, compute and bandwidth will quickly catch up.

I'm liking this proposal, and just have one more modification: a three byte fixed-size view tag for DH_2. This 1) makes full wallet scanning faster and not dependent on sender-submitted fields (the view tag width), 2) also speeds up light-wallet client side scanning as a byproduct, and 3) most importantly, completely nukes the incentive for a light wallet user to hand over their "filter-received key" else they will have no normal enote privacy. Con: enotes are 3 bytes bigger.

@tevador commented Sep 14, 2023

Also shouldn't it be DH_3 = d_e * G = 1/(d^j_a * b) * D_e?

I'm using B to denote the Curve25519 base point. B = ed25519_pk_to_curve25519(G).

Second, I personally think we should set the target to 1 enote false positive per blocktime (2 minutes) to confound timing attacks. If there's a view tag hit almost every single time that a block is submitted, I imagine this would mitigate a lot of timing attacks for low wallet usage. That's about 2.8x what you're proposing, but that's still very doable today, and since it's a constant throughput, compute and bandwidth will quickly catch up.

Yes, the target could be higher than 256. I chose 256/day as it matches an 8-bit view tag with the current tx volume. The lower bound for the target is 144/day to hide when an output is spent soon after the 10 block lock time. The upper bound is only limited by the bandwidth cost for light wallets.

a three byte fixed-size view tag for DH_2. This 1) makes full wallet scanning faster and not dependent on sender-submitted fields (the view tag width), 2) also speeds up light-wallet client side scanning as a byproduct.

I'm not really sure if this is worth the ~25-70 ms of CPU time per day it would save.

  1. most importantly, completely nukes the incentive for a light wallet user to hand over their "filter-received key" else they will have no normal enote privacy

Did you mean "view-received key"?

@jeffro256 commented Sep 15, 2023

I'm not really sure if this is worth the ~25-70 ms of CPU time per day it would save.

Here "Light wallet" refers to a wallet that downloads data from a FilterReceived wallet service. The ViewAll tiers need to recompute each K_o twice to detect self-send enotes.

You actually need to try 1 + <number of self-send types> times. For the current Jamtis code in seraphis_lib with PLAIN, DUMMY, CHANGE, & SELFSPEND enote types, this is 4 total K_o re-computations per filter-received enote hit, which might end up being not insignificant for total scan time. However, since this cost doesn't scale up over time with dynamic view tags, I guess that I'm more okay with it as long as there's no address tag hint to tempt people to disclose the view-received private key.

I'm using B to denote the Curve25519 base point. B = ed25519_pk_to_curve25519(G)

Ah okay I thought B was D^j_ua (AKA DH Base).

Did you mean "view-received key"?

Yes I did, sorry.

Note the inverted usage of d^j_a, which simplifies enote recovery.

I do really like this feature, and AFAIK, inverting the address private key in the address, not in balance recovery, is orthogonal to all of the previously discussed changes, which is nice.

D^j_4 = (1 / d^j_a) * B

I like the simplicity of this, but if we're missing some sort of d_ua unlock-amounts factor, then we can't have tier(s) which identify transactions that we're involved in (by recomputing K_o) without knowing the amounts. Such tiers would make cold/hot/hardware wallet separation more private while staying just as convenient. And since we're using the x25519 curve for this portion of the protocol, we can cache the value of d_ua * B and then multiply by d^j_a to get s^sr_2, and it's all just as performant.

To expand on the last point, we could have all secret keys (besides cipher-tag) below the view-balance secret in the derivation tree: view-received, view-sent (new key explained below), unlock-amounts, generate-address (moved out from under view-received), and filter-involved (basically the same as filter-received but the name needs an update since we use it also for outgoing). Then we can mix and match the unlock-amounts key with/without view-received and view-sent keys to create different tiers while keeping the number of operations in balance recovery the same. The new derivation tree would look like:

Private Keys

k_m (private master key)
 |
 |
 |
 +- k_vb (private view-balance key)
     |
     |
     |
     +- d_fi (private filter-involved key)
     |
     |
     |
     +- d_ua (private unlock-amounts key)
     |
     |
     |
     +- s_vs (secret view-sent key)
     |
     |
     |
     +- d_vr (private view-received key)
     |
     |
     |
     +- s_ga (secret generate-address key)
             |
             |
             |
             +- s_ct (secret cipher-tag key)

Addresses

Addresses consist of 4 public keys (just added in a factor of d_ua):

  1. K^j_1 = K_s + k^j_u U + k^j_x X + k^j_g G (unchanged)
  2. D^j_2 = 1 / (d^j_a * d_ua) * d_fi * B (filter-received -> filter-involved)
  3. D^j_3 = 1 / (d^j_a * d_ua) * d_vr * B
  4. D^j_4 = 1 / (d^j_a * d_ua) * B

Shared Secrets

There are 3 DH shared secrets:

  1. DH_1 = d_e * D^j_2 = d_fi * D_e (filter-received -> filter-involved)
  2. DH_2 = d_e * D^j_3 = d_vr * D_e (unchanged from @tevador's last post)
  3. DH_3 = d_e * B = d^j_a * d_ua * D_e (added in factor of d_ua)

The DH exchanges are used for the same normal enote high-level secrets as in @tevador's post.

However, self-send enotes use a different construction for the high-level secrets (and different from before). For self-send higher level secrets, we use a combination of s_vs (view-sent secret) and d_ua (unlock-amounts key) instead of only k_vb (view-balance):

  1. s^sr_1 = H_[tau]1(s_vs || D_e || input_context)
  2. s^sr_2 = H_[tau]2(d_ua || s^sr_1)

Wallet Tiers

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
| --- | --- | --- | --- |
| GenAddr | s_ga | generate public addresses | none |
| FilterInvolved | d_fi | recognize all public wallet addresses | calculate view tags |
| ViewReceived | d_vr, d_fi, s_ga | all | view all received enotes (w/o amounts) except for change and self-spends |
| ViewSent | s_vs, d_fi, s_ga | all | view all change and self-spend enotes (w/o amounts) |
| HotWallet | s_vs, d_vr, d_fi, s_ga | all | view all received, change, and self-spend enotes (w/o amounts) |
| PaymentValidator | d_fi, d_vr, d_ua, s_ga | all | view all received enotes with amounts |
| ViewBalance | k_vb | all | view all enotes, calculate key images |
| Master | k_m | all | all |

Sorry, this post strayed away from the view tag balancing discussion, but changing the derivation tree and self-send higher-level secrets calculations in this manner can be added to the current Jamtis proposal orthogonally to make better hot/cold wallet setups for little to no extra cost.

@tevador commented Sep 15, 2023

For the current Jamtis code in seraphis_lib with PLAIN, DUMMY, CHANGE, & SELFSPEND enote types

What is the reasoning for these types? AFAICS we only need 2 types to tell the wallet if the enote should be displayed in history or not (this could also be achieved with a 1-bit flag encrypted with s^sr_2, so only 1 extra K_o recomputation is needed).

To expand on the last point, we could have all secret keys (besides cipher-tag) below the view-balance secret in the derivation tree: view-received, view-sent (new key explained below), unlock-amounts, generate-address (moved out from under view-received), and filter-involved (basically the same as filter-received but the name needs an update since we use it also for outgoing). Then we can mix and match the unlock-amounts key with/without view-received and view-sent keys to create different tiers while keeping the number of operations in balance recovery the same.

I don't like the additional tiers between "FilterInvolved" and "ViewBalance". They give more arguments for third-party scanners to request additional private keys. Especially the "HotWallet" tier sounds very dangerous as light wallet users might be satisfied with not revealing amounts, but it actually allows the third party to identify spent outputs in the blockchain.

The missing d_ua key and the key hierarchy in my proposal were intentional to prevent any wallet tiers that could be useful for 3rd party scanning other than "FilterReceived".

Here is a comment by @UkoeHB speaking against your "HotWallet" tier: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024?permalink_comment_id=4274612#gistcomment-4274612

@jeffro256

What is the reasoning for these types? AFAICS we only need 2 types to tell the wallet if the enote should be displayed in history or not (this could also be achieved with a 1-bit flag encrypted with s^sr_2, so only 1 extra K_o recomputation is needed).

Sorry for the confusion, PLAIN is the type for normal enotes. DUMMY, CHANGE, and SELFSPEND are the self-send types. As for the DUMMY type, @UkoeHB would probably be able to answer this question best. But to be fair, he added that type in when self-send type checks were relatively cheap (b/c of address tag hints).

I don't like the additional tiers between "FilterInvolved" and "ViewBalance". They give more arguments for third-party scanners to request additional private keys. Especially the "HotWallet" tier sounds very dangerous as light wallet users might be satisfied with not revealing amounts, but it actually allows the third party to identify spent outputs in the blockchain.

The missing d_ua key and the key hierarchy in my proposal were intentional to prevent any wallet tiers that could be useful for 3rd party scanning other than "FilterReceived".

Since it's only messing with self-send secret paths and account key derivation, there is literally nothing stopping someone from doing this anyways and still interacting with everyone else in a backwards-compatible manner, and without external observers knowing (senders hopefully ;) don't know the discrete log of your account secrets, so they can't know if you multiplied by d_ua or not). Also, the difference here with the "HotWallet" tier is that it isn't some cheesed tier that only makes sense as an abused scanning tier, it has an actual use case to make people's hot wallets more private.

@jeffro256

a 1-bit flag encrypted with s^sr_2, so only 1 extra K_o recomputation is needed

How would this work? I could understand if it was one bit in the address index

@tevador commented Sep 15, 2023

there is literally nothing stopping someone from doing this anyways and still interacting with everyone else in a backwards-compatible manner, and without external observers knowing (senders hopefully ;) don't know the discrete log of your account secrets, so they can't know if you multiplied by d_ua or not).

Such a wallet would not be compatible with other wallet software if it was using different derivation paths and additional private keys. Yes, you can't prevent someone from inventing custom wallets that allow users to lose privacy, but it should not be supported by the official software.

Also, the difference here with the "HotWallet" tier is that it isn't some cheesed tier that only makes sense as an abused scanning tier, it has an actual use case to make people's hot wallets more private.

I don't see exactly how it's useful for hot wallets. We already have the "PaymentValidator" tier that is intended as a (view-only) hot wallet. If you are using a hardware wallet, presumably you have a "ViewBalance" tier and the hardware wallet only stores the master key. Without seeing amounts, you can't prepare a transaction to be signed with the hardware wallet.

How would this work? I could understand if it was one bit in the address index

  1. You derive the self-send s^sr_1 and recompute K_o. If it matches, you have identified a self-send enote.
  2. Derive s^sr_2 to decrypt the amount and the 1-bit flag. The flag will tell you if this self-send should be displayed in the transaction history (because the user actively sent funds to their own address) or rather be subtracted from the spent amount (because it's a change enote).

Btw, in order to properly support self-sends, I think the self-send shared secrets need to include the output index in the hash, otherwise a 2-out transaction with two self-sends (1 self-spend and 1 change) would have the same shared secrets (and view tags) for both outputs. I'm not sure how it's handled in the current Seraphis library.

@tevador commented Sep 15, 2023

Here are some additions to my proposal:

View tag filter target

The filter target should be 480 enotes/day. Because the view tag filter rate must be a power of 2, this will actually result in a range from 480 to 960 enotes per day depending on the tx volume. If we "average the averages" over all possible values of tx volume, this will give a mean of 720 enote matches per day, or roughly 1 match per block, which is what was suggested by @jeffro256. I think this is close to the upper limit of what is acceptable for light wallet clients (~200 KB/day) and should provide a good number of false positives even if there was a short term drop in tx volume.

The formula to calculate the view tag size in bits is:

tag_size = trunc(log2(3 * num_outputs_100k / 200000))

where num_outputs_100k is the total number of outputs in the last 100 000 blocks. The trunc(log2(x)) function can be easily calculated using only integer operations (it's basically the position of the most significant bit).

As an example, the value of num_outputs_100k is currently about 7.9 million, which results in a view tag size of 6 bits when plugged into the formula. With around 56000 daily outputs, there will be about 880 matches per day. If the long-term daily volume increases to about 62000 outputs, the view tag size will be increased to 7 bits and the number of matches will drop to 480 per day.

View tag size encoding

The view tag size must be encoded explicitly to avoid UX issues with missed transactions at times when the view tag size changes. This can be done with a 1-byte field per transaction (all outputs will use the same tag size).

I'm proposing a range of valid values for the tag size between 1 and 16 bits (instead of the previously proposed 5-20 bits).

A 1-bit view tag requires num_outputs_100k > 133333. Since there are always at least 100k coinbase outputs, the 1-bit view tag would be "too large" only if there were fewer than 120 transactions per day, which hasn't happened on mainnet except for a few weeks shortly after launch in 2014.

A 17-bit view tag that would overflow the supported range would require num_outputs_100k > 8738133333, an increase of more than 1000x over the current tx volume. If this somehow happened, the number of false positives would exceed 960 per day, which would only have performance implications for light wallets, but would not cause any privacy problems.

So the proposed range of 1-16 bits is sufficient.

Complementary view tag

Regardless of the tag_size, the view tag is always encoded in 2 bytes as a 16-bit integer per enote. The remaining bits are filled with a "complementary" view tag calculated from s^sr_1, which needs a different private key.

For example, with tag_size = 6, the 16 bits would be CCCCCCCCCCTTTTTT, where T is a view tag bit and C is a complementary view tag bit. This construction ensures that only a few K_o recomputations are needed per 65536 enotes.

| Wallet type | For each enote | For ~720 enotes/day | For 1/65536 enotes |
| --- | --- | --- | --- |
| Full wallet (ViewAll) | 1x DH | 1x DH | 3x recompute K_o |
| Full wallet (ViewReceived) | 1x DH | 1x DH | 1x recompute K_o |
| Light wallet (ViewAll) | - | 1x DH | 3x recompute K_o |
| Light wallet (ViewReceived) | - | 1x DH | 1x recompute K_o |

A 3rd party scanner would need to be provided with the view-received key d_vr in order to calculate the full 16-bit view tag for normal enotes. There are 3 deterrents against such usage:

  1. Complete loss of privacy for received payments (everything is leaked including amounts).
  2. Self-send enotes are not detected this way.
  3. The CPU savings for the light client are small (~100 ms/day at best).

@tevador commented Sep 15, 2023

Btw, in order to properly support self-sends, I think the self-send shared secrets need to include the output index in the hash, otherwise a 2-out transaction with two self-sends (1 self-spend and 1 change) would have the same shared secrets (and view tags) for both outputs. I'm not sure how it's handled in the current Seraphis library.

I'll answer myself here: this is currently solved by including K_o in the view tag hash, which is actually a better solution than just using the output index.

@UkoeHB commented Sep 16, 2023

I don't have bandwidth to respond to everything, but wanted to clarify this:

For the current Jamtis code in seraphis_lib with PLAIN, DUMMY, CHANGE, & SELFSPEND enote types

What is the reasoning for these types?

PLAIN = normal enote
DUMMY = self-send with zero amount inserted to ensure a tx has at least one self-send (a requirement added so that a remote scanner only needs to transmit key images from txs with view tag matches instead of all txs)
CHANGE = change
SELFSPEND = non-change/dummy self-send (e.g. churn), which is differentiated from change to aid bookkeeping

@tevador commented Sep 16, 2023

DUMMY = self-send with zero amount inserted to ensure a tx has at least one self-send (a requirement added so that a remote scanner only needs to transmit key images from txs with view tag matches instead of all txs)

I don't think this one is needed. It can be a CHANGE with zero amount. My calculations above assume that only 3 K_o recomputations are needed (1x PLAIN, 1x CHANGE, 1x SELFSPEND).

@tevador commented Sep 16, 2023

Wallet Tiers

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
| --- | --- | --- | --- |
| GenAddr | s_ga | generate public addresses | none |
| FilterInvolved | d_fi | recognize all public wallet addresses | calculate view tags |
| ViewReceived | d_vr, d_fi, s_ga | all | view all received enotes (w/o amounts) except for change and self-spends |
| ViewSent | s_vs, d_fi, s_ga | all | view all change and self-spend enotes (w/o amounts) |
| HotWallet | s_vs, d_vr, d_fi, s_ga | all | view all received, change, and self-spend enotes (w/o amounts) |
| PaymentValidator | d_fi, d_vr, d_ua, s_ga | all | view all received enotes with amounts |
| ViewBalance | k_vb | all | view all enotes, calculate key images |
| Master | k_m | all | all |

I think there is an infinite number of ways how the protocol can be made more complex with more features that we think might be useful. However, we should avoid unnecessarily bloating the specs and overloading users with choices.

Let's have a look at the features of Jamtis that have clear evidence of popular demand:

GenAddr wallet tier

It has been voiced many times by merchants that providing the ability to generate addresses without view access to the wallet is important. It came up for example during the discussion about deprecating integrated addresses: monero-project/meta#299 (comment)

FilterReceived wallet tier and dynamic view tags

The popular demand for wallets that scan the blockchain on behalf of users is clear from the existence of such services, e.g. mymonero.com. The FilterReceived tier together with the dynamic view tag size provides a solution that preserves privacy to a certain extent and only has a small fixed cost over providing full view access to the wallet.

Full view access tier

There is plenty of evidence that view-only wallets that cannot recognize spent outputs and thus display incorrect balance are bad for UX.

monero-project/monero#8613
monero-project/monero#7365
https://old.reddit.com/r/Monero/comments/4ce5ui/what_is_the_use_of_view_only_wallet_when_its/

Robust output recognition

Again, there is plenty of evidence that the lookup-table based approach for recognizing owned outputs is problematic for UX and sometimes causes the wallet to miss payments.

monero-project/monero#8138
https://monero.stackexchange.com/questions/10704/accounts-got-deleted-from-the-wallet
https://monero.stackexchange.com/questions/10184/funds-received-from-subwallet-are-not-showing

Janus attack protection

The attack was described in an official advisory here: https://web.getmonero.org/2019/10/18/subaddress-janus.html

The proposed mitigation is quite problematic for UX:

Use separate wallets instead of separate subaddresses if you need to keep two different addresses completely unlinkable. Alternatively, do not notify any sender of the receipt of funds to your wallet.

Users not aware of the existence of this attack might get exposed, so I think there is a sufficiently strong case to mitigate it, even if the cost is 51 extra characters in every address.

Payment validator (ViewReceived) wallet tier

I think this feature does not have such a strong case as the above features. Its functionality can be achieved with the ViewBalance tier, with only a small loss of privacy compared to current CryptoNote view-only access (explained by @UkoeHB here). I couldn't find any evidence of merchants requesting more privacy for view-only wallets.

However, since we are on the track to implement full-chain membership proofs, this might tip the balance in favor of a wallet tier that cannot strongly identify outgoing payments.

@jeffro256

Such a wallet would not be compatible with other wallet software if it was using different derivation paths and additional private keys. Yes, you can't prevent someone from inventing custom wallets that allow users to lose privacy, but it should not be supported by the official software.

I guess we should clarify what we're actually trying to do here: are we trying to describe an address protocol or a specific wallet design? Assuming that we use some deterministic seed phrase and/or we allow the view-balance key to be exported, people will be able to make worse wallet implementations that offer more efficient scanning. If we're trying to mitigate this issue at the addressing protocol level using incentives, we need to make this actually part of the protocol. Dynamic view tags are protocol level, and so are normal enote DH exchanges against addresses. Self-send secrets and account secret derivation really aren't. If the protections we're providing are based on telling users to pretty please not do certain actions with their keys, we can have the "official" wallet software not do these things, but we're back to square one with @UkoeHB's initial key-offloading observation. Enough tempting, and users might move from the "official" wallet software to a worse implementation, and then our wallet design means nothing.

I don't see exactly how it's useful for hot wallets. We already have the "PaymentValidator" tier that is intended as a (view-only) hot wallet. If you are using a hardware wallet, presumably you have a "ViewBalance" tier and the hardware wallet only stores the master key. Without seeing amounts, you can't prepare a transaction to be signed with the hardware wallet.

How cold/hot wallet setups work today is that the hot wallet is connected to the internet and has the view key, while the cold wallet is air-gapped and has the spend key. The hot wallet first scans for incoming transactions, and then sends those to the cold wallet. The cold wallet calculates key images and sends the key images back to the hot wallet to scan for outgoing transactions. When the user wants to spend, the hot wallet collects the output distribution, sends it to the cold wallet, the cold wallet signs, then sends the signed transaction back to the hot wallet, which submits to the network.

In the proposed cold/hard wallet setup, the hot wallet would only scan all involved transactions without knowing amounts and send public transaction info to the cold wallet. When a user wants to spend funds, they do input selection and ownership proofs on the cold wallet, then send the signed transaction to the hot wallet, which finishes the transaction by completing membership proofs and submitting to the network. This type of setup is orthogonal from PaymentValidator, because it can be used without a PaymentValidator wallet, but if someone wants to spend funds that are collected from a PaymentValidator, they have to have some wallet somewhere with signing capabilities. And with the "HotWallet" tier, you can export outgoing txs to a signing wallet without knowing amounts, which is especially beneficial for merchants who frequently have large amounts of income flow to turn over.

I think there is an infinite number of ways how the protocol can be made more complex with more features that we think might be useful. However, we should avoid unnecessarily bloating the specs and overloading users with choices.

This is a UX problem, and IMO out of the scope of Jamtis. Which options are given to the users is down to the exact wallet implementation, because not all wallets have to support all features. It is my opinion that we should work as hard as possible to make Jamtis as flexible as possible for all sorts of users, while minimizing risk at the protocol level by using incentives to make users have fewer reasons to bomb their privacy for the sake of efficiency. I want to reiterate that an extremely simple "solution" to scanning inefficiency is giving away the view-balance key. That is always a choice for the user, so we have to always be competing with that. Making the protocol less flexible for users will squeeze users into worse privacy tiers. I agree that if there is some "official" implementation, there shouldn't be complete freedom on the part of the users to expose certain secrets, but again, that's outside of the scope of Jamtis IMO.

Derive s^sr_2 to decrypt the amount and the 1-bit flag. The flag will tell you if this self-send should be displayed in the transaction history (because the user actively sent funds to their own address) or rather be subtracted from the spent amount (because it's a change enote).

Okay, this makes a lot of sense actually. I like this idea.

@tevador commented Sep 17, 2023

I guess we should clarify what we're actually trying to do here: are we trying to describe an address protocol or a specific wallet design?

This is a UX problem, and IMO out of the scope of Jamtis.

Jamtis specifies part of the Seraphis wallet design, which includes:

  1. The format of the mnemonic seed.
  2. How private keys are derived from the mnemonic seed.
  3. How public addresses are generated from private keys.
  4. How enotes are constructed based on public addresses.
  5. How owned enotes are recognized based on wallet private keys.

All of this must be part of the specs to avoid fragmentation of the Monero ecosystem. This way, a user can restore their mnemonic seed into any compliant wallet implementation and will see the correct balance.

For example, the original CryptoNote protocol didn't specify how the private spend key and view key are derived and as a result, we got at least two incompatible wallet designs that produce different view keys:

https://xmr.llcoins.net/addresstests.html

You may have noticed a critical difference between this style and the Electrum Style: MyMonero's Private View Key derivation is done by hashing random integer a, while Electrum Style derivation is done by hashing the Private Spend Key. This means that 13 and 25 word seeds are not compatible – it is not possible to create an Electrum Style seed (and account) that matches a MyMonero Style seed (and account) or vice versa; the view keypair will always be different.

@tevador commented Sep 17, 2023

How cold/hot wallet setups work today is that the hot wallet is connected to the internet and has the view key, while the cold wallet is air-gapped and has the spend key. The hot wallet first scans for incoming transactions, and then sends those to the cold wallet. The cold wallet calculates key images and sends the key images back to the hot wallet to scan for outgoing transactions. When the user wants to spend, the hot wallet collects the output distribution, sends it to the cold wallet, the cold wallet signs, then sends the signed transaction back to the hot wallet, which submits to the network.

Current Seraphis/Jamtis design improves this flow significantly. The hot wallet is a ViewBalance tier that can see everything and allows the user to safely prepare an unsigned transaction, including the exact list of enotes to spend and all outputs. Then there is one interaction with the cold wallet, which only displays the most basic information for confirmation and then sends the ownership proofs to the hot wallet. The hot wallet then performs decoy selection, completes the membership proof and submits the transaction to the network.

In the proposed cold/hard wallet setup [...]

I don't see a strong case to support the additional tiers in the official specs.

input selection [...] on the cold wallet

Hardware wallets have a tiny screen and a few buttons, which are enough to display and confirm the intent, but are completely inadequate for a full wallet interface. The setup with a ViewBalance main wallet and a signing hardware wallet has a much more appealing UX and achieves the main goal why users purchase hardware wallets: theft protection.

@jeffro256

Current Seraphis/Jamtis design improves this flow significantly. The hot wallet is a ViewBalance tier that can see everything and allows the user to safely prepare an unsigned transaction, including the exact list of enotes to spend and all outputs. Then there is one interaction with the cold wallet, which only displays the most basic information for confirmation and then sends the ownership proofs to the hot wallet. The hot wallet then performs decoy selection, completes the membership proof and submits the transaction to the network.

This new flow would still be possible with almost exactly the same operations if you separate one-time address recovery and amount recovery with different keys.

I don't see a strong case to support the additional tiers in the official specs.

I think this might be tied to the following argument you make later:

The setup with a ViewBalance main wallet and a signing hardware wallet has a much more appealing UX and achieves the main goal why users purchase hardware wallets: theft protection.

For the same reason why one would want theft protection, one would want to hide amounts: mitigating compromised internet-connected computing devices. If you are working hard to mitigate this threat, I don't see why it would be such a leap to assume that some wouldn't also want to hide the balances from their assumed-to-be-compromised devices, given the option.

Hardware wallets have a tiny screen and a few buttons, which are enough to display and confirm the intent, but are completely inadequate for a full wallet interface

They wouldn't have to implement any interfacing, just the transaction input selection algorithm, which is neither a compute-heavy nor a memory-heavy task. At any rate, this also wouldn't be a problem for cold wallets using a normal laptop/desktop/smartphone.

@jeffro256 commented Sep 18, 2023

All of this must be part of the specs to avoid fragmentation of the Monero ecosystem. This way, a user can restore their mnemonic seed into any compliant wallet implementation and will see the correct balance.

The key word here is compliant. Thus far we've been trying to mitigate people moving off of a compliant wallet design to something more suitable. The only way we can do this is by using incentives, which is why it's my opinion that it's fruitless to base the privacy off of optionally doing something with your keys, given that you can interact with the ecosystem all the same. As such, telling users to not multiply by some d_ua factor is merely a recommendation, and we should not expect the ecosystem to evolve in that manner. And since, IMO, separating the one time address recovery and amount recovery in self-send enotes provides a tangible benefit for a real use case without affecting the performance of others, we should move forward with that feature.

@jeffro256

Regardless of the tag_size, the view tag is always encoded in 2 bytes as a 16-bit integer per enote. The remaining bits are filled with a "complementary" view tag calculated from s^sr_1, which needs a different private key.

I like the complementary view tag as long as s^sr_1 depends upon DH_2 only, and not DH_1. The only difference I would make here is that since the length of the view tag needs to be encoded anyways with 4 bits, we can extend the complementary view tag by 4 bits to squeeze extra performance out of the complementary view tag without increasing transaction sizes. As for the table you provided, if we use the technique of encoding the enote type into the blinding factor, we only have to recalculate K_o once for any enote.

@tevador commented Sep 18, 2023

one would want to hide amounts

Ideally, one would want to hide everything.

Users can just share d_fr with the hot "wallet" and scan the remaining 200 KB/day on the cold wallet. Assuming a USB 2.0 interface between the hot and the cold wallet, you can easily download several months of matches per second. This way, the hot wallet doesn't even learn which outputs you own. The bottleneck would be the DH calculation in the hardware wallet, but even a tiny Cortex-M4 can do a few hundred per second (1-2 minutes per month of enotes, bearable for this level of privacy).
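A quick sanity check of those figures (assumed rates: ~300 DH/s on a Cortex-M4 and 256 bytes per match, as estimated earlier):

```python
matches_per_day = 720
days = 30
dh_per_second = 300                    # "a few hundred per second" on a Cortex-M4
bytes_per_match = 256

monthly_matches = matches_per_day * days                    # 21 600 matches
scan_minutes = monthly_matches / dh_per_second / 60         # ~1.2 minutes of DH per month
transfer_mb = monthly_matches * bytes_per_match / 2**20     # ~5.3 MB moved to the cold wallet
```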

They wouldn't have to implement any interfacing, just the transaction input selection algorithm

Most wallets nowadays allow for manual input selection, which needs an interface that lists all unspent inputs and their amounts.

The key word here is compliant.

I'm pretty sure that most wallet implementations will follow the specs. "Non-compliant" is not something you want to explain to your users.

As such, telling users to not multiply by some d_ua factor is merely a recommendation, and we should not expect the ecosystem to evolve in that manner.

IMO, "users might do it anyways" is not a valid reason to put something in the specs.

The original reason why d_ua was added is explained in this comment:

In the current version of Jamtis, a find-received service and generate-address hub can combine to create a payment validator. This seems suboptimal, so I'd like to add an internal private key to the key structure.

This issue no longer applies to the proposed scheme, so I don't see a reason to keep it. Having a separate wallet tier that can identify all owned enotes without amounts facilitates the existence of scanning services that harm users' privacy. Remote scanners that don't have to deal with decoy enotes have much smaller storage and bandwidth requirements, so I see this wallet tier as the most likely one to be used. Unfortunately, it also makes the dynamic tag feature and the extra pubkey in each address redundant.

The only difference I would make here is that since the length of the view tag needs to be encoded anyways with 4 bits, we can extend the complementary view tag by 4 bits to squeeze extra performance out of the complementary view tag without increasing transaction sizes.

It would actually increase transaction size a tiny bit because you would need an extra byte per output, while the current proposal only needs a byte for the whole tx. I'm also not sure if allowing different view tag sizes for every output is a good idea for tx uniformity.

As for the table you provided, if we use the technique of encoding the enote type into the blinding factor, we only have to recalculate K_o once for any enote.

At least twice, because self-sends have a different shared secret.

@jeffro256

As for the view tag filter target, when you say "target" here, do you mean in the sense that it is a relay-enforced minimum amount of filtering to do?

The filter target should be 480 enotes/day. Because the view tag filter rate must be a power of 2, this will actually result in a range from 480 to 960 enotes per day depending on the tx volume. If we "average the averages" over all possible values of tx volume, this will give a mean of 720 enote matches per day, or roughly 1 match per block, which is what was suggested by @jeffro256. I think this is close to the upper limit of what is acceptable for light wallet clients (~200 KB/day) and should provide a good number of false positives even if there was a short term drop in tx volume.

The formula to calculate the view tag size in bits is:

tag_size = trunc(log2(3 * num_outputs_100k / 200000))

I think we should also enforce a maximum size as well for uniformity purposes, make the minimum not so lax, while also allowing wiggle room for fluctuation between signing and propagating. Let's say that we want to plan for a fluctuation of 25% in the value of num_outputs_100k between signing and submitting. Nodes could do the following:

min_relay_tag_size = round(log2(num_outputs_100k * 3 / 4 / 100000))
max_relay_tag_size = round(log2(num_outputs_100k * 5 / 4 / 100000)) 

The wallets would do the following:

wallet_tag_size = round(log2(num_outputs_100k / 100000))

When it came time for transaction verification, nodes would check that min_relay_tag_size <= wallet_tag_size <= max_relay_tag_size.
The worst case scenario for a wallet is that between signing and propagating, the enote volume over the last 100k blocks goes up or down more than 25%, and the transaction does not enter the mempool and has to be edited. Also, round(log2()) can be implemented with integer instructions portably as well, but it's a little more complicated.
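One possible integer-only implementation of round(log2(x)), sketched here rather than taken from any existing codebase, compares x² against the rounding boundary sqrt(2) * 2^t:

```python
def round_log2(x: int) -> int:
    t = x.bit_length() - 1                              # trunc(log2(x))
    return t + 1 if x * x >= 1 << (2 * t + 1) else t    # x >= sqrt(2) * 2^t rounds up

assert round_log2(5) == 2    # log2(5)  ~ 2.32
assert round_log2(6) == 3    # log2(6)  ~ 2.58
assert round_log2(11) == 3   # log2(11) ~ 3.46
```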

@tevador commented Sep 18, 2023

As for the view tag filter target, when you say "target" here, do you mean in the sense that it is a relay-enforced minimum amount of filtering to do?

No, I mean this is the target we plug into the formula. You can see that the bitsize is calculated for 2/3 of matches per block. If we could have fractional bits, the target of 480 enotes/day would be exact. The range 480-960 is an artifact of rounding down.

wallet_tag_size = round(log2(num_outputs_100k / 100000))

Since round(log2(num_outputs_100k / 100000)) = trunc(log2(num_outputs_100k / 100000)+0.5) = trunc(log2(sqrt(2) * num_outputs_100k / 100000)), you'll effectively get a range from 720/sqrt(2) to 720*sqrt(2), which is about 509-1018. Is there a benefit to this compared to the range 480-960 I'm proposing?

I think we should also enforce a maximum size as well for uniformity purposes

My idea was that nodes would use the same formula, but would apply the formula at each of the last let's say 10k blocks and build a range based on that. Effectively, this would give you all possible view tag sizes from the last 10k blocks as valid values. Most often, this would just be a single value. This means transactions would never be invalidated unless they took more than 10k blocks between signing and propagation.

@jeffro256

It would actually increase transaction size a tiny bit because you would need an extra byte per output, while the current proposal only needs a byte for the whole tx. I'm also not sure if allowing different view tag sizes for every output is a good idea for tx uniformity.

Okay yes, this is a much better idea... I was assuming that view tag sizes were going to be per-enote.

At least twice, because self-sends have a different shared secret.

True, my bad

Ideally, one would want to hide everything.

Users can just share d_fr with the hot "wallet" and scan the remaining 200 KB/day on the cold wallet. Assuming a USB 2.0 interface between the hot and the cold wallet, you can easily download several months of matches per second. This way, the hot wallet doesn't even learn which outputs you own.

This is a really excellent reason to not include d_ua: the ones who are paranoid enough to do this type of setup would probably be willing to spend a couple extra minutes per month on DH calculations. Okay, I'm sold on keeping out d_ua for now.

@jeffro256

My idea was that nodes would use the same formula, but would apply the formula at each of the last let's say 10k blocks and build a range based on that. Effectively, this would give you all possible view tag sizes from the last 10k blocks as valid values. Most often, this would just be a single value. This means transactions would never be invalidated unless they took more than 10k blocks between signing and propagation.

This covers almost everything except for slight future changes. There are three edge-case scenarios in which honestly built transactions might fail to propagate on the network:

  1. The wallet's node is ahead of its other connected nodes, and its current tag_size value is a previously unseen value because of a slightly different num_output_100k. The tx will be accepted on this node, but not propagated to other mempools (depending on the exact p2p rules, this issue might be mitigated b/c nodes won't share mempool information until they are synced to the same chain height).
  2. Same as scenario #1, but the inconsistency of this node with the network is due to a reorg
  3. The node gives a wallet a num_output_100k value, then reorgs, then the wallet submits a tx with an invalid num_output_100k. The tx will not enter any mempool, but at least this time, the user will get an error message.

In addition, this solution doesn't require storing a history of allowed values (although you could make the history very small with O(1) access using a map of values -> number of instances of that value).

you'll effectively get a range of from 720/sqrt(2) to 720*sqrt(2), which is about 509-1018. Is there a benefit to this compared to the range 480-960 I'm proposing?

Not necessarily, although the range is biased towards a slightly higher degree of privacy.

@tevador commented Sep 19, 2023

The wallet's node is ahead of its other connected nodes, and its current tag_size value is a previously unseen value

I don't think we need to handle this case. It can already happen with decoys, which might still be invalid (locked) for other nodes that are behind. I don't know how it's handled currently.

reorg

This can be solved by shifting the calculation 10 blocks to the past, i.e. we will use the range [chain_tip-100009, chain_tip-10] instead of [chain_tip-99999, chain_tip] to calculate the view tag size. If there was a reorg deeper than 10 blocks, transactions could be invalidated anyways due to invalid decoys.

the range is biased towards a slightly higher degree of privacy

Actually, it is biased towards a lower degree of privacy.

The two formulas can be approximated as follows:

trunclog2(3 * x / 200000) ~ trunclog2(x / 66667)
roundlog2(x / 100000)     ~ trunclog2(x / 70711)

These approximations are accurate to within 0.01%, which is more than enough for the intended use case. trunclog2 is a very simple function that returns the index of the most significant one-bit. It can be calculated by repeated shifting and some CPUs even have a dedicated instruction for it (x86 BSR).
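To see where the min and max columns in the table below come from, here is a small sketch (helper name is mine): with tag_size = trunclog2(x / divisor), the ratio x / (divisor * 2^tag_size) always lies in [1, 2), so the per-block match rate lies in [divisor/100000, 2*divisor/100000).

```python
def daily_match_bounds(divisor: int) -> tuple[float, float]:
    # 720 blocks per day; per-block rate is bounded by [divisor/100000, 2*divisor/100000).
    low = 720 * divisor / 100000
    return low, 2 * low

print(daily_match_bounds(66667))   # ~(480, 960)  matches/day
print(daily_match_bounds(70711))   # ~(509, 1018) matches/day
```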

I ran the numbers for the two distributions and here are the results (converted to the number of false-positive matches per day):

| formula | min | max | mean | st. dev. | median |
| --- | --- | --- | --- | --- | --- |
| trunclog2(x / 66667) | 480.0 | 960.0 | 720.0 | 138.6 | 720.0 |
| trunclog2(x / 70711) | 509.1 | 1018.2 | 720.0 | 144.8 | 689.1 |

The second formula has a median lower than the mean, which means it's skewed towards smaller values. It makes sense if you think about it: both distributions have the same mean of 720 but the second one has a higher min and max, so it must be skewed.

@jeffro256 commented Sep 19, 2023

If we can accept a worst case filtering deviation of 2x, versus 1.5x or 1.41x with our current ranges, then we can get a best-case aggregate deviation of 0x with user cooperation using the following scheme. For any given value of num_output_100k, consensus rules allow wallets to choose between two values of view tag filter: t_1 = trunclog2(num_output_100k / 100000) and t_2 = t_1 + 1 (your stored history method can be used here or not). The wallet gets to choose between these two values, and the general idea is that they will pick between these two values in a ratio that makes the aggregate filtering rate close to 720/day, no matter what the value of num_output_100k is. This is how it is done:

Our aggregate filtering rate can be defined as follows, where v is enote volume per block (this is to make calculations simpler, practically we would set v = num_outputs_100k / 100000 or some other way of smoothing this value):

F(v) = v / (w(v) * p(v) + (1 - w(v)) * 2p(v))

where p(x) = 2 ^ trunclog2(x) [the greatest power of 2 less than or equal to x]
     and v = enote volume per block
     and w(v) is a weight function with values [0, 1] between choice of tag size t_1 and t_2 for a given v

Ideally, we want F(v) = 1 (aggregate filtering rate is 1 enote per block). If we set F(v)=1 and solve for w(v), we get:

w(v) = 2 - v/p(v)

So to pick between t_1 and t_2, wallets will generate a random value c in range [0, 1]. If c <= w(v), then wallets will pick t_1, else they will pick t_2.

This will get us close to the ideal filtering rate, 720 enotes/day, no matter what num_outputs_100k is, assuming most people cooperate. If we don't assume that, the aggregate filtering rate can swing anywhere from 360 to 1440 enotes/day.
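A sketch of the wallet-side choice, assuming t_1 = trunclog2(v) with v = num_outputs_100k / 100000 (the helper names are mine):

```python
import random

def trunclog2(x: int) -> int:
    return x.bit_length() - 1

def pick_tag_size(num_outputs_100k: int) -> int:
    v = num_outputs_100k / 100000         # enote volume per block
    t1 = trunclog2(int(v))
    p = 1 << t1                           # greatest power of 2 <= v
    w = 2 - v / p                         # weight that targets ~1 match per block
    return t1 if random.random() <= w else t1 + 1
```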

@jeffro256

We could optionally make the choice c deterministic as a function of num_output_100k and input_context, which would maybe increase uniformity while having the same effect for the aggregate filtering target. The downside is transactions would be a couple bytes bigger because we would need to encode num_output_100k instead of t_1 or t_2. Also, if a user can choose multiple values for input_context (e.g. by changing their ring member set), then they could brute-force c. However, that's a lot of work to encode 1 bit worth of information; there are already much, much more efficient ways to encode arbitrary data. Overall, probably not worth making c deterministic, but that's always an option.

@tevador commented Sep 20, 2023

Here is a better method that follows the target of 720 enotes/day and doesn't rely on user cooperation.

Global parameters

There is a single global parameter:

filter = max(1, 6553600000 / num_outputs_100k)

It's an integer between 1 and 65536 and should be specified in every transaction. This needs 2 bytes per tx, but it still has a slightly smaller blockchain footprint than the current Jamtis spec:

# of tx outputs current spec proposal
2 6 bytes (2x view tag, 2x tag hint) 6 bytes (1x filter, 2x view tag)
16 48 bytes (16x view tag, 16x tag hint) 34 bytes (1x filter, 16x view tag)

Enote parameters

Every enote has two "fingerprints":

  1. fingerprint1 = H("jamtis_fingerprint1" || DH_1 || K_o) % 2^16
  2. fingerprint2 = H("jamtis_fingerprint2" || s^sr_1) % 2^64

fingerprint1 is a 16-bit integer and fingerprint2 is a 64-bit integer.

View tag derivation

The 16-bit view_tag is calculated as follows:

view_tag = (fingerprint2 % filter - fingerprint1) % 2^16

Checking for a match

The view tag is checked with the following condition:

(fingerprint1 + view_tag) % 2^16 < filter

For non-owned enotes, (fingerprint1 + view_tag) % 2^16 is a uniformly distributed random number, so it will match for filter/65536 enotes on average, which simplifies to 1 enote per block if we substitute the definition of filter.

For owned enotes, (fingerprint1 + view_tag) % 2^16 equals fingerprint2 % filter by construction, which is always less than filter. Additionally, if we know fingerprint2, we can eliminate all but 1/filter false matches by checking:

(fingerprint1 + view_tag) % 2^16 ?= fingerprint2 % filter

This gives an overall false positive rate of 1/65536 for wallets that are able to calculate both fingerprints.
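A sketch of the whole flow in one place (the hash is a placeholder; the real construction would use the Jamtis domain-separated hash functions over proper byte serializations):

```python
import hashlib

def H(*parts: bytes) -> int:
    # Placeholder hash, standing in for the Jamtis domain-separated hashes.
    return int.from_bytes(hashlib.blake2b(b"".join(parts)).digest(), "little")

def filter_param(num_outputs_100k: int) -> int:
    return max(1, 6553600000 // num_outputs_100k)

def make_view_tag(DH_1: bytes, K_o: bytes, ssr_1: bytes, flt: int) -> int:
    fp1 = H(b"jamtis_fingerprint1", DH_1, K_o) % 2**16
    fp2 = H(b"jamtis_fingerprint2", ssr_1) % 2**64
    return (fp2 % flt - fp1) % 2**16

def passes_filter(DH_1: bytes, K_o: bytes, view_tag: int, flt: int) -> bool:
    # First check (FilterReceived): matches ~filter/65536 of non-owned enotes.
    fp1 = H(b"jamtis_fingerprint1", DH_1, K_o) % 2**16
    return (fp1 + view_tag) % 2**16 < flt

def passes_full_check(DH_1: bytes, K_o: bytes, ssr_1: bytes, view_tag: int, flt: int) -> bool:
    # Second check: leaves ~1/65536 false positives overall.
    fp1 = H(b"jamtis_fingerprint1", DH_1, K_o) % 2**16
    fp2 = H(b"jamtis_fingerprint2", ssr_1) % 2**64
    return (fp1 + view_tag) % 2**16 == fp2 % flt
```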

@tevador commented Sep 20, 2023

I was interested to see how the dynamic view tags work in practice, so I took real blockchain data (the number of RingCT outputs in every block) from about March 2018 to August 2023 (blocks 1519796-2959795). It's a total of 1440000 blocks, interpreted as 2000 days times 720 blocks per day (in reality it was about 2004 days).

Method 1

The first method uses the "discrete" view tags based on the formula:

view_tag_size = trunclog2(num_outputs_100k / 66667)

[figure: view_tag_method1.png]

We can see that the view tag size nicely follows the long-term trend of growth, growing from 4 bits in 2018 to 6 bits, with a few periods of 7 bits during high transaction volume in 2022. Around day 1700, fluctuations in the view tag size occur for about 4 days, which is the time when the explicitly encoded size would come handy.

[figure: daily_matches_method1.png]

The number of false positive matches roughly follows the target of 720/day, with some significant fluctuations. Around day 400, the daily matches shot to over 2000/day, while the lowest number of matches recorded is 231 on day 1624. The average over the whole period is 780 matches/day.

Method 2

The second method uses the "smooth" view tag method described in the previous comment.

[figure: view_tag_method2.png]

Here the filter rate tracks the long-term trends much more precisely.

[figure: daily_matches_method2.png]

However, if we look at the number of false positive matches, there is not a huge qualitative difference from the first method. It follows the target of 720/day slightly more closely, but short-term tx volume fluctuations still make it deviate quite far. Around day 400, the daily matches also exceed 2000. The lowest number of matches is 301 on day 960. The average over the whole period is 762 matches/day.

Conclusion

The "smoother" methods of following the daily target are not much better than the simple discrete view tag due to short-term tx volume fluctuations. I therefore think that we should adopt method 1, which is simpler and better for tx uniformity.

@jeffro256

view_tag = (fingerprint2 % filter - fingerprint1) % 2^16

Damn, that's clever! I don't know if it was intentional, but here's a cool feature of doing the view tags this way: you can't check against the view tag using only knowledge of fingerprint2; fingerprint1 acts as a random "mask" to the fingerprint2 % 2^16 value. What this means for scanning setups is that the incentive to only send d_vr instead of d_fi/d_fr is destroyed! If you only send d_vr to a light wallet server, the only thing they can do is compute s^sr_1 and nominal address tags, but they'd have to send you that information for every single enote; they can't actually weed any out. If they also had s_ct, they could 100% identify all incoming enotes, but it would require huge amounts of processing for the server (since they can't use view tags), and it still wouldn't cover self-sends.

If you were calculating fingerprint1 and fingerprint2 separately (i.e. light wallet), the server would need to send fingerprint1 unless the client wanted to do 2x DH ops instead of 1x DH ops per filtered enote, but since that's only 2 bytes, it's still much, much smaller than the nominal address tag (16 bytes), which is what it would replace.

For this reason alone, I think this is the best way thus far.

Here the filter rate tracks the long-term trends much more precisely.

The "smoother" methods of following the daily target are not much better than the simple discrete view tag due to short-term tx volume fluctuations

I think the long term rate of filtering is more important anyways. When a user is scanning a small amount of volume, it won't matter much from a UX perspective if the small volume is a little bigger. It's when the scanning process would otherwise take many minutes or even hours (poor souls), that the long-term filtering rate would make a difference performance-wise.

I therefore think that we should adopt method 1, which is simpler and better for tx uniformity.

Since we ostensibly have to choose filter from a common, public, deterministic list of values, what would this non-uniformity tell us exactly? It would (maybe) reveal the time we constructed the transaction to the granularity of block-time (2 minutes). In most cases, this is already known by the nature of how transactions propagate in nodes' mempools. Discretized fees and ring member selection also leak this information. And in the future, if/when we go for FCMPs, a hash of the root of the curve tree for a given block will need to be included to verify the transaction, which would further cement for external observers when in time a transaction was constructed.

@jeffro256
Copy link

you can't check against the view tag using only knowledge of fingerprint2; fingerprint1 acts as a random "mask" to the fingerprint2 % 2^16 value

Thinking about it now, this is orthogonal to the smoothness of the view tag filtering rate, we could always include a 2-byte residue of DH_1 in the calculation of the complementary view tag.

@jeffro256
Copy link

jeffro256 commented Sep 21, 2023

With that in mind, I'd agree that doing tag_size = trunclog2(3 * num_outputs_100k / 200000) is probably best. Also, thanks for doing those simulations, that was actually really insightful!
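As a rough sketch of that formula, assuming `trunclog2` is a truncated (floored) base-2 logarithm and `num_outputs_100k` is the enote count of the last 100,000 blocks (the clamp for very small volumes is my own addition):

```python
import math

def tag_size(num_outputs_100k: int) -> int:
    # tag_size = trunclog2(3 * num_outputs_100k / 200000), as quoted above
    x = 3 * num_outputs_100k / 200_000
    return math.floor(math.log2(x)) if x >= 1 else 0

# e.g. tag_size(2_000_000) == 4, since log2(30) is truncated to 4
```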

@tevador
Copy link
Author

tevador commented Sep 21, 2023

It would (maybe) reveal the time we constructed the transaction to the granularity of block-time (2 minutes). In most cases, this is already known by the nature of how transactions propagate in nodes' mempools. Discretized fees and ring member selection also leak this information. And in the future, if/when we go for FCMPs, a hash of the root of the curve tree for a given block will need to be included to verify the transaction, which would further cement for external observers when in time a transaction was constructed.

It would leak the approximate time when the transaction was signed, which might be a long time before it's actually submitted to the network (e.g. multisig or offline-signed transactions). This might be considered to be a regression because Seraphis already allows for the membership proof to be added just prior to publishing the transaction, which removes the leaks caused by the member selection and only leaves fees as a possible leak.

It could be solved by removing view tags from the signed data and signing them later with the membership proof, but that might be problematic.

@jeffro256
Copy link

jeffro256 commented Sep 25, 2023

After working on implementing the new changes, I think the complementary view tag should be bound to DHE_2 instead of s^sr_1 (as well as a residue of the primary view tag calculation). The reason for this is that, if we make the complementary view tag a function of s^sr_1, in most cases we will need to do the DH operation anyways, but now we're also doing 4 hash operations (derive plain s^sr_1, hash plain s^sr_1 -> complementary view tag, derive self-send s^sr_1, hash self-send s^sr_1 -> complementary view tag) instead of 1 (hash DHE_2 -> complementary view tag) for each of the ~720 enotes/day that pass the primary filter. So in summary, this is how I think the view tag computations should go:

npbits = the number of primary view tag bits, explicitly mentioned in the transaction
ncbits = the total size of the view tag in bits - npbits
primary_view_tag || primary_view_tag_residue = H(DHE_1, K_o)
complementary_view_tag = H(DHE_2 || primary_view_tag_residue)
view_tag = primary_view_tag[first npbits] || complementary_view_tag[first ncbits]
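A minimal Python sketch of that layout, with BLAKE2b standing in for H; the serialization, domain separation, 16-bit residue width and hash choice are assumptions, not the actual Seraphis/Jamtis derivations:

```python
import hashlib

def derive_view_tag(dhe_1: bytes, dhe_2: bytes, K_o: bytes,
                    npbits: int, total_bits: int = 24) -> int:
    ncbits = total_bits - npbits
    h1 = int.from_bytes(hashlib.blake2b(dhe_1 + K_o, digest_size=8).digest(), "big")
    primary_view_tag = h1 >> (64 - npbits)                  # first npbits bits of H(DHE_1, K_o)
    primary_residue = (h1 >> (64 - npbits - 16)) & 0xFFFF   # next 16 bits (assumed residue width)
    h2 = int.from_bytes(
        hashlib.blake2b(dhe_2 + primary_residue.to_bytes(2, "big"),
                        digest_size=8).digest(), "big")
    complementary_view_tag = h2 >> (64 - ncbits)            # first ncbits bits of H(DHE_2 || residue)
    return (primary_view_tag << ncbits) | complementary_view_tag
```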

The only downside I can think of when complementary_view_tag binds DHE_2 instead of s^sr_1 is that DLP solvers can check both view tags on self-send transactions (versus just the primary view tag on self-sends) if they know your public address. However, they will not know the types of self-sends nor what the outgoing amounts are.

@tevador
Copy link
Author

tevador commented Sep 25, 2023

I think the complementary view tag should be bound to DHE_2 instead of s^sr_1

This has an undesirable side effect of revealing¹ self-spends to anyone in possession of d_vr, e.g. a PaymentValidator tier, which would only be able to calculate primary_view_tag otherwise. Because PaymentValidator is likely to be a hot wallet with an increased risk of key compromise, I think this side effect should be avoided and the complementary_view_tag should be calculated differently for self-sends. The cost of this can be just 1 extra hash, which is negligible compared to the 2 DHE calculations that precede it.

¹ Technically, it reduces the false-positive rate to 1/65536, which is about 6 false matches per week with current tx volume.

@jeffro256
Copy link

Ah shoot you're right, I wasn't thinking about that tier... nevermind.

@jeffro256
Copy link

What is the reasoning for these types? AFAICS we only need 2 types to tell the wallet if the enote should be displayed in history or not (this could also be achieved with a 1-bit flag encrypted with s^sr_2, so only 1 extra K_o recomputation is needed).

We can't do this because if you have a 2-output tx (using a shared xK_e), and one of your outputs is a self-spend and the other is a change output, K_o will be shared between the enotes which will 1) reveal that this is a tx where a user is trying to churn and 2) burn funds for one of the enotes.

So we will need to do 2x extra K_o re-computations (3 total) for each enote that matches both view tags.

@tevador
Copy link
Author

tevador commented Sep 27, 2023

We can't do this because if you have a 2-output tx (using a shared xK_e), and one of your outputs is a self-spend and the other is a change output, K_o will be shared between the enotes

This can be fixed by including the output index in the shared secret calculation. That would make both outputs have unique K_o.
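A minimal sketch of that fix, mixing the output index into the per-enote secret so two outputs to the same address still derive distinct K_o (hash choice and serialization are placeholders, not the actual derivation):

```python
import hashlib

def per_enote_secret(shared_secret: bytes, input_context: bytes, output_index: int) -> bytes:
    # Including output_index makes the derived secret, and hence K_o,
    # unique per output even when the recipient address is repeated.
    return hashlib.blake2b(
        shared_secret + input_context + output_index.to_bytes(4, "little"),
        digest_size=32,
    ).digest()
```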

In fact, the same problem applies to normal enotes. If someone sends a 2-out tx where both outputs go to the same address, both outputs will have the same K_o. This is non-standard, but AFAIK the protocol allows it. This shows that the input_context that only consists of key images is insufficient to ensure the uniqueness of K_o and fails to prevent "the burning bug".

@jeffro256
Copy link

In fact, the same problem applies to normal enotes. If someone sends a 2-out tx where both outputs go to the same address, both outputs will have the same K_o. This is non-standard, but AFAIK the protocol allows it.

The protocol doesn't allow it though. One of the rules of Jamtis is that every transaction contains at least one self-send output (for this reason, as well as allowing third-party light wallet servers to trim the key image set and give the clients access to their outgoing transactions). If you have 2 normal outputs to the same destination, and need at least one self-send, that means you wouldn't be doing the shared xK_e optimization.

@jeffro256
Copy link

Also, having K_o bound to the tx output index would be really annoying (i.e. it would involve brute-forcing private ephemeral keys) since IIRC the enotes in Seraphis are ordered by one-time addresses.

@tevador
Copy link
Author

tevador commented Sep 27, 2023

The protocol doesn't allow it though. One of the rules of Jamtis is that every transaction contains at least one self-send output

How is this rule enforced?

Imagine the following scenario:

Mallory registers at an exchange and is provided with a deposit address. She crafts a 2-output transaction without change, sending both outputs to the deposit address, each output worth 1000 XMR. In order to do this, she needs to provide inputs with a total sum of exactly 2000 XMR + fee, but that should not be hard to do.

Unless the exchange has a wallet that is aware of the burning bug, Mallory will be credited with 2000 XMR and can proceed to withdraw the funds back to her custody. However, the exchange will later realize that only one of the 1000 XMR outputs can be spent. This scam can be repeated until the wallet of the exchange is completely drained. It only costs some tx fees.

Relying on all wallet implementations to be able to detect this bug is not going to work, so there are basically two solutions:

  1. Mandating unique K_o within each transaction as a consensus rule.
  2. Including the output index when deriving K_o.

IIRC the enotes in Seraphis are ordered by one-time addresses

Is this a consensus rule or just a recommendation for tx builders?

@jeffro256
Copy link

jeffro256 commented Sep 27, 2023

How is this rule enforced?

That rule specifically is not enforced at a consensus level, it's just a Jamtis rule-of-thumb that is derived from the Seraphis protocol consensus rule that enote outputs within a transaction should be ordered and unique by one-time address. See this code for details and implementation: https://github.com/UkoeHB/monero/blob/eeca802ccee217d26acd8bc89ee69bbd3c47e254/src/seraphis_main/tx_validators.cpp#L365-L367.

In this way, and assuming that input_context differs from transaction to transaction, all cases of the burning bug should be covered.

@tevador
Copy link
Author

tevador commented Sep 27, 2023

consensus rule that enote outputs within a transaction should be ordered and unique by one-time address

Cool. It's the first time I hear about this rule. Maybe it's worth adding it to the Seraphis specs?

The current "Implementing Seraphis" paper says the following:

To further ensure uniqueness within a transaction, transaction verifiers must mandate that all values K_e in a transaction are unique.

Uniqueness of K_e is not sufficient to prevent the burning bug as shown above.

@jeffro256
Copy link

Uniqueness of K_e is not sufficient to prevent the burning bug as shown above.

That's true and a good thing to point out more explicitly in the spec. I can open an issue on that repo to clarify that passage.

@jeffro256
Copy link

jeffro256 commented Sep 28, 2023

This brings me to an interesting privacy hiccup when distributing xk_fr to a third party, under both the new and old schemes: depending on the size of the previous view tag / primary view tag, a third party will see that outgoing transaction enotes are exponentially more likely to be owned by a user the more self-send enotes there are. This affects both light wallets and people using the payment validator tier.

We can reason that the number of successful view tag checks within a transaction unrelated to you follows a binomial distribution. Each view tag check is a Bernoulli trial, so we can expect the number of successful view tag checks X for a transaction with n outputs to follow the distribution X ~ B(n, VTFP), where VTFP is the view tag false positive rate. The probability mass function for getting k view tag matches can be written as P(X = k) = (n choose k) * VTFP^k * (1 - VTFP)^(n-k). As an extreme example, someone may implement a PocketChange-like feature which breaks up outputs to help users work around the 10-block lock. Let's say they create 16 self-send outputs and the false positive rate is 1/256. All 16 outputs will be matched by view tag, which should normally only have a probability of 2.939 x 10^-39 (the same chance as randomly guessing someone's AES key). This can also happen with 2-output transactions with one self-spend and one change, although not as severe: the probability should be 1/65536.
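For reference, the figures above follow directly from the stated binomial model:

```python
from math import comb

def p_matches(n: int, k: int, vtfp: float) -> float:
    # P(X = k) for X ~ B(n, VTFP): exactly k view tag matches among the
    # n outputs of a transaction unrelated to the scanner.
    return comb(n, k) * vtfp**k * (1 - vtfp)**(n - k)

print(p_matches(16, 16, 1 / 256))  # ~2.94e-39: all 16 pocket-change outputs match
print(p_matches(2, 2, 1 / 256))    # 1/65536: both outputs of a 2-out tx match
```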

We need a way to have third parties scan the information they need without this privacy downside. I propose that we split up the self-send types into three self-send types: SELF_SPEND, PLAIN_CHANGE, & AUXILIARY_CHANGE. When doing an outgoing transaction, enote types SELF_SPEND XOR PLAIN_CHANGE (one or the other, not both) will always be present. For these enotes, primary view tags will be calculated as normal. For any additional desired self-sends, we set the primary view tag to random bits and the self-send type to AUXILIARY_CHANGE, but do everything else the same (meaning binding the self-send type to s^sr_1). When it comes time to scan, we also scan all enotes in transactions in which any of the view tags matched even if their view tag did not match (hence "auxiliary"), but only scan them for type AUXILIARY_CHANGE. Unfortunately, this change will more than double the bandwidth required for light wallet clients, but only marginally affect compute time as no extra DH ops are required, and depending on the complementary view tag size, most enotes won't have to have K_o recomputed.
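A sketch of the resulting light-wallet-server filtering logic under this proposal (all names here are placeholders, not real API): if any enote in a transaction matches the primary view tag, every other enote of that transaction is forwarded as well, but the client only tries those extra enotes against the AUXILIARY_CHANGE derivation.

```python
def select_enotes_for_client(tx_enotes, primary_tag_matched):
    """primary_tag_matched[i] is True if enote i passed the primary view tag check."""
    if not any(primary_tag_matched):
        return []  # nothing in this tx is forwarded to the client
    selected = []
    for enote, matched in zip(tx_enotes, primary_tag_matched):
        # Matched enotes get the normal plain + self-send scan; the others are
        # forwarded too, but only tried against the AUXILIARY_CHANGE type.
        selected.append((enote, "full_scan" if matched else "auxiliary_only"))
    return selected
```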

@jeffro256
Copy link

jeffro256 commented Sep 28, 2023

This wouldn't have to slow down non-auxiliary enote scanning at all (besides an extra amount commitment recomputation on an already confirmed owned enote) due to the following reason: since we assume that exactly one of SELF_SPEND or PLAIN_CHANGE is present in a transaction, they can share the same s^sr_1 derivation, and only have s^sr_2 derivation differ (this avoids the problem of sharing xK_e leading to the same K_o). The s^sr_1 derivation for AUXILIARY_CHANGE would differ, which leaves us with the same number of K_o re-computations that we have to do: 1x for plain check and 2x for self-send check.

For any additional desired self-sends, we set the primary view tag to random bits...

To speed up auxiliary enote scanning, we could actually fill the primary view tag bits up with all complementary view tag bits, since we don't care about it matching anyways, but we're also going to check the complementary view tag.

@tevador
Copy link
Author

tevador commented Sep 29, 2023

Responding here to a reddit comment.

My first major issue with them is that they break one of the big improvements promised by Seraphis, that being the ability to sign a transaction, let it sit, and then broadcast it later by creating the membership proof just before being shipped off.

Dynamic view tags behave exactly like dynamic fees in this regard. When signing a tx, you already commit to the chain state by selecting a fee amount. If the dynamic fees adjust upwards before the tx is submitted, you might have to mine the tx yourself as it won't be relayed. The same applies to dynamic view tags, with the minor difference that the tag size can adjust both upwards and downwards. The dynamic tag size will adjust very infrequently in typical situations. With current chain history, the last adjustment would have been about a year ago (chart). You can mitigate the risk for pre-signed transactions by signing two versions with the 2 most likely view tag sizes.

Second is that it makes Monero even more complex.

That's a non-argument. All new features make Monero even more complex. The question is if the complexity is worth the benefits it brings. For dynamic view tags, I think the answer is yes if we're already introducing an extra public key in every address just to support 3rd party scanning. If we don't adopt dynamic view tags, I think we should revert back to the original Jamtis design with 3 public keys as it seems like a better compromise between complexity and privacy with 3rd party scanning.

@kayabaNerve
Copy link

kayabaNerve commented Sep 29, 2023

@tevador I can't personally support dynamic view tags if they are part of the signed blob and adjusting their size requires re-signing. Creating a TX, saving a 2-byte view tag, publishing it as 1-byte, then rebroadcasting it as 2-byte if necessary is very different from re-signing entirely.

If they're not part of the signed blob, then they lose their integrity, as anyone can frontrun a TX with invalid view tags in the mempool.

Accordingly, I'm unable to voice support for this complexity due to the practical issues it'd cause (not just complexity which may cause practical issues).

I will also note that while I'm not up to date on JAMTIS, I leaned towards adding an extra key per @jeffro256. What I'd most like, however, is not to decide on whether or not to have an extra key, but rather to have a complete spec document considered up-to-date (not with hundreds of errata comments) and final barring:

  1. Incredibly minor tweaks (DST choices, round counts)
  2. Major issues found. I would not call any issues in view tag tiers major unless they fundamentally invalidate the tier.

Though I'm sure this desire for finality is widely shared, so my stating it may not contribute much.

@tevador
Copy link
Author

tevador commented Sep 29, 2023

I want to reiterate my view that the proposed change to extend Jamtis addresses to 4 public keys to improve 3rd party scanning might be wasteful without dynamic view tags.

The static 8-bit view tag works well with a "medium" transaction volume, which is a range from mid tens of thousands to mid hundreds of thousands of enotes per day. We're presently near the bottom of this range. Within this range, the 8-bit view tag provides both good filtering and sufficient anonymity.

However, if a bear market hit and the tx volume plummeted for some reason, then there would be nearly no privacy advantage compared to the 3-key variant of Jamtis. Similarly, if Monero is successful and the tx volume goes up by 2 orders of magnitude, light wallets might be forced to switch to a less private wallet tier to reduce the bandwidth and computation costs. This would also remove any privacy advantage of the 4-key variant.

In both of these scenarios, 3rd party scanning will suffer a privacy loss and we'll be stuck with longer addresses and bloated specs.

Note that even the 3-key variant of Jamtis significantly improves 3rd party scanning. Currently, light wallet clients have to give up their private view key and leak practically all of their transaction history to the scanning server. With 3-key Jamtis, light wallet clients would only give up their "find-received" private key, which will reveal only some incoming transactions (e.g. recurring payments to the same address) without amounts to the scanning server.

A major advantage of 3-key Jamtis is that the 3rd party scanning improvements come "for free" because they are simply a byproduct of Janus attack protection provided by the 3rd public key, so even if 3rd party scanning doesn't catch on, we won't be wasting anything.

@kayabaNerve
Copy link

👍

I can't react to comments, apparently, so I'm forced to leave a new post. I hear you, that all sounds sane, and I have no further comments to contribute at this time.

@jeffro256
Copy link

jeffro256 commented Sep 29, 2023

I think what would solve all these issues is an arbitrary-size-by-consensus (with a reasonable limit, e.g. 24 bits), fixed-size-by-relay view tag. It's just as simple to implement because it does not depend on chain data. A cold-signed tx won't temporarily be invalidated unless you hold it so long that relay rules change (which is already an issue for fees). The view tag can be part of the signed blob without the need for multiple signings. We can adjust for really low and/or current tx volume if the anonymity set gets dangerously small. Conversely, if there is a large outcry from users that the bandwidth/computational requirements are unmanageable, we can manually increase the size (this really shouldn't happen more than once every several years, if at all, since we can assume that most users' machines / network connections will get at least slightly better year over year). Attackers cannot affect the view tag size by spamming the chain. All in all, we would reap the benefit of fixed-size view tags' linear increase in privacy with transaction volume and general robustness, but we could have a community handbrake if things got bad.

I think that we're all trying really hard to look ahead into the future and predict what tech trends will be like and what users' reactions to them will be. We could sit here all day postulating different users' rationales for doing things and crafting a decent solution for each specific use-case. Ideally, we want something that is both flexible and simple, and I think that making the view tags arbitrary size by consensus, but fixed size by relay, is the best way to do that.

@expiredhotdog
Copy link

When signing a tx, you already commit to the chain state by selecting a fee amount

Sure, but that's not quite the same. You can always just set an aggressively high fee rate which practically guarantees that it'll work, whereas the view tag size would have a specific range enforced by consensus (I assume). Unless we take a different approach like @jeffro256 's proposal.

You can mitigate the risk for pre-signed transactions by signing two versions with the 2 most likely view tag sizes.

That seems like an incomplete solution, and isn't guaranteed to work in case of a large surge in volume, whether malicious or not.

The other option would be to have the tag itself signed normally, but its "accuracy level" bundled with the membership proof. That way you can overshoot the number of bits while constructing the transaction, but set its "claimed" accuracy immediately before broadcasting. The downside would be potentially allowing a 3rd party scanner to filter more accurately.

But it might not make much of a difference depending on how many extra bits you fill in: even with 4 extra bits, they would still match 1 time in 16 just by random chance, which isn't really that bad, considering how many matches there would already be per day. Especially since not all transactions will use this method.

Maybe this has been brought up before, and if so, then... whoops my bad.

@tevador
Copy link
Author

tevador commented Sep 29, 2023

view tag size would have a specific range enforced by consensus

No, the dynamic tag size would be a relay rule, just like fees.

isn't guaranteed to work in case of a large surge in volume

The view tag size is calculated based on the last 100 000 blocks. Even a large surge in tx volume will take weeks to affect the tag size. Note that during hard forks, we give old transactions only 24 hours to be confirmed before they are permanently invalidated (v9, v11, v14 and v16). The view tag adjustment would never permanently invalidate transactions, it would only make it harder, but not impossible, for them to be mined.

@kayabaNerve
Copy link

I'd be fine with a relay rule so long as on-change, the prior and new value are both valid for a period of 24 hours.

@expiredhotdog
Copy link

expiredhotdog commented Sep 30, 2023

No, the dynamic tag size would be a relay rule

Okay, so not a consensus rule. However, for most users it would still effectively make the transaction unusable.

we give old transactions only 24 hours to be confirmed

The 24-hour grace period isn't really an issue since it only happens once every few years and is known in advance, compared to view tag changes that could (potentially) happen within weeks. I think this is a pretty significant tradeoff, potentially not worth the benefits.

Signing multiple different versions of the transaction does work, but it's very much a bandaid-type solution which we should try to avoid.

@tevador
Copy link
Author

tevador commented Sep 30, 2023

I'd be fine with a relay rule so long as on-change, the prior and new value are both valid for a period of 24 hours.

The current proposal has the following:

  1. 10-block delay between the calculation and the view tag size taking effect. This prevents short reorgs from reverting view tag size changes. Longer reorgs will already invalidate transactions due to decoys.
  2. 10000-block (2-week) grace period after each change when both the previous and the current tag sizes are relayed (see the sketch below).
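A minimal sketch of that relay check; the chain-state fields here are placeholders for illustration, not actual daemon APIs:

```python
from dataclasses import dataclass

GRACE_BLOCKS = 10_000  # ~2-week grace period when both sizes are accepted

@dataclass
class ChainState:
    height: int
    current_tag_size: int    # size calculated 10 blocks ago (per point 1)
    previous_tag_size: int
    last_change_height: int  # height at which current_tag_size took effect

def view_tag_size_relayable(tx_tag_size: int, chain: ChainState) -> bool:
    in_grace = chain.height - chain.last_change_height < GRACE_BLOCKS
    return tx_tag_size == chain.current_tag_size or (
        in_grace and tx_tag_size == chain.previous_tag_size)
```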

However it still would, effectively for most users, make it an unusable transaction.

There is a big difference between invalidated-by-consensus and invalidated-by-view-tag presigned transactions. The former one is useless as you can never hope for the old consensus rules to be restored. The latter can be mined in two situations:

  1. If tx volume changes back and the view tag size again matches the one used in the tx.
  2. The user solo-mines the tx themselves. Due to the existence of hashpower rental services, this option is available to anyone and is probably worth it if the presigned tx is valuable enough.

The 24 hour grace period isn't really an issue since it only happens once every few years and is known in advance

If a 24-hour grace period known 1-2 months in advance (e.g. the v11 fork date was finalized 25 days prior to the fork) to permanently invalidate presigned transactions is acceptable, then I think a grace period of a few weeks to temporarily invalidate presigned transactions is also fine.

@jeffro256
Copy link

jeffro256 commented Oct 31, 2023

Here's my attempt at distilling weeks of conversations into a nice table of different contentious Jamtis proposals crossed against properties, so others don't have to read all those comments.

Abbreviation Table

| Abbreviation | Term |
| --- | --- |
| XXK | Extra Exchange Key |
| VT | View tag |
| ATH | Address Tag Hint |
| LW | Light Wallet |
| LWS | Light Wallet Server |
| BCR | Bandwidth & Computation Requirements |

Quick Summary Table of Contentious Diffie-Hellman-Related Jamtis Proposals

|  | Current | XXK | XXK + 2 fixed-size VTs - ATH | XXK + "dynamic" VT - ATH | XXK + "flexible" VT - ATH |
| --- | --- | --- | --- | --- | --- |
| Public Address Size | 196 | 247 | 244 | 244 | 244 |
| Fixes Nominal Address Tag Privacy Issues? |  |  |  |  |  |
| Can do Delegated Public Address Generation? |  |  |  |  |  |
| Transaction Size Change | None | None | None | +1 byte/tx | +1 byte/tx |
| Doesn't Need Recent Chaindata for VT constr.? |  |  |  |  |  |
| Balances LW BCR Automatically? |  |  |  |  |  |
| Community can change LW BCR w/o fork? |  |  |  |  |  |
| LW Anon Set (Against LWS) Increases w/ Volume? |  |  |  |  |  |
| Scan Speed Change (w/ 8-bit primary VT) | 0% | -0.4% | -0.4% | -0.4% | -0.4% |
| Post-Primary VT CPU Time Required For Scanning | 1x | 100x | 100x | 100x | 100x |

What the Rows Mean

  • Public Address Size - The character count for the human-readable Jamtis address you give to sender
  • Fixes Nominal Address Tag Privacy Issues? - Under this scheme are you protected against:
    • A light wallet server identifying incoming enotes with 100% accuracy if they know your public address
    • A light wallet server identifying incoming enotes with 100% accuracy if the public address is sent to more than once
  • Can do Delegated Public Address Generation? - Can a third-party generate addresses on your behalf without any additional loss of privacy?
  • Transaction Size Change - self-explanatory
  • Doesn't Need Recent Chaindata for VT constr.? - Are transaction constructors free from needing up-to-date chain info specifically for constructing view tags?
  • Balances LW BCR Automatically - Does the amount of enotes matched by a LWS stay relatively constant over long time periods so LW BCRs don't increase over time?
  • Community can change LW BCR w/o fork? - Can the community manually change the view tag match rate to meet current user demands?
  • LW Anon Set (Against LWS) Increases w/ Volume? - Does the set of transactions per time period that the LWS knows your wallet is limited to increase with transaction volume?
  • Post-Primary VT CPU Time Required For Scanning - Compared to current Jamtis, the CPU time required to scan for incoming enotes after performing the primary view tag check is 100 times more. This is because each Twofish decipher op is replaced by a x25519 scalar multiplication op. This means that light wallet clients will do ~100 times more CPU work than before, but the bandwidth needed remains the same.

Summary of "Auxiliary" Enotes

Refer to this comment for the auxiliary enote proposal. This change can apply to any of the above proposals, including current Jamtis. Basically, when a transaction has too many self-send enotes, that fingerprints the tx, to a LWS, as owned by its client. The proposal would allow LW users to churn and to create pocket change without any additional loss of privacy. The downside is that if any enote matches a primary view tag, all other enotes in the transaction must be checked against the AUXILIARY_CHANGE enote type. For full wallets, this only entails a slowdown of one or more extra hash operations every time an enote matches a primary view tag, but for light wallets, these extra enotes must ALSO be sent "over the wire", increasing bandwidth requirements as well.

Difference between "Dynamic" and "Flexible" View Tags

On-chain, both will be serialized as a fixed-size buffer of 3 (optionally 2?) bytes per enote. Additionally, there is one integer per transaction, called npbits, constrained to the range 0-24 (16 if the view tag buffer is only 2 bytes wide), which encodes the number of bits from the front of the buffer used to match the primary view tag. Likewise, ncbits is the number of bits from the back of the buffer used to match the complementary view tag, and it is calculated as ncbits = 24 - npbits. The difference between "dynamic" and "flexible" is how the value npbits is enforced. Under the "flexible" scheme, the value npbits is set to a constant value, enforced by relay rule. Under the "dynamic" scheme, the value npbits is enforced to be a function of the on-chain transaction volume. See this comment for the exact proposed formula. Neither of these schemes causes uniformity issues because at transaction construction time, there is only one correct value to choose for npbits. What's nice about the flexible vs dynamic debate is that, as long as npbits is enforced only by relay rule, the community can switch back and forth between the two proposals as it sees fit without forking.
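For concreteness, a small sketch of how a scanner would split the serialized 24-bit buffer using the per-transaction npbits; this only illustrates the layout described above:

```python
def split_view_tag(buffer24: int, npbits: int) -> tuple[int, int]:
    """Split the 24-bit on-chain view tag buffer into (primary, complementary) parts."""
    ncbits = 24 - npbits
    primary = buffer24 >> ncbits                    # first npbits bits, from the front
    complementary = buffer24 & ((1 << ncbits) - 1)  # last ncbits bits, from the back
    return primary, complementary
```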

"Less-Contentious" Shared XXK Tweaks

Here are some extra details about tweaks, accumulated through discussion, that are shared across all XXK proposals and aren't hotly debated, but might still be worth a mention:

  • Suggested by @tevador
    • Get rid of unlock-amounts key in order to help thwart identify-involved fake tier
    • Make one DH privkey depend upon other in secrets derivation tree for same reason
  • Suggested by @jeffro256
    • Bind second/complementary view tag to "residue" of primary DHE in order to help thwart identify-involved fake tier
    • Bind the amount baked key to the first Diffie-Hellman exchange to prevent a probabilistic Janus attack

@tevador anything I'm missing?

@Gingeropolous
Copy link

so does column XXK still have a -0.4% scan speed change, even though the column name doesn't include VT?

@jeffro256
Copy link

jeffro256 commented Nov 1, 2023 via email

@Gingeropolous
Copy link

So then what is the point of viewtags? With the whole auxiliary enotes thing, it seems that viewtags have a critical flaw that is being hacked around by auxiliary enotes, which seems like added complexity for no gain if the XXK column has the same -0.4% scan speed change without VT.

@j-berman
Copy link

j-berman commented Nov 8, 2023

View tags

I think a flexible view tag is a reasonable option.

  • Relayers start by enforcing a 1 byte view tag.
  • Consensus allows dynamic view tags.
  • If volume starts to increase significantly and sustainably AND light wallets are widely used AND the UX for light wallets starts to degrade AND the use case of submitting presigned txs many weeks in advance ends up an edge case that is outweighed by the light wallet use case, then relayers start enforcing a dynamic view tag.

A major advantage of 3-key Jamtis is that the 3rd party scanning improvements come "for free" because they are simply a byproduct of Janus attack protection provided by the 3rd public key, so even if 3rd party scanning doesn't catch on, we won't be wasting anything.

I think adding a 4th key to the address is still worth it, even if a dynamic view tag is never implemented, because I would rather users be in a situation where an extra key ends up wasted as opposed to a situation where their privacy is worse than it could otherwise be.

On auxiliary enotes

I think the benefits of this are worth it especially with full chain membership proofs. Pocket change is a use case people seem to clearly want (even despite its privacy issues today), to the point where I can see it being a reasonable default wallet behavior with full chain membership proofs.

So then what is the point of viewtags?

@Gingeropolous in the current Jamtis spec (and in this matrix of proposals), its primary value-add is offloading the bulk of scanning to a server while preventing the server from being able to know, with cryptographic certainty, all enotes the user received and spent. It also speeds up full wallet scanning basically the same as view tags do today (i.e. when you don't give up a key to the server).

As currently proposed, if you send yourself e.g. 9 enotes in a tx (e.g. if you were to use a pocket change-like feature), and you give your "find-received" key to a server that can only identify view tag matches of txs which may belong to you, then the server could see "hey, this user had 9 view tag matches in this tx which is statistically very unlikely, therefore the user almost certainly received all 9 enotes." The auxiliary enote proposal above is strictly to ensure even pocket change-like txs would only have a single view tag match among all 9 enotes, so the server would still identify the tx as one that the user needs to scan all enotes for.

@jeffro256
Copy link

Two points about Janus attacks under the new proposed scheme:

  1. It is possible to do a Janus attack on any address when the attacker knows any one address private key. Let's say the attacker knows the address private key k^j_a and the Jamtis address corresponding to that address private key. This address private key may be revealed, for example, during an address index proof. The attacker generates the ephemeral pubkey K_e = r K^base_i (where i is the index of a new destination to be attacked). The attacker uses an ephemeral key based on address index i, but actually encrypts the addr_tag_j from the old Jamtis address j, and then does the rest of the enote building from address j. When it comes time to make the amount baked key, the receiver will calculate k^j_a * K_e. The attacker knows what this value is supposed to be since he knows k^j_a. He can then calculate the "correct" amount baked key by doing k^j_a * K_e instead of r * G, and therefore calculate the "correct" s^sr_2. We can fix this by including a factor of our account secrets in our base key (e.g. the view-received key). This is the reason this attack doesn't work on Jamtis currently: K^j_3 contains a factor of k_ua, which the attacker wouldn't know.
  2. Including Kdaf in the hash of the amount baked key doesn't actually do anything to prevent Janus attacks, since the attacker knows what it should be. It also isn't needed as long as we can prove that the ephemeral key is "bound" to a certain address index j, since otherwise we won't reach the same shared secret.

@jeffro256
Copy link

This might seem like a trivial change, but I suggest that we remove the 'a' character from the address header. Two reasons: 1) we make addresses one character shorter (duh), but the second is more important: 2) this prefix scheme falls in line with Bech32, Litecoin, etc., where you have characters in the ticker followed by a version number. I can see confusion possibly arising when people generate an address with the letter 'a' inside the prefix: "Huh? What is XMRA? I don't want XMRA, I want XMR!" initiate frustration

@rbrunner7
Copy link

I suggest that we remove the 'a' character from the address header.

After some searching I found out what that "a" stands for in the first place, see this comment: the letter there has two possible values, "a" for "anonymous address" and "c" for "certified address". There was quite some discussion here over many comments about whether such certified addresses are a good idea, OK but overkill, or even a bad idea; I did not go through it all. But I am sure the decision to remove that "a" amounts to deciding whether we support this address distinction in the proposed form, or at least will in the future, with more extensive tooling.

If I was to decide alone I would probably let that stand without much further ado and research ...

@kayabaNerve
Copy link

I don't believe it'd be an issue to have xmr1 vs xmrcert1 and prefer xmr alone.

I also don't believe we should have multiple distinct address types in general (though I'd have to double-check the certified address discussion). If there's no active plan to support certified addresses now, I'd remove the "a".
