Skip to content

Instantly share code, notes, and snippets.

@tevador
Last active May 22, 2024 21:58
Show Gist options
  • Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.
Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for general audience.

Chapters 3-7 contain technical specifications.

Table of Contents

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. RID consists of 25 alphanumeric characters that are separated by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RID. RIDs are also suitable to be communicated via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below view-only wallet. One of the new wallet tiers called "FindReceived" is intended for wallet-scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

Janus attack is a targeted attack that aims to determine if two addresses A, B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that they own both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serialized a 256-bit integer to a 32-byte little-endian output.

3.2 Hash function

The function Hb(k, x) with parameters b, k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

Point Derivation Serialized (hex)
B generator of 𝔾1 0900000000000000000000000000000000000000000000000000000000000000

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo is calculated as d-1 = 8*(8*d)-1 to preserve the aforementioned property.

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

Point Derivation Serialized (hex)
G generator of 𝔾2 5866666666666666666666666666666666666666666666666666666666666666
U Hp("seraphis U") 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730
X Hp("seraphis X") 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification referes to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. Excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

Field Type Description
km private key wallet master key
kvb private key view-balance key
birthday timestamp date when the wallet was created

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

Field Derivation
km BytesToInt256(polyseed_key) mod ℓ
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday from Polyseed

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

Field Derivation
km setup ceremony
kvb setup ceremony
birthday setup ceremony

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday entered manually

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = kv
birthday entered manually

4.4 Additional keys

There are additional keys derived from kvb:

Key Name Derivation Used to
dfr find-received key kfr = KeyDerive1(kvb, "jamtis_find_received_key") scan for received outputs
dua unlock-amounts key kid = KeyDerive1(kvb, "jamtis_unlock_amounts_key") decrypt output amounts
sga generate-address secret sga = SecretDerive(kvb, "jamtis_generate_address_secret") generate addresses
sct cipher-tag secret ket = SecretDerive(sga, "jamtis_cipher_tag_secret") encrypt address tags

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

key hierarchy

4.6 Wallet access tiers

Tier Knowledge Off-chain capabilities On-chain capabilities
AddrGen sga generate public addresses none
FindReceived dfr recognize all public wallet addresses eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except of change and self-spends)
ViewReceived dfr, dua, sga all view all received except of change and self-spends (up to § 5.3.14)
ViewAll kvb all view all
Master km all all

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.

4.6.3 Payment validator (ViewReceived)

This level combines the tiers AddrGen and FindReceived and provides the wallet with the ability to see all incoming payments to the wallet, but cannot see any outgoing payments and change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet than can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

Key Name Value
Ks wallet spend key Ks = kvb X + km U
Dua unlock-amounts key Dua = dua B
Dfr find-received key Dfr = dfr Dua

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

Keys Name Derivation
kuj spend key extensions kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" || j)
kxj spend key extensions kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" || j)
kgj spend key extensions kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" || j)
daj address keys daj = KeyDerive1(sga, "jamtis_address_privkey" || j)

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.

6. Address encoding

6.1 Address structure

An address has the following overall structure:

Field Size (bits) Description
Header 30* human-readable address header (§ 6.2)
K1 256 address key 1
D2 255 address key 2
D3 255 address key 3
t 144 address tag
Checksum 40* (§ 6.3)

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

network char network type
"t" testnet
"s" stagenet
"m" mainnet

The software decoding an address shall abort if an invalid network character is encountered.

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x8 + 3x7 + 11x6 + 18x5 + 5x4 + 25x3 + 21x2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 1012 of not being detected. The reference code how to calculate the checksum is in Appendix A.

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address 196 characters (=6+182+8).

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@UkoeHB
Copy link

UkoeHB commented Jun 14, 2022

Follow-up on the fund-burning comment: the issue still exists for users relying on third-parties (remote nodes or remote scanners) who can send the user funds with in invalid input_context (i.e. duplicated from another enote owned by the user). The third-party can send the user a malicious enote with very tiny amount and duplicated onetime address (and invalid input_context). If that enote is spent, then the original enote will be unspendable.

To mitigate this, I will update one-time addresses to include a hash of the amount commitment. This way, to burn funds you have to at least burn exactly the same amount as the real enote sent to the user. It also eliminates a sub-class of this attack where the malicious third party doesn't know the amount in the real enote (nor the amount commitment's blinding factor).

Incidentally, this resolves my disagreement with @tevador about including C in the view tag (it will be transitively included via the one-time address).

@UkoeHB
Copy link

UkoeHB commented Jun 17, 2022

In the current version of Jamtis, a find-received service and generate-address hub can combine to create a payment validator. This seems suboptimal, so I'd like to add an internal private key to the key structure.

unlock-amounts key: k_ua = Hn(k_vb)

K_1 = unchanged
K_2 = k_fr * K_3 (unchanged)
K_3 = k^j_a * k_ua * G

  • The payment validator needs s_ga, k_fr, k_ua, k_vb X + k_m U to view incoming funds.
  • The generate-address hub needs s_ga, k_ua G, k_fr k_ua G, k_vb X + k_m U to make addresses.

I am presenting balance recovery at the Monerokon this weekend, so hopefully this last-minute change will work out ok.

@tevador
Copy link
Author

tevador commented Jul 19, 2022

I want to open a discussion about increasing address tags from 56 || 8 bits for j || MAC to 128 || 16 (8 bytes to 18 bytes).

I think this is most likely an overkill.

  1. We only need to future-proof the design to match the expected lifetime of the cryptographic primitives used by Monero (Curve25519, Keccak-256), which is less than 100 years. For example, the 48-bit MAC addresses have a design lifetime of 100 years according to the IEEE (source, page 13). Our address index is 8 bits more and doesn't need to be globally unique. 56 bits is plenty.
  2. The performance benefit of a 16-bit MAC is negligible. For examples, if the shared secret calculation takes 40 μs, Blowfish cipher 100 ns and the remainder of the output recovery 1 ms (exaggerated on purpose), then you'll get the following scanning times per output:
optimization avg time per output
none 1040100 ns
8-bit view tag 43907 ns
8-bit view tag, 8-bit MAC 40016 ns
8-bit view tag, 16-bit MAC 40000 ns

The 16-bit MAC saves negligible 16 ns per output (0.04%).

  1. Random address generation sounds appealing, but IMO not enough to justify the increased address and transaction size. Collisions can be easily avoided by partitioning the 56-bit space in a similar way as the 48-bit MAC address space is partitioned (e.g. you can have 17M independent generators if you assign a 32-bit space to each of them).
  2. Working with the (7+1)-byte indices in the code is probably less complicated than encrypting the (16+2)-byte index that doesn't match the block size of any block cipher.

@UkoeHB
Copy link

UkoeHB commented Jul 19, 2022

  1. "For example, the 48-bit MAC addresses have a design lifetime of 100 years according to the IEEE (source, page 13)." This citation is not very encouraging, considering they spend two pages talking about how to optimize use of the address space.
  2. I agree the "performance benefit of a 16-bit MAC is negligible" when talking about a full-node scanner whose computation costs will be dominated by the view tag check. I disagree when talking about the client of a remote scanner. Using your numbers, an 8-bit MAC means the post-MAC check will be 39x more expensive than the cipher on average (in practice it's probably closer to 4x). With a 16-bit MAC, it would be 0.15x on average (~0.02x in practice). My goal is for the cipher to dominate costs for the client of a remote scanner.
  3. Tbh, I am really not enthusiastic about a bunch of spaghetti all over the ecosystem that exists just to handle the address index being too small for blind random generation. This just makes Monero harder to use confidently.
  4. Yeah it's more complicated - but only like 50 lines of code more complicated than 8 bytes.

@tevador
Copy link
Author

tevador commented Jul 19, 2022

  1. My point was that MAC addresses work with a global 48-bit space. We'd have a 56-bit address space per wallet.
  2. Thanks for the clarification. Do we expect 3rd party scanners to be common? Does saving 5 seconds per wallet sync for a subset of Monero users justify the costs?
  3. You can generate about 380 000 random 56-bit indexes before the collision chance exceeds 10-6. Merchant wallets are probably the only ones who would benefit from a larger index space, which would allow a simpler implementation for them. Does this justify the costs of having a larger index for all users?

@UkoeHB
Copy link

UkoeHB commented Jul 19, 2022

Do we expect 3rd party scanners to be common? Does saving 5 seconds per wallet sync for a subset of Monero users justify the costs?

The cost for this is 1 byte per tx output, which is like <0.1% of a tx. If it materially improves the UX for >0.1% of users, is that a justification?

Merchant wallets are probably the only ones who would benefit from a larger index space, which would allow a simpler implementation for them. Does this justify the costs of having a larger index for all users?

With smaller address indices, I wouldn't be surprised to see the return of payment IDs. Not exactly a good outcome (and not one that can be easily repaired after the fact).

@tevador
Copy link
Author

tevador commented Jul 19, 2022

The proposed 56-bit address index is not materially different from the current 64-bit encrypted payment ID.

Even if we decided to adopt a larger index, I don't think the 16+2 configuration you are proposing is the best one. Something like 15+1 or 14+2 seems much better. That would save you one round of the block cipher. Random collisions are not a problem with 112/120 bit values. Note that v4 UUIDs have "just" 122 random bits.

@UkoeHB
Copy link

UkoeHB commented Jul 19, 2022

The proposed 56-bit address index is not materially different from the current 64-bit encrypted payment ID.

Not quite, because the current encrypted payment ID is 64 bits PLUS the subaddress table (which could be up to ~16 bits).

Even if we decided to adopt a larger index, I don't think the 16+2 configuration you are proposing is the best one. Something like 15+1 or 14+2 seems much better. That would save you one round of the block cipher.

You only need one round of the block cipher to test the MAC, so 1 vs 2 rounds has a negligible performance impact. I also considered a 16-byte version, but 16 bytes for the index is a nice round number - an unambiguous solution that doesn't raise the question 'is this over-engineered?' from the user's point of view (we can happily engineer like crazy with anything hidden from users, like overlapping block ciphers or really anything in the Monero protocol).

@tevador
Copy link
Author

tevador commented Jul 19, 2022

Not quite, because the current encrypted payment ID is 64 bits PLUS the subaddress table (which could be up to ~16 bits).

AFAIK "integrated subaddresses" are not supported.

16 bytes for the index is a nice round number - an unambiguous solution that doesn't raise the question 'is this over-engineered?' from the user's point of view (we can happily engineer like crazy with anything hidden from users

The address index is hidden from users. The user will just see the address RID. The UI can have something like a "Generate new address" button. The user will not select an index. There is also no reason for merchants to use the address index directly, The API can report either the RID or some other computed value that identifies the address.

@busyboredom
Copy link

busyboredom commented Jul 19, 2022

If merchants only need collision resistance over concurrent invoices then I imagine 56 bits should be plenty for any merchant besides maybe amazon, right?

Maybe if a merchant decides to use address indexes as unique invoice IDs in a database of all past purchases then it would be a problem, is that a use-case we want to support? My payment gateway lib (AcceptXMR) uses the tuple (address index, creation height) as the invoice ID so it wouldn't affect me, but I could imagine a naive merchant may assume the address index alone is sufficient and run into problems.

@kayabaNerve
Copy link

2^50 can handle 160k orders per person on the planet. There's only a 1/256 chance of a random collision even at that quantity with just 58 bits.

@UkoeHB
Copy link

UkoeHB commented Jul 19, 2022

Ok math huh...

Let's look at an upper bound (using 56-bit address indices):

  • 10mill people each generate 2^10 = 1k addresses in their lifetime -> 0.015% that one has a collision (increases to ~99% if you remove 2 bytes for the account number and use a small set of accounts)
  • 100k orgs each generate 2^17 = 131k addresses in their lifetime -> 2.4% that one has a collision
  • combined -> 2.4% that at least one has a collision (assuming the individuals don't subdivide their addresses into accounts)

Equation (hopefully I got this right...): Probability at least one has a collision = 1 - (1 - 2^-(num index bits - num addresses bits)))^(num addresses bits x num individuals)

We should have a strong safety factor even at the upper bound. Doesn't seem like we have that here.

AFAIK "integrated subaddresses" are not supported.

Ah you are right, I forgot.

The address index is hidden from users. The user will just see the address RID. The UI can have something like a "Generate new address" button. The user will not select an index. There is also no reason for merchants to use the address index directly, The API can report either the RID or some other computed value that identifies the address.

A 'user' in this context is anyone trying to support Monero to do something (merchant, wallet, etc.). A user of the protocol API, so to speak.

@tevador
Copy link
Author

tevador commented Jul 20, 2022

A user of the protocol API, so to speak.

The API doesn't need to expose the address index. We can pack the "ugly" internal address index into a neat-sized identifier for the API. The internal index can be "over-engineered" since it's hidden from the user.

If we use X25519 for the keys K2 and K3, the three address keys will need 256+255+255=766 bits. If we want the address data to be optimally packed, we should aim for the addr_tag_size such that (addr_tag_size + 766) % 5 == 0, which would not waste any bits in the base32 encoding.

Additionally, for easy binary serialization, we might also require that addr_tag_size % 8 == 0.

This leaves three possible addr_tag_size values: 64, 104 and 144 bits.

addr_tag_size block cipher addr_index + mac_size Jamtis addr. length Transaction size
64 Blowfish 56+8 180 characters +8 bytes per output
104 Blowfish + CTS 92+12 188 characters +13 bytes per output
144 Blowfish/Twofish + CTS 128+16 196 characters +18 bytes per output

In case of the 104 and 144-bit tags, we should use Ciphertext stealing (CTS) for the encryption, which is a standardized way of encrypting data that is not a multiple of the block size. Compared to the implementation by @UkoeHB, CTS swaps the order of the last two blocks in the ciphertext.

Also it's probably better to use Blowfish even for the 144-bit tag because it's about 3x faster than Twofish to decrypt the MAC (the whole tag would need 3 blocks).

@tevador
Copy link
Author

tevador commented Jul 20, 2022

I realized that since we are using a block cipher in the ECB mode, encrypting more than 1 block can leak information. Ideally, we would need to include an IV, which would increase the size of the encrypted tag considerably.

@UkoeHB
Copy link

UkoeHB commented Jul 20, 2022

In case of the 104 and 144-bit tags, we should use Ciphertext stealing (CTS) for the encryption, which is a standardized way of encrypting data that is not a multiple of the block size.

Ok this method works, although it is significantly less intuitive than my approach (while being functionally equivalent). Since it is a standard technique, I guess that's a reason to use it...

I realized that since we are using a block cipher in the ECB mode, encrypting more than 1 block can leak information.

Yes, hence why I switched to Twofish. With no IV, you need the entire address index to fit in one block (if two address indices have identical bits that map to a given ciphertext block, then the corresponding address tags will have duplicated blocks, which leaks information if you have multiple blocks). According to your chart, that leaves us with the 64-bit and 144-bit options.

@tevador
Copy link
Author

tevador commented Jul 21, 2022

The consensus seems to be in favor of a larger index to accomodate random address generation.

However, there still isn't a good argument for the 16-bit MAC.

Checking output ownership in Jamtis involves the following steps:

# calculation CPU cycles note
1 nominal derived key 120 000 X25519 variable base scalarmult
2 view tag hash 800 Blake2b
3 shared secret hash 800 Blake2b
4 address tag mask 800 Blake2b
5 address tag block cipher 300 Twofish
6 key extension derivation 800 Blake2b
7 k_j X 29 000 ed25519 fixed base scalarmult
8 point addition 500

The cycle counts are based on my own benchmarks using Ryzen 3700X, so they should be fairly accurate.

The first step will dominate the calculation for a user scanning their own copy of the blockchain. The address tag MAC has almost no effect there.

Users connecting to a remote service can skip steps 1-4, which will be performed by the server.

The following table shows the average cycle counts to perform the output ownership check from step 5 for various sizes of the MAC in bits:

MAC size avg cycles/output required bandwidth
0 31 000 10 MB/s
3 4 000 70 MB/s
6 800 400 MB/s
16 300 1 GB/s

The "bandwidth" column shows the approximate rate at which outputs must be loaded to cause a CPU bottleneck. This is calculated with 100 bytes per output and 1 CPU thread running at 3 GHz.

This shows that even without a MAC, the performance of the remote scanner will be bottlenecked by network bandwidth. With a 6-bit MAC, the check is fast enough to bottleneck a local SSD storage device.

The 16-bit MAC doesn't make sense even in the unlikely case that network bandwidth grows significantly faster than CPU performance in the future.

I'm therefore proposing to use a 128-bit address tag with either a 6-bit or a 3-bit MAC.

  1. The 6-bit MAC will allow us to map the index onto a v4 UUID, which contains 122 random bits. This UUID can be directly exposed in the API. V4 UUIDs are ubiquitous and have a wide support in many programming languages. I think this is the best choice.
  2. In case we wanted to support also v1, v2, v3 and v5 UUIDs, we can reduce the MAC size to 3 bits. However, these UUIDs are used rarely.

The 128-bit tag also avoids the non-standard block cipher operations.

@UkoeHB
Copy link

UkoeHB commented Jul 22, 2022

However, there still isn't a good argument for the 16-bit MAC.

I'd rather over-optimize for performance than for size. Instead of trying to find 'just the right ideal amount of bits' for the MAC, why can't we go a little over-board (by a number of bytes that is insignificant in context) to cover all possibilities both known and unknown with a safety factor that can't be questioned?

In case we wanted to support also v1, v2, v3 and v5 UUIDs, we can reduce the MAC size to 3 bits. However, these UUIDs are used rarely.

I seriously question designing toward any particular use-case. Use-cases are only examples of ways the index can be used that can highlight weaknesses in the design. Why does the size of UUIDs highlight a weakness of 56-bit indices, supporting the case for more bits? It's not because I have a burning desire to support UUIDs, it's because you need more bits to have robust collision-resistance (which is why UUIDs are so big in the first place). In other words, choosing 128-bit indices is designing toward a standard level of robust collision resistance that naturally supports use-cases that rely on robust collision resistance.

The failure of 122-bit indices to support non-v4 indices highlights that a non-standard size for the index is not generic.

Therefore: generic + robust collision resistance = 128 bits for the index.

The 128-bit [tag] also avoids the non-standard block cipher operations.

If it makes you happier, what about doing a block cipher for a 128-bit index, concatenated with halfsiphash[trunc_16(s_ct)](ciphered index) truncated to 2 bytes? Siphash is a keyed hash function after all, and my goal with the overlapping cipher was just to mimic a cheap keyed hash function for the MAC, without adding more dependencies. Although if the overlapping cipher is actually somehow problematic (very dubious), idk how this use of Siphash wouldn't also be problematic.

@tevador
Copy link
Author

tevador commented Jul 24, 2022

I disagree, but I'm not going to argue about it anymore.

It's much more important that we adopt X25519 for the shared secret calculation, which will allow 30% faster wallet sync for most users.

@tevador
Copy link
Author

tevador commented Aug 6, 2022

I have completed my X25519 library (took a bit longer than expected).

https://github.com/tevador/mx25519

It's a collection of 4 different implementations for optimal performance on most devices.

Here are some quick benchmarks comparing mx25519 with the ref10 Monero code and the assembly implementation from supercop. Results are in CPU cycles.

Machine Monero (ref10) Supercop (amd64-64) mx25519
Raspberry Pi 3 530 000 N/A 146 000
Core i3 3220 436 000 220 000 176 000
Ryzen 3700X 383 000 156 000 117 000

The biggest performance leap is for ARM devices (over +350%). It might be possible to run a full wallet on a smartphone with this kind of performance.
Older x86 CPUs also get a decent +25% performance boost compared to the current supercop assembly code.
Newer x86 CPUs can benefit from the extended instruction sets for a +33% gain compared to supercop.

@tevador
Copy link
Author

tevador commented Aug 8, 2022

The mx25519_invkey function has a tiny chance of failure (~2-124 for a random set of keys).

This failure can occur in step 10 of output recovery, when the wallet has to calculate r' G = Ke / (kaj * kua), so mx25519_invkey will be called with keys kaj , kua.

This failure can be handled as follows:

Each of the keys kaj , kua can be individually rejected if it fails the mx25519_invkey function (kua when generating a wallet, kaj when generating an address). Then later if mx25519_invkey(kaj , kua) fails, the shared secret can be recovered by inverting each key individually and performing two scalar multiplications instead of one, because r' G = (kaj * kua)-1 Ke = (kaj)-1 (kua)-1 Ke

However, the failure may not even need to be handled at all. The chance is so tiny that random RAM corruption occurs at a much higher rate and you will get duplicate 128-bit address indices with a higher chance after just a few generated addresses. It might be enough just to throw an exception.

@tevador
Copy link
Author

tevador commented Aug 8, 2022

I'd like to propose another change to the specs.

The current key hierarchy is vulnerable because if some of the more powerful keys are compromised (for example, due to a faulty signature implementation), all keys in the hierarchy below are also leaked in the process.

This is the new key heirarchy:

c_0
 |
 +- k_s (private spend key)
 |
 +- c_1
     |
     +- c_2
     |   |
     |   +- k_vs (private view-spent key)
     |   |
     |   +- s_vc (secret view-change key)
     |
     +- c_3
         |
         +- c_4
         |   |
         |   +- d_fr (private find-received X25519 key)
         |   |
         |   +- d_ua (private unlock-amounts X25519 key)
         |   
         +- c_5
             |
             +- s_ga (secret generate-address key)
             |
             +- s_ct (secret cipher-tag key)

This heirachy follows the best practice of using each key only for a single purpose. It introduces intermediate nodes c_0 to c_5, which are only used to generate the portion of the key tree below them.

To generate keys, a 512-bit hash is needed, for example Blake2b. Each of the intermediate nodes is hashed to generate the two values below, e.g.:

(k_s, c_1) = Blake2b("c_0", c_0)
(c_2, c_3) = Blake2b("c_1", c_1)
etc.

Additional changes:

  • Renamed k_vb to k_vs ("view-spent" key). This key now serves only to generate linking tags.
  • Introducing a new symmetric key s_vc "view-change key" that is used to generate the shared secret for self-send e-notes. This avoids using the same key in two different contexts.
  • Renamed k_fr and k_ua to d_fr and d_ua to show that those are private keys in a different format (for X25519 Diffie-Hellman).

The full tree above will apply to newly generated wallets. Legacy standard wallets will keep k_s and set c_1 = k_s. Legacy multisig wallets will keep k_s and set c_1 = k_v.

This structure also naturally generates the individual wallet tiers:

  • "Master" wallet needs c_0
  • "View All" wallet needs c_1 and one public key.
  • "View Received" wallet needs c_3 and one public key.
  • "Find Received" wallet needs d_fr.
  • "Generate Address" wallet needs c_5 and 3 public keys.

@tevador
Copy link
Author

tevador commented Aug 8, 2022

Legacy standard wallets will keep k_s and set c_1 = k_s

Actually, I don't see a reason why the Seraphis spend key would need to be the same as the legacy spend key. It's probably better to set c_0 = k_s. This will also mean that the legacy seed and Polyseed are handled the same way.

@UkoeHB
Copy link

UkoeHB commented Aug 8, 2022

Each of the keys kaj , kua can be individually rejected if it fails the mx25519_invkey function (kua when generating a wallet, kaj when generating an address). Then later if mx25519_invkey(kaj , kua) fails, the shared secret can be recovered by inverting each key individually and performing two scalar multiplications instead of one, because r' G = (kaj * kua)-1 Ke = (kaj)-1 (kua)-1 Ke

It is not necessary to reject k^a_j or k_ua. Here is a no-fail algorithm that falls back to 'searching' for a successful inversion.

mx25519_pubkey enote_ephemeral_pubkey;  //K_e = r * K^3_j = r * k^a_j * k_ua * G
mx25519_privkey reverse_DH_privkey;  //1/(k^a_j * k_ua * modifier)
mx25519_pubkey amount_baked_pubkey;  //r G = modifier * reverse_DH_privkey * K_e
static const mx25519_privkey eight = 8;

std::vector<mx25519_privkey> inversion_keys;
inversion_keys.reserve(2);
inversion_keys.emplace_back(k^a_j);
inversion_keys.emplace_back(k_ua);

while (mx25519_invkey(&reverse_DH_privkey, inversion_keys.data(), inversion_keys.size()) != 0)
{
    inversion_keys.emplace_back(eight);  //add 8 to keys to invert (modifier = 8*8*...*8)
    mx25519_scmul(&impl, &enote_ephemeral_pubkey, &eight, &enote_ephemeral_pubkey);  //K_e = 8 * K_e
}

mx25519_scmul(&impl, &amount_baked_pubkey, &reverse_DH_privkey, &enote_ephemeral_pubkey);

It will take me some time to look at your other recent comments properly.

@tevador
Copy link
Author

tevador commented Aug 9, 2022

while (mx25519_invkey(&reverse_DH_privkey, inversion_keys.data(), inversion_keys.size()) != 0)
{
inversion_keys.emplace_back(eight); //add 8 to keys to invert (modifier = 88...*8)
mx25519_scmul(&impl, &enote_ephemeral_pubkey, &eight, &enote_ephemeral_pubkey); //K_e = 8 * K_e
}

Cool. That works.

Btw, I recommend using MX25519_TYPE_PORTABLE when constructing a transaction, since it's not performance-critical and the portable code is less likely to have bugs. When scanning for incoming transactions, MX25519_TYPE_AUTO should be used.

@UkoeHB
Copy link

UkoeHB commented Aug 21, 2022

Introducing a new symmetric key s_vc "view-change [secret]" that is used to generate the shared secret for self-send e-notes. This avoids using the same key in two different contexts.

Tying self-send access directly to the root view key prevents anyone from setting up a remote scanning service that can discover both self-sends and normal enotes without key images or amounts (which you would be able to do with an s_vc secret). Such a tier would be superior to the find-received tier, from a remote scanning service-provider's point of view, because A) you don't have to store nominal address tags for 1/256 of all enotes for every user (probably around 25-30 bytes per view tag match), B) you don't have to transmit view tag matches to users during scanning (around 230 bytes per match + key images), C) the privacy cost seems pretty small relative to the find-received tier.

We designed the find-received tier so service-providers would NOT feel compelled to use a stronger scanning tier. Currently, the next best tier for a remote scanner is full-view (payment validator is stronger but doesn't make scanning more efficient because you still need to save all view tag matches to find self-sends). It's the big privacy gap between find-received and full-view that is supposed to tip the balance in the cost/benefit analysis for remote scanning service providers.

We must not underestimate the potential long-term damage that an s_vc secret could do. Our first priority is to user privacy, user convenience is secondary.

@tevador
Copy link
Author

tevador commented Aug 21, 2022

If that's your concern then I guess we can set s_vc = c_2. Then providing s_vc would be the same as providing k_vs/k_vb, which is not something we can prevent anyways.

@UkoeHB
Copy link

UkoeHB commented Aug 21, 2022

In that case, you still have a view tier that can find received + spent without amounts. Maybe s_vc = c_1. At that point I question what practical differences your proposal has to the existing scheme. 1) wallet tiers store a little less secret data on average, 2) s_ct no longer derives from s_ga, 3) k_vs is introduced (which could be combined with c_5 and k_m U to identify spent normal outputs, which would be enough to identify self-sends).

Overall, the existing scheme is elegant, concise, and well-optimized for the various design considerations that have come up.

@tevador
Copy link
Author

tevador commented Aug 21, 2022

  1. wallet tiers store a little less secret data on average, 2) s_ct no longer derives from s_ga, 3) k_vs is introduced (which could be combined with c_5 and k_m U to identify spent normal outputs, which would be enough to identify self-sends).

Those are all side-effects. The main purpose of the change was this:

The current key hierarchy is vulnerable because if some of the more powerful keys are compromised (for example, due to a faulty signature implementation), all keys in the hierarchy below are also leaked in the process.

@UkoeHB
Copy link

UkoeHB commented Aug 21, 2022

Ok, well I don't see a way to achieve your purpose without adding key combinations that a remote scanning service could implement to the detriment of user and network privacy.

@tevador
Copy link
Author

tevador commented Aug 22, 2022

If the child key leakage is a "feature" rather than a bug, then I guess we can leave it as it is. But it's probably going to come up again in a future audit of Seraphis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment