Skip to content

Instantly share code, notes, and snippets.

@tevador
Last active May 22, 2024 21:58
Show Gist options
  • Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.
Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for general audience.

Chapters 3-7 contain technical specifications.

Table of Contents

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. RID consists of 25 alphanumeric characters that are separated by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RID. RIDs are also suitable to be communicated via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below view-only wallet. One of the new wallet tiers called "FindReceived" is intended for wallet-scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

Janus attack is a targeted attack that aims to determine if two addresses A, B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that they own both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serialized a 256-bit integer to a 32-byte little-endian output.

3.2 Hash function

The function Hb(k, x) with parameters b, k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

Point Derivation Serialized (hex)
B generator of 𝔾1 0900000000000000000000000000000000000000000000000000000000000000

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo is calculated as d-1 = 8*(8*d)-1 to preserve the aforementioned property.

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

Point Derivation Serialized (hex)
G generator of 𝔾2 5866666666666666666666666666666666666666666666666666666666666666
U Hp("seraphis U") 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730
X Hp("seraphis X") 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification referes to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. Excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

Field Type Description
km private key wallet master key
kvb private key view-balance key
birthday timestamp date when the wallet was created

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

Field Derivation
km BytesToInt256(polyseed_key) mod ℓ
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday from Polyseed

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

Field Derivation
km setup ceremony
kvb setup ceremony
birthday setup ceremony

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday entered manually

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = kv
birthday entered manually

4.4 Additional keys

There are additional keys derived from kvb:

Key Name Derivation Used to
dfr find-received key kfr = KeyDerive1(kvb, "jamtis_find_received_key") scan for received outputs
dua unlock-amounts key kid = KeyDerive1(kvb, "jamtis_unlock_amounts_key") decrypt output amounts
sga generate-address secret sga = SecretDerive(kvb, "jamtis_generate_address_secret") generate addresses
sct cipher-tag secret ket = SecretDerive(sga, "jamtis_cipher_tag_secret") encrypt address tags

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

key hierarchy

4.6 Wallet access tiers

Tier Knowledge Off-chain capabilities On-chain capabilities
AddrGen sga generate public addresses none
FindReceived dfr recognize all public wallet addresses eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except of change and self-spends)
ViewReceived dfr, dua, sga all view all received except of change and self-spends (up to § 5.3.14)
ViewAll kvb all view all
Master km all all

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.

4.6.3 Payment validator (ViewReceived)

This level combines the tiers AddrGen and FindReceived and provides the wallet with the ability to see all incoming payments to the wallet, but cannot see any outgoing payments and change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet than can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

Key Name Value
Ks wallet spend key Ks = kvb X + km U
Dua unlock-amounts key Dua = dua B
Dfr find-received key Dfr = dfr Dua

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

Keys Name Derivation
kuj spend key extensions kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" || j)
kxj spend key extensions kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" || j)
kgj spend key extensions kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" || j)
daj address keys daj = KeyDerive1(sga, "jamtis_address_privkey" || j)

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.

6. Address encoding

6.1 Address structure

An address has the following overall structure:

Field Size (bits) Description
Header 30* human-readable address header (§ 6.2)
K1 256 address key 1
D2 255 address key 2
D3 255 address key 3
t 144 address tag
Checksum 40* (§ 6.3)

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

network char network type
"t" testnet
"s" stagenet
"m" mainnet

The software decoding an address shall abort if an invalid network character is encountered.

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x8 + 3x7 + 11x6 + 18x5 + 5x4 + 25x3 + 21x2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 1012 of not being detected. The reference code how to calculate the checksum is in Appendix A.

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address 196 characters (=6+182+8).

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@UkoeHB
Copy link

UkoeHB commented Jan 29, 2022

@tevador I am going to change the view tag computation (for Jamtis). From: v = H1("view tag", Kd). To: v = H1("view tag", Kd, Ko). This fixes an edge case Janus attack.

I removed output indices from output computations, to allow 'make an enote in isolation then chain a tx off it'. Suppose A) someone makes an enote sending money to me, B) I make a tx funding just that enote (so there will be that enote + a change enote). Both of the outputs in that tx will have the same view tag, because there will be the same derived key K_d (due to the efficiency magic around 2-out txs). One thing you could do is, if there is one non-change output, always check if it is a 'normal self-send' (i.e. not a special self-send, just a normal enote that happens to be owned by you). But this check opens up a Janus attack.

If the enote in step (A) weren't sent to me, then most of the time it wouldn't have the same view tag as my change output. An attacker could test if an address is mine by making isolated enotes, getting me to fund them, then checking if the view tags match the change outputs (via me rejecting/accepting the proposed isolated enotes).

Adding Ko to view tags solves that because Jamtis change enotes and normal enotes always have different onetime addresses (even if the underlying destination address is the same).

Note that 'self-spend' enote types (non-change self-sends) will only be created by a tx author, so they can't be used for a Janus attack (hence it is safe to rely on a check is_self_send_proposal() when deciding how to create a change output). On second thought, there is one tiny edge case that allows Janus: if you make an isolated enote as a self-send, give it to someone else anonymously, then that person gives it back to you to fund (using the appearance of a 3-out vs 2-out tx as indication if the isolated enote was a self-send that you created). However, you shouldn't hand out self-send enotes that you might not fund, since if you don't fund them then you will never find them in the chain.

Note: An alternative solution would be 'if someone gives you a single isolated enote, then always add a dummy enote'. However, that isn't super robust, and reduces privacy by preventing you from hiding isolated enotes in the more-common 2-out pool.

@busyboredom
Copy link

I propose we hear more opinions and re-consider it :)

My opinion is that this should be a UI issue.

  • Software with a focus on non-merchant use (like Cake Wallet) should default to anonymous addresses and hide certified addresses in an "advanced" menu (or just not support them at all).
  • Software with a focus on merchant use (like the WooCommerce plugin and whatnot) should default to certified addresses and should warn the merchant about their linkability.

No need for separate wallet types in Jamtis, it can just be a UI thing. I do agree with the spirit though -- casual users don't need to see the option to use certified addresses unless they're looking hard for it.

@j-berman
Copy link

j-berman commented Jan 30, 2022

I think everyone is under-appreciating these 3 very negative tradeoffs that certified addresses make:

  • every single Monero wallet, 3rd party wallets included, has to develop software that specifically supports sending to both certified addresses AND anonymous addresses
  • users will likely need to learn and understand why and how certified are different from normal addresses = cognitive overhead for customers at the point of sale, one of the merchant's most critical points (and the most critical point Monero is supposed to provide an excellent experience for)
  • they reduce plausible deniability that you control an address. I don't agree with @tevador 's arguments, but, it appears we will agree to disagree on that.

I think the idea to create an address type that is not different from any other regular address and does not have the above tradeoffs, but retains the benefits merchants apparently want, is far superior.

If the main concern is development concern and a blow-up of features that people have to support, I am happy to take it on myself and implement it totally separately and have all code live separate from the user-facing wallet code i.e. implement the merchant wallet. There shouldn't be anything that prevents this. In fact I might even do it if certified addresses also exist because I feel that strongly. But by choosing to go with certified addresses as the solution here, we are putting a larger burden on every Monero wallet developer ever.


No need for separate wallet types in Jamtis

Technically you could still have one wallet that supports both. But the complexity and development resources needed with this alternative address type to certified addresses is larger (I'm gonna name this alternative "merchant addresses").

From a dev standpoint, I think it's a much more prudent decision to allow for the wallet types to be silo'd and allow wallet devs to choose to support their desired UX based on expected users, rather than require every single wallet dev support an extra feature (sending to certified addresses).

I.e. anyone can choose add the merchant address feature to their wallet so users can receive payments to merchant addresses if they want to, but the likely maintenance and dev resource cost would be large. Versus certified addresses, which require ALL wallet devs to implement sending to certified addresses, which isn't as complex to implement as implementing the ability to receive to merchant addresses, but still requires resources to implement and maintain and build a UX for.

@tevador
Copy link
Author

tevador commented Jan 30, 2022

It is evidence, and stronger evidence than one without a signature.

I don't agree that certified addresses provide meaningfully more evidence than reused integrated addresses. So we can agree to disagree on that.

However, I seem to have convinced you that certified addresses don't prove ownership, so that's at least one thing we can agree on.

every single Monero wallet, 3rd party wallets included, has to develop software that specifically supports sending to both certified addresses AND anonymous addresses

Sending to a certified address works exactly the same way as sending to an anonymous address. The only thing that is different is the RID calculation. I estimate that the support for sending to certified addresses would be under 20 lines of code.

Having this feature included in the official specs means that merchants can rely on it being widely supported. If this was some 3rd party extension, it would be unusable in practice.

users will likely need to learn and understand why and how certified are different from normal addresses

No, they will not. Users will only need to care about the RID. If we wanted to prevent users from shooting themselves in the foot by using a certified address, we could restrict their generation to the wallet RPC interface.

I would appreciate some feedback from an actual merchant about certified addresses. If merchants say they don't need this feature, we can remove it.

@j-berman
Copy link

j-berman commented Jan 30, 2022

The only thing that is different is the RID calculation. I estimate that the support for sending to certified addresses would be under 20 lines of code.

It requires a dev to learn what they are, then find the relevant code to write, and to make sure their UI can handle larger (and variable length) address strings nicely, write tests for both. But sure I would estimate it at a few hours of work tops maybe less.

Users will only need to care about the RID.

You can't avoid copy pasting addresses in all cases. Having one address 50% longer than another is an oddity that takes thought to understand and not find strange.

we could restrict their generation to the wallet RPC interface

Ok that's something

If merchants say they don't need this feature, we can remove it.

I'm not arguing for removal of a feature that would enable what merchants want, I'm arguing for an alternative. Can you explain why you are against the alternative?

(For the record I am also a merchant, service I helped start accepts Monero)

@tevador
Copy link
Author

tevador commented Jan 30, 2022

Can you explain why you are against the alternative?

I haven't seen an alternative proposal, but I'm assuming you mean the return to the previous version of the specs which had a separate merchant wallet type. This wallet type would derive the address key K1 and the address tag t the same way as "non-merchant" wallets, but the other two keys K2 and K3 would be constant. RID would be derived by hashing only K2 and K3.

Advantages:

  • Uniform address length
  • Marginally simpler code on the sending side
  • The disputable privacy benefit of not signing anything. I claim that there is no difference.

Disadvantages:

  • Users would have to understand the difference between the wallet types and decide at wallet generation time if they want privacy or customer assurance. This could be somewhat mitigated by restricting the merchant wallet type to the wallet RPC interface, but merchant wallets would still offer strictly worse privacy.
  • Possible issues with legacy wallets. The same wallet could be restored once as a merchant wallet and later as a non-merchant wallet, causing two different sets of addresses to be produced.
  • More complex address and wallet generation code
  • The assurance provided by RIDs would be significantly weakened because an attacker could replace K1 or t without changing the RID. This might cause the funds sent to the modified address to be permanently lost, which is something that cannot happen with an integrated address or the current Jamtis proposal.

The complete removal of certified addresses from the current proposal would have the same advantages but none of the disadvantages (the only disadvantage would be the loss of customer assurance). The above proposal seems to be the worst of the 3 options.

@busyboredom
Copy link

busyboredom commented Jan 30, 2022

I can think of one alternative that I just put on matrix and now realize that this is a better place to say it:

Gingeropolous floated the idea of a txpool-only "payload system" earlier: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024?permalink_comment_id=4044233#gistcomment-4044233

It sounds similar to the chat system used in Townforge. A nice alternative to messaging over tx_extra.

Merchants could put something like "DON'T EDIT: f6kfd984jdls9" as the payload to link the transaction to a given invoice, allowing the merchant to reuse a single anonymous address instead of needing to generate a bunch of certified addresses. We'd get a unified address type and ephemeral messaging at the same time. It does seem quite a bit more complicated than certified addresses though, and would require merchants to see the message before expiration (so if their server went down, they'd miss the payment). Not sure if it's the best alternative, but it is an alternative.

@j-berman
Copy link

j-berman commented Jan 30, 2022

These disadvantages are part of why I argue it would be reasonable to silo merchant wallet types from normal user wallet types. A user who uses Venmo doesn't need to learn what QuickBooks is for example. I disagree it is the worst of the 3 options, and those advantages are significant.

I think the 4th disadvantage of weaker RID is of marginal concern in line with the weakened plausible deniability.

But I think we have fairly examined and highlighted the tradeoffs at this point, and have fully shared our opinions.

I concur more feedback from others would be useful to move forward from here.

@j-berman
Copy link

Also, the RID could still be a hash of all 3 pub keys, but clients can recognize the same constant pub keys via a contact they store after first visit.

@tevador
Copy link
Author

tevador commented Jan 30, 2022

I think the 4th disadvantage of weaker RID is of marginal concern

Another thing we disagree on. I find the 4th disadvantage the most concerning, up to the point of making RIDs essentially useless for the purpose they were designed to do. The DNS/onion/manual verification would make little sense if it only covered part of the address payload.

I would prefer to remove certified addresses than destroy the anonymous functionality provided by RIDs. The merchant wallet type could then be implemented independently as a 3rd party extension. Users would just have to compare part of the address instead of the RID to get some level of assurance.

@j-berman
Copy link

@busyboredom there would still need to be a protocol where merchant gives extra info to customer, customer has to pass that extra info into wallet, wallet has to parse and send the tx and the extra info to the messaging network. So it's effectively a different address type like integrated addresses, but without storing extra data on chain and instead sending to the messaging network. I think integrated addresses would be my preference over that because in this case the ephemeral nature of it is a pretty significant downside.

@j-berman
Copy link

Fair enough @tevador

I apologize for some of the testier comments in my debating. I think you make reasonable points, we just disagree on relative costs and benefits 🤝

@busyboredom
Copy link

effectively a different address type

I wouldn't consider invoices to be a separate address type. Invoices already exist and simply provide relevant information (like amount due) along with the address. Currently, they're even human readable. The only change here is that the "tx_description" field would be sent to the txpool when the transaction is sent. Here's an example of an invoice I just generated using the GUI wallet: monero:4A1WSBQdCbUCqt3DaGfmqVFchXScF43M6c5r4B6JXT3dUwuALncU9XTEnRPmUMcB3c16kVP9Y7thFLCJ5BaMW3UmSy93w3w?tx_amount=1&tx_description=for%20pizza&recipient_name=John%20Smith

I think integrated addresses would be my preference over that because in this case the ephemeral nature of it is a pretty significant downside.

I believe the ephemeral nature is justified by reduced blockchain bloat compared to integrated addresses, but it is certainly worth debating.

You are definitely correct that needing to send the description to the txpool introduces some complexity though, that's a very fair criticism. Some new logic would need to be added to both the wallet and daemon. If there is a simpler alternative that doesn't take up space on the blockchain, I'd be on board with it.

@tevador
Copy link
Author

tevador commented Jan 31, 2022

In the view tag PR, @knaccc brought up an interesting point that curve25519 x-coordinate DH exchange seems to be significantly faster than ed25519 key exchange.

I think this is definitely something we should investigate for jamtis. We could modify addresses so that the public keys K2 and K3 only encode the x-coordinate of the Montgomery curve points. This would speed up output scanning for two reasons:

  1. No need to recover the second coordinate from the compressed point (this typically requires a slow field inversion square root operation).
  2. x-coordinate scalar multiplication seems to be much faster because the y-coodinate operations can be skipped

@UkoeHB
Copy link

UkoeHB commented Jan 31, 2022

@tevador I didn't think that far, good idea. In that case, {K_2, K_3, K_e, K_d, r G} can all be in x25519. Maybe if @j-berman gets a demo running, then I can use it to adapt my implementation to demo this.

@UkoeHB
Copy link

UkoeHB commented Jan 31, 2022

@tevador If only K_d is in x25519 for view tags only (K_d for computing q remains in ed25519), then it is possible for implementations to not support x25519 and just do slow scanning. This might be advantageous.

@tevador
Copy link
Author

tevador commented Feb 1, 2022

I ran some preliminary benchmarks using the amd64_64_24k assembly implementation from SUPERCOP on my Ryzen 3700X.

operation clock cycles
x25519_varbase_scalarmult 138500
ed25519_varbase_scalarmult 155200

So the speed up is not 2x as previously reported but still significant 12%. Note that this is for a 64-bit assembly implementation. The difference may be larger with the ref10 implementation.

In both cases, the above figures include point deserialization, variable base scalar multiplication and point serialization. However, it seems that the main advantage of x25519 is the point deserialization. If I remove the call to ge25519_unpackneg_vartime from the ed25519 benchmark, it runs in 140600 clock cycles - almost the same performance as x25519.

The conclusion is that if we want to take advantage of the x25519 speed up, transactions must contain Ke serialized in x25519 format (x-coordinate only) and not ed25519 compressed points. Decompressing an ed25519 point and converting it to x25519 would most likely be slower than just using ed25519 scalar multiplication as we currently do.

@j-berman
Copy link

j-berman commented Feb 1, 2022

@tevador any shot your code is in a shareable state so I can take a look/learn from it please :)

@UkoeHB
Copy link

UkoeHB commented Feb 1, 2022

@tevador You can cheaply convert ed25519 compressed points to x25519 like this:

int
crypto_sign_ed25519_pk_to_curve25519_remove_extra_ops(unsigned char *curve25519_pk,
                                                      const unsigned char *ed25519_pk)
{
    fe25519    original_y;
    fe25519    x;
    fe25519    one_minus_y;

    // get y coordinate of ed25519 point
    unsigned char ed25519_pk_copy[32];
    memcpy(ed25519_pk_copy, ed25519_pk, 32);
    ed25519_pk_copy[31] &= UCHAR_MAX >> 1;
    fe25519_frombytes(original_y, ed25519_pk_copy);

    // ed25519 -> curve25519
    fe25519_1(one_minus_y);
    fe25519_sub(one_minus_y, one_minus_y, original_y);
    fe25519_1(x);
    fe25519_add(x, x, original_y);
    fe25519_invert(one_minus_y, one_minus_y);
    fe25519_mul(x, x, one_minus_y);
    fe25519_tobytes(curve25519_pk, x);

    return 0;
}

@tevador
Copy link
Author

tevador commented Feb 1, 2022

fe25519_invert(one_minus_y, one_minus_y);

Field inversion is one of the slowest field operations. This is definitely not a cheap operation.

@UkoeHB
Copy link

UkoeHB commented Feb 1, 2022

Field inversion is one of the slowest field operations. This is definitely not a cheap operation.

EDIT: My point is the built-in function to convert ed25519->x25519 isn't super efficient since it fully deserializes the ed25519 point. A proper test needs this cheaper conversion function ^

@j-berman
Copy link

j-berman commented Feb 1, 2022

@tevador are you using an assembly implementation for x25519 also?

@UkoeHB
Copy link

UkoeHB commented Feb 1, 2022

@tevador The key r G baked into the amount commitment and encryption mask, I think it needs to be 8 r G.

@tevador
Copy link
Author

tevador commented Feb 1, 2022

@UkoeHB FYI, I benchmarked the crypto_sign_ed25519_pk_to_curve25519_remove_extra_ops function on my machine at ~13500 clock cycles, which is only slightly faster than full ed25519 point decompression taking ~15000 cycles.

@j-berman This is the x25519 code I'm benchmarking:

static void x25519_scalarmult(fe25519* xs, const uint8_t scalar[32], const fe25519* xb) {
    fe25519 zs, xt, zt, t0, t1;
    uint8_t swap = 0;

    fe25519_setint(xs, 1);
    fe25519_setint(&zs, 0);
    xt = *xb;
    fe25519_setint(&zt, 1);

    for (int pos = 254; pos >= 0; --pos) {
        uint8_t b = 1 & (scalar[pos / 8] >> (pos & 7));

        swap ^= b;
        fe25519_cswap(xs, &xt, swap);
        fe25519_cswap(&zs, &zt, swap);
        swap = b;
        fe25519_sub(&t0, &xt, &zt);
        fe25519_sub(&t1, xs, &zs);
        fe25519_add(xs, xs, &zs);
        fe25519_add(&zs, &xt, &zt);
        fe25519_mul(&zt, xs, &t0);
        fe25519_mul(&zs, &zs, &t1);
        fe25519_square(&t0, &t1);
        fe25519_square(&t1, xs);
        fe25519_add(&xt, &zt, &zs);
        fe25519_sub(&zs, &zt, &zs);
        fe25519_mul(xs, &t1, &t0);
        fe25519_sub(&t1, &t1, &t0);
        fe25519_square(&zs, &zs);
        fe25519_mul121666(&zt, &t1);
        fe25519_square(&xt, &xt);
        fe25519_add(&t0, &t0, &zt);
        fe25519_mul(&zt, xb, &zs);
        fe25519_mul(&zs, &t1, &t0);
    }

    fe25519_invert(&zs, &zs);
    fe25519_mul(xs, xs, &zs);
}

The functions fe25519_add, fe25519_sub, fe25519_mul, fe25519_square and fe25519_mul121666 are all written in assembly. Some small speed up should be possible by inlining everything.

@tevador
Copy link
Author

tevador commented Feb 2, 2022

Here are some thoughts on optimizing the view tag calculation. I'm writing it here rather than in the view tag PR since the decision to use keccak seems to be final there.

I benchmarked keccak at ~3200 cycles per hash on my machine. This means there is an additional ~2% speed up possible by replacing keccak with something faster.

I am going to change the view tag computation (for Jamtis). From: v = H1("view tag", Kd). To: v = H1("view tag", Kd, Ko). This fixes an edge case Janus attack.

We don't actually need to hash the output key. It should be enough to hash the encrypted address tag t~, which is always different for change outputs.

Grin's PoW uses a Siphash-2-4 variant that accepts a 256-bit key and only hashes a 64-bit nonce. So we could modify the view tag calculation as: v = siphash24(t~, Kd) & 0xff using the shared secret as the 256-bit siphash key directly. I benchmarked that this takes just 34 cycles on my Ryzen 3700X. We could even use siphash-4-8, which takes 65 cycles.

Someone might argue against using a finite field element directly as a symmetric key due to its non-uniform distribution, but I think it's irrelevant in this case. The shared secret still contains about 254 bits of information: 1 bit is always zero because the size of the field is only 255 bits (this is for x25519) and another bit is removed because the field element plugged into an equation must produce a quadratic residue modulo 2255-19. A 254-bit secret key is plenty and siphash already mixes all key bits into the result, taking care of any non-uniformity.

Overall, v = siphash48(t~, Kd) & 0xff seems like a reasonably conservative approach that's still 50x faster than keccak.

This together with the move to x25519 for DH would result in a noticeable ~15% speed up.

Here is the full siphash48 source code I used for my bechmarks (you can get siphash24 by removing half of the siprounds):

#include <stdint.h>

#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))

#define U8TO64_LE(p)                                                           \
    (((uint64_t)((p)[0])) | ((uint64_t)((p)[1]) << 8) |                        \
     ((uint64_t)((p)[2]) << 16) | ((uint64_t)((p)[3]) << 24) |                 \
     ((uint64_t)((p)[4]) << 32) | ((uint64_t)((p)[5]) << 40) |                 \
     ((uint64_t)((p)[6]) << 48) | ((uint64_t)((p)[7]) << 56))

#define SIPROUND                                                               \
    do {                                                                       \
        v0 += v1;                                                              \
        v1 = ROTL(v1, 13);                                                     \
        v1 ^= v0;                                                              \
        v0 = ROTL(v0, 32);                                                     \
        v2 += v3;                                                              \
        v3 = ROTL(v3, 16);                                                     \
        v3 ^= v2;                                                              \
        v0 += v3;                                                              \
        v3 = ROTL(v3, 21);                                                     \
        v3 ^= v0;                                                              \
        v2 += v1;                                                              \
        v1 = ROTL(v1, 17);                                                     \
        v1 ^= v2;                                                              \
        v2 = ROTL(v2, 32);                                                     \
    } while (0)

uint64_t siphash48(const uint64_t in, const uint8_t key[32]) {
    uint64_t v0 = UINT64_C(0x736f6d6570736575);
    uint64_t v1 = UINT64_C(0x646f72616e646f6d);
    uint64_t v2 = UINT64_C(0x6c7967656e657261);
    uint64_t v3 = UINT64_C(0x7465646279746573);
    
    v3 ^= U8TO64_LE(key + 0);
    v2 ^= U8TO64_LE(key + 8);
    v1 ^= U8TO64_LE(key + 16);
    v0 ^= U8TO64_LE(key + 24);

    v3 ^= in;

    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;

    v0 ^= in;
    v2 ^= 0xff;

    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;
    SIPROUND;

    return (v0 ^ v1) ^ (v2 ^ v3);
}

@UkoeHB
Copy link

UkoeHB commented Feb 2, 2022

We don't actually need to hash the output key. It should be enough to hash the encrypted address tag t~, which is always different for change outputs.

You're right that practically speaking enc_addr_tag could also work. I would feel a bit more confident tying the view tag to a core part of the output structure (instead of what is basically a memo field). Is there a meaningful disadvantage to hashing Ko?

I also think it needs a domain separator.

@tevador
Copy link
Author

tevador commented Feb 2, 2022

I would feel a bit more confident tying the view tag to a core part of the output structure (instead of what is basically a memo field).

The view tag itself is basically a memo field and its value is irrelevant to consensus. I don't think we gain anything by tying it to the output key. Or do you have some specific kind of attack in mind?

Is there a meaningful disadvantage to hashing Ko?

Mainly a performance reason as the output key is 4 times larger. I guess this is only relevant for siphash. Keccak has sufficiently large block size to accept more input without a performance impact.

I also think it needs a domain separator.

That's easy. We can use our own set of initial values for v0-v3 (the default is "somepseudorandomlygeneratedbytes").

@UkoeHB
Copy link

UkoeHB commented Feb 2, 2022

The view tag itself is basically a memo field and its value is irrelevant to consensus. I don't think we gain anything by tying it to the output key. Or do you have some specific kind of attack in mind?

Generally you make a hash dependency directly on things upstream of your object, not in a separate branch (address tags and view tags are separate). This ensures efficient coupling (the onetime address is the most 'stable' part of the output structure - least likely to change) and makes reasoning about the dependency structure more straightforward (no need to add an extra clause 'we chose this over this because of some tiny perf win'). In this case, you'd have to justify that the reduced cycle count from a smaller hash input is worth a non-standard coupling relationship (it will be < 0.2% perf difference I'm guessing).

@tevador
Copy link
Author

tevador commented Feb 2, 2022

Generally you make a hash dependency directly on things upstream of your object, not in a separate branch (address tags and view tags are separate)

Output keys, view tags and encrypted address tags are all "siblings" derived from the shared secret. I wouldn't say the output key is "upstream" of the view tag. The purpose of the view tag is to tell a potential recipient that the shared secret is probably correct.

the onetime address is the most 'stable' part of the output structure - least likely to change

I would argue that the address tags are equally unlikely to change since that would require a new addressing scheme. The most stable part of the output structure is the amount commitment, which is the only field that is common between RCT and Seraphis (and actually must stay the same forever otherwise you could never prove that transactions using old inputs balance out).

In this case, you'd have to justify that the reduced cycle count from a smaller hash input is worth a non-standard coupling relationship

Higher performance and simpler code would be two reasons to use the address tag rather the output key. AFAICS there are no reasons to use the output key or the amount commitment. The most straightforward structure would be to hash the shared secret and the output index (this is done in the current Monero PR). Since we don't want output indices, the next best thing is the address index. Seems straightforward to me.

@UkoeHB
Copy link

UkoeHB commented Feb 2, 2022

Output keys, view tags and encrypted address tags are all "siblings" derived from the shared secret. I wouldn't say the output key is "upstream" of the view tag. The purpose of the view tag is to tell a potential recipient that the shared secret is probably correct.

I don't think this is quite right. Both view tags and encrypted address tags are children of the onetime address.

  1. The view tag indicates if 'you might own the output' (ownership is 'have privkeys of the onetime address').
  2. Encrypted address tags tell you 'how to try to open the onetime address'.
  3. The shared secret is just an intermediate step toward opening up the onetime address, used to hide who has the right to open it from observers.

I would argue that the address tags are equally unlikely to change since that would require a new addressing scheme.

The core part of outputs remains the same between Seraphis and RingCT: a onetime address and an amount commitment. The internal structures of those points are opaque in this context.

If the RingCT view tag proposal were compatible with our needs here (which is theoretically possible [just do v = Siphash[16_bytes(K_d)]("xmr view tag", Ko), where K_d and K_o are opaque])- although practically speaking won't happen), then we wouldn't need to add any code for this. Tying view tags to the address index also ties them to the address scheme.

EDIT:

Higher performance and simpler code would be two reasons to use the address tag rather the output key.

Also, I think a custom implementation of Siphash is 'less simple' even if there are fewer LoC, because validating it is more difficult than checking if you copy pasted from a known implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment