Skip to content

Instantly share code, notes, and snippets.

@tevador

tevador/minglejingle.md

Last active Apr 14, 2021
Embed
What would you like to do?

Minglejingle: Scalable blockchain with non-interactive transactions

This article describes the Minglejingle (MJ) protocol: a redesign of Mimblewimble (MW) for non-interactive transactions. It preserves the security and privacy properties, supports payment proofs, non-interactive coinjoin and cut-through of spent outputs.

1. Introduction

Blockchains are distributed ledgers that preserve the transaction history so that new network participants can, at any point in the future, verify these two security properties:

  1. Supply security: No counterfeit coins have been created.
  2. Ownership security: No coins have been moved without the authorization by the owner of the associated private keys.

Bitcoin achieves unconditional supply security as a result of transparent transaction amounts. Ownership security is achieved by storing all historical transactions forever, which allows the signatures authorizing the transfer of coins to be verified at any time.

The Mimblewimble protocol, proposed in 2016 by an anonymous researcher [1], makes blockchains much more efficient, while still providing both of the security properties. MW replaces output values with homomorphic Pedersen commitments, which allows both the supply security and the ownership security to simplify to a single signature verification per transaction while removing the need to keep all historical ouputs in the blockchain.

However, the efficiency and elegance of MW comes at a cost. All transactions must be constructed interactively by the sender and all the receivers. This has both usability and security implications:

  1. The sender and the receivers must typically be online at the same time and be able to communicate. Since most users connected to the Internet don't have a public IP address, the interaction is usually done via Tor or using a third party service.
  2. The private keys are needed in order to receive funds. This means that the recipient must typically keep their private keys in a hot wallet connected to the internet. This poses an additional security risk.

There have been many proposals how to alleviate these issues and enable non-interactive/one-sided transactons in the MW protocol. However, these proposals either don't support payment proofs [2] or don't provide ownership security [3, 4].

1.1 Contribution

This article presents Minglejingle, a redesign of the MW protocol with the following features:

  1. Transactions are fully non-interactive. The sender only needs to know the recipient's address, which consists of 2 public keys.
  2. Addresses are not linkable to wallets.
  3. Transactions outputs are not linkable to addresses.
  4. Blocks don't link transaction inputs to outputs.
  5. The protocol provides both supply security and ownership security.
  6. Spent outputs are not needed for chain verification.
  7. The protocol provides unconditional payment proofs.

As MJ represents a compromise between efficiency and usability, there are some drawbacks:

  1. The protocol requires more complex consensus rules than MW.
  2. The amount of data that new nodes need to download in order to verify the full transaction history is 2-3 times larger than in MW, depending on the fraction of spent outputs. However, this is still several times less than other non-interactive protocols with similar privacy properties (details in §6.1).

2. Cryptographic primitives

2.1 Notation

We assume that 𝔾 is a cyclic group of prime order q. Uppercase letters usually refer to group elements (public keys, commitments) and lowercase letters usually refer to numbers in Z(q) (scalars, private keys). We will use the additive notation for group operations.

Let G be the generator of 𝔾 and H be another element of 𝔾 with unknown discrete logarithm relationship to G.

2.2 Hash functions

We assume the existence of three hash functions:

  • Hd is a hash function {0,1}* -> {0,1}2*λ ("hash-to-digest"), where λ is the security level in bits
  • Hs is a hash function {0,1}* -> Z(q) ("hash-to-scalar")
  • Hp is a hash function {0,1}* -> 𝔾 ("hash-to-point")

These hash functions are modeled as random oracles with uniform outputs over their domains.

2.2.1 Tagging

Tags are used to prevent the outputs of hash functions from being misused in a different context than intended. Tags will be denoted by capital letter T with a lower index specifying the name of the tag. Tags are passed to hash functions alongside other input fields. For example, Tsend is a tag specifies that the input is used in the context of sending funds.

2.3 Pedersen commitments

Pedersen commitment C is an element of 𝔾 constructed as:

C = r * G + v * H

where r is the blinding factor and v is the value of the commitment. Pedersen commitments are homomorphic, which means that adding two commitments C1 and C2 to values v1 and v2 produces a commitment to (v1 + v2) mod q

2.4 Range proofs

When adding Pedersen commitments, their values are added modulo q. To prevent overflow, we need to restrict the possible values of v to a range much smaller than q. Range proofs are a special kind of zero-knowledge proofs that prove a commitment C is of the form r*G + v*H where 0 <= v < 2n and n is the required precision in bits, without revealing the values r or v. For monetary operations, 64-bit precision is more than enough to represent the possible range of values.

This article assumes the use of Bulletproofs+, a short and efficient range proof that requires 15 group elements and 3 scalars to prove that v is a 64-bit value. [5]. The proofs for several different commitments can be aggregated with just 2 additional group elements per commitment.

2.5 Diffie-Hellman key exchange

If Alice owns a keypair (xa, Pa) and Bob owns a key pair (xb, Pb), then xa* Pb = xb* Pa is a secret that only Alice and Bob can compute after sharing their public keys Pa and Pb. This is called the Diffie-Helmann (DH) key exchange protocol. [6]

2.6 Schnorr signatures

Schnorr signature is a digital signature scheme [7] that works as follows:

2.6.1 Signing

To sign a message m using the private key x (corresponding to the public key P), it is first necessary to select a random nonce value r and to calculate R = r*G. The signature then consists of the pair (R,s), where s = r + x*Hs(R, P, m).

2.6.2 Verification

A signature (R,s) of the message m can be verified with the public key P by checking s*G ?= R + Hs(R, P, m)*P

2.6.3 Interactive multi-signatures

Schnorr signatures of the same message with two or more private keys can be aggregated into one multi-signature (R,s) if the signers cooperate. This interactive protocol is described in §3.2.2 as part of the MW protocol.

2.6.4 Non-interactive half-aggregation

Two Schnorr signatures (R1, s1), (R2, s2) of two different messages m1, m2 with two different public keys P1, P2 can be partially aggregated into one signature (R1, R2, s) with s = s1 + y*s2, where y = Hs(Tagg, R1, R2, P1, P2, m1, m2) (or an equivalent random oracle output). The scheme can be easily extended for any number of signatures.

2.7 Elliptic curve choice

For the 128-bit security level, a good choice to represent the group 𝔾 is the ed25519 elliptic curve. It is a twisted Edwards curve that has complete point addition formulas, allowing fast and secure constant-time implementations. Although this curve has a cofactor of 8, this can be eliminated by using the Ristretto abstraction for group elements, which produces a prime-order group and protects from small-subgroup attacks [8].

2.7.1 Sizes

The table below lists the serialized sizes of the cryptographic primitives when using the ed25519 elliptic curve.

primitive elements size (bytes)
private key, scalar Z(q) 32
public key, commitment 𝔾 32
range proof (BP+) 15x 𝔾, 3x Z(q) 576
2x aggregated BP+ 17x 𝔾, 3x Z(q) 640
Schnorr signature 1x 𝔾, 1x Z(q) 64

3. Baseline protocols

3.1 Bitcoin with CT (BCT)

For comparison, we show what a simplified Bitcoin protocol might look like with Confidential Transactions (CT), i.e. output amounts hidden with Pedersen commitments.

3.1.1 Transaction inputs

In order to spend an output, the sender must add the following data to the blockchain:

  • The ID of the transaction and the index of the output that's being spent
  • A Schnorr signature (R, s) showing the possesion of the private key for the output
  • The difference d between the blinding factors of the outputs and the blinding factors of the inputs.

If multiple inputs are spent at once, the signatures become multi-signatures (§2.6.3), so two inputs require 168 bytes of data.

3.1.2 Transaction outputs

The following data needs to be stored in the BCT blockchain for each output:

  • The output commitment C = r*G + v*H
  • A range proof showing that 0 <= v < 264
  • The receiver's public key Kr
  • The sender's emphemeral public key Ke for DH key exchange
  • The value v encrypted with the shared key between the sender and the receiver. The blinding factor r can be derived deterministically from the shared secret.

If multiple outputs are created at once, the range proofs can be aggregated.

3.1.3 Verification

Each BCT transaction can be verified in 2 steps:

  1. Checking that the difference between the output commitments and the input commitments is equal to d*G. This proves that the transaction values balance out.
  2. Checking that (R, s) is a valid Schnorr multi-signature with the input public keys.

3.1.4 Privacy

The only privacy feature provided by BCT is that the transaction amounts are hidden.

3.1.5 Payment proofs

Since transactions are stored on the blockchain forever, payment proofs are easy. The sender only needs to point to a transaction output and provide the private key for Ke to reveal the value of the output.

3.2 Mimblewimble

Our main baseline is the interactive MW protocol.

3.2.1 Transactions

Each MW transaction consists of:

  • One or more inputs, which are just Pedersen commitments and refer to unspent outputs of previous transactions.
  • One or more outputs, each consisting of a Pedersen commitment and a range proof.
  • A transaction kernel, which consists of a public key K and an associated Schnorr signature (R, s).
  • An offset o, which can be aggregated.

3.2.2 Interactive protocol

If Carol, who owns an unspent output Ci = ki*G + vi*H wants to send vo coins to Dave, the following interactive protocol must be executed:

  1. Carol calculates the change output value vc = vi - vo, constructs the change output Cc = kc*G + vc*H, selects a nonce rc and offset oc, then sends Ci, Cc, vo, oc and Rc = rc*G to Dave.
  2. Dave constructs his output Co = ko*G + vo*H, constructs a range proof for Co, selects a nonce rd and offset od, calculates the kernel public key K = Co + Cc - Ci - (oc + od)*G, calculates the aggregated nonce R = Rc + rd*G and constructs a partial signature sd = rd + (ko-od)*Hs(R,K), then sends Co, the range proof, Rd = rd*G, od and sd back to Carol.
  3. Carol completes the signature as s = sd + rc + (kc-ki-oc)*Hs(R,K) and can finally construct the transaction consisting of the input Ci, the outputs Co and Cc with the associated range proofs, the kernel K, kernel signature (R,s) and the offset o = oc + od.

3.2.3 Verification

The resulting transaction can be easily validated by checking:

  1. Co + Cc - Ci ?= K + o*G
  2. s*G ?= R + K*Hs(R,K)

The signature verification in step 2 serves both as a balance proof (K doesn't contain any coins) and the proof of ownership.

3.2.4 Cut-through

The MW protocol allows for the removal of spent outputs and the inputs of all transactions while preserving the security properties. Each historical transaction only leaves behind the kernel, which is about 96 bytes at the 128-bit securit level.

3.2.5 Efficiency analysis

The main points that make MW so efficient are the following:

  • The use of the blinding factor to both obscure the values of the outputs and to authorize the spending.
  • Unspent outputs consisting of the absolute minimum amount of data (only a commitment + range proof).
  • The removal of spent outputs.
  • The kernel signature being an aggregated signature over all of the involved private keys.

3.2.6 Privacy

MW is more private than BCT. Since transaction inputs and outputs are only linked indirectly via the kernel public key and the kernel signature doesn't sign any output data, MW provides "accidental" privacy by enabling non-interactive coinjoin.

3.2.7 Payment proof

The MW blockchain doesn't include any provisions for payment proofs. If the Carol requires a payment proof, it must be provided by Dave in step 2 of the interactive protocol. The validity of the payment proof must be bound to the existence of the transaction kernel on the blockchain to resolve disputes in the case Carol doesn't complete step 3 or if Dave spends the output.

4. Towards a non-interactive protocol

4.1 Goals

The main design goals are the following:

  1. Fully non-interactive transactions.
  2. Both supply security and ownership security .
  3. Unconditional payment proofs.
  4. Privacy equivalent to the interactive MW protocol.
  5. Smallest possible blockchain.
  6. Fastest possible verification.

4.2 Simulating the receiver

For Carol to be able to construct a MW-like transaction non-interactively, she must be able to derive all values from step 2 of §3.2.2 that are provided by Dave in the interactive protocol. This can be achieved using the DH key exchange. The shared DH key can be used for the following:

  1. Generating the blinding factor rd for Dave's output commitment. The blinding factor must be known to Carol to enable her to construct the required range proofs.
  2. Encrypting the value vo and other data associated with the payment.
Observation 1
Because the blinding factor is known by both Carol and Dave, it is no longer sufficient as a proof of ownership of the output.

In order to restrict the ability to spend the output to Dave, the transaction output needs to include an output key that only Dave knows the private key for.

4.3 Addresses

To enable non-interactive transactions, we need to introduce addresses. For privacy reasons, Dave's output received from Carol should be bound to a unique one-time public key that is cryptographically unlinkable to Dave's public address.

Additionally, we want to enable Dave to generate a virtually unlimited number of different unlinkable addresses so that noone can trace his online activities.

We can achieve both of these properties with a variant of the subaddress scheme used by Monero. [9]

4.3.1 Address generation

Dave can generate a pair of private keys a and b (in practice, both keys can be derived from the same secret seed). a is called the private view key and can be used to recognize payments. b is the private spend key and is needed to spend outputs. Separating the viewing and spending ability allows the private view key to be revealed to an auditor without risking a loss of funds.

Whenever Dave needs an address, he can select an integer index i and generate the associated public address (Ai, Bi) as:

mi = Hs(Taddr, a, i)

Ai = a*(mi + b)*G

Bi = (mi + b)*G

Notice that Ai = a*Bi for any index i. This allows Dave to calculate a DH shared secret using the same private key a for all his addresses.

4.3.2 Output construction

Let's suppose that Carol and Dave have agreed in advance on an amount v to be sent and Dave has provided his public address (A, B) constructed according to §4.3.1 for some index i unknown to Carol.

  1. Carol can begin constructing the output for Dave by selecting a 128-bit nonce n uniformly at random. We will later explain how this nonce can be used for payment proof.
  2. Carol then calculates a unique sending key s = Hs(Tsend, A, B, v, n)
  3. Carol derives a shared secret t = Hd(Tderive, s*A).
  4. Carol constructs a one-time public key for Dave: Ko = Hs(Toutkey, t)*G + B. Only Dave knows the corresponding private key.
  5. Carol feeds the shared secret t into a stream cipher to derive a blinding factor r and two encryption masks mv and mn.

Now Carol has all the pieces needed to form an MJ output:

field description size
Co=r*G+v*H amount commitment 32
range proof for Co 576
Ko one-time output key 32
Ke=s*B key exchange public key 32
t[0] view tag 1
v ⊕ mv encrypted amount 8
n ⊕ mn encrypted nonce 16
total 697

The "view tag" is the first byte of the shared secret t. This idea was first proposed for Monero and can reduce the time to test output ownership by at least 15% [10].

4.3.3 Output recognition

Dave can do the following for each unspent output in the blockchain:

  1. Derive the nominal shared secret t' = Hd(Tderive, a*Ke)
  2. If t'[0] != t[0], this output is not owned by Dave.
  3. Calculate the nominal spend key B = Ko - Hs(Toutkey, t)*G
  4. If B is not equal to any of Dave's generated spend keys, this output is not his.
  5. Feed the shared secret t into a stream cipher to derive the blinding factor r and two decryption masks mv and mn to decrypt the amount v and nonce n.
  6. Check that Co ?= r*G + v*H
  7. Look up the public view key A associated with B.
  8. Calculate Carol's sending key: s = Hs(Tsend, A, B, v, n)
  9. Check that s*B ?= Ke

Note that none of these steps requires the private spend key b, so they can be done by an auditor on behalf of Dave. 99.6% of non-owned outputs will abort early in step 2 thanks to the view tag. Steps 7-9 are needed in order to prevent the Janus attack that can be used to link addresses that have the same private view key [11]. If the final check fails, Dave should not confirm the receipt of the payment and should not spend the output.

4.4 Ownership security after cut-through

To achieve an efficient blockchain, we'd like to cut-through spent outputs. However, transaction kernels don't prove ownership anymore because the blinding factors are shared between the sender and the receiver. Therefore, we need to retain some witness data for each spent output so that:

  1. Ownership security can be retrospectively verified.
  2. A payment proof can be constructed even after off-chain cut-through.
Observation 2
For each spent input, the input key Ki and the associated signature must be kept in the blockchain forever.

It would be nice to save blockchain space and put one aggregated signature with all the input keys into the transaction kernel. Unfortunately, this is not possible to do securely without explicitly linking inputs to the kernel (and thus revealing common input ownership). Since any of the Ki values may be rogue keys, the keys and signatures would have to be aggregated in a way that commits to all of the input keys in advance. The reader should refer to the Bellare-Neven multisignature scheme for details [12]. Note that MW doesn't have this limitation because the range proofs show that none of the output commitments is a rogue key, but this cannot be done non-interactively.

The best we can do is to keep Ki and a half-aggregated Schnorr signature separately for each transaction input.

However, this still has 2 problems:

  1. What message should the input keys be used to sign? We cannot sign the outputs, since they are going to be cut-through after being spent (additionally, this would create an explicit link from inputs to outputs). If we sign a static message, this makes the transaction malleable and any malicious network participant can swap the transaction outputs for their own.
  2. How can future verifiers be convinced that the key Ki in the list of spent inputs is indeed the key that was in the cut-through output?

This leads to the following observations:

Observation 3
The signatures with the input keys Ki must be cryptographically bound to the newly created outputs.
Observation 4
To enable cut-through, the signatures with the input keys Ki must be verifiable without access to the original transaction outputs.

We can achieve this without introducing linkability by homomorphically binding the spent inputs to the kernel and the kernel to the transaction outputs.

4.4.1 Homomorphic hash commitments

Given a transaction output outj, we can define an output identifier IDj = Hd(Tout, outj) with the corresponding output key Kj.

We can bind the kernel to all transaction outputs by introducing a new field in the kernel, the output hash-commitment Z:

Z = Hp(ID1, K1) + Hp(ID2, K2) + ... + oZ*G

where + is addition in group 𝔾 and oZ is a random offset. Note that unlike the sum of ordinary hashes modulo 2256, the Wagner's algorithm cannot be used to generate a set of valid outputs given a commitment Z, because it would require solving the discrete logarithm problem in 𝔾 [13].

The homomorphic property of the hash commitment delinks the kernel from the outputs. When two transactions are coinjoined, the offsets oZ are added together modulo q and it is no longer possible to infer which kernels commit to which outputs, but the sum of the kernels will still commit to all of the outputs. The same idea is used by the Elliptic Curve Multiset Hash [14].

The commitment Z can be signed by the kernel public key K to make transactions non-malleable.

4.4.2 Binding transaction inputs to the kernel

In MW, the kernel public key is defined as:

KMW = Co1 + Co2 + ... - Ci1 - Ci2 - ... - oK*G

where Co are the output commitments, Ci the input commitments and oK an offset.

In MJ, we have the kernel and a set of input Schnorr signatures (Ri1, si1), (Ri2, si2), ... with the input public keys K1, K2, ...

We can bind the inputs to the kernel by defining the kernel public key as:

KMJ = KMW - Ri1 - Ri2 - ...

Carol can still sign with this public key because she knows the private keys of all of the nonces Ri. The balance equation still holds because all of the nonces have the form Ri = ri*G, which is proved by the input Schnorr signatures.

4.4.3 Verifiable input signatures

We can achieve verifiability by definining a spent input as a tuple (IDi, Ki, Ri, si), where (Ri, si) is a Schnorr signature of IDi with the input key Ki. The si values can be aggregated on a per-block basis, so each input will asymptotically require only 96 bytes of data to be kept in the blockchain.

Thanks to the homomorphic output commitment Z in each transaction kernel, verifiers can be convinced that all spent inputs correspond to previously created outputs by checking:

ΣKERN Zk ?= ΣTXI Hp(IDi, Ki) + ΣUTXO Hp(IDo, Ko) + oZ*G

where ΣKERN is a sum over all transaction kernels, ΣTXI sum over all transaction inputs and ΣUTXO sum over all unspent transaction outputs. Notice that checking this equation doesn't require the access to spent historical outputs.

4.5 Minglejingle transaction

An MJ transaction consists of:

  • A set of inputs, with each input being the tuple (IDi, Ki, Ri, si), where IDi may refer to the hash of an unspent output in the blockchain.
  • A transaction kernel consisting of the kernel public key K, the hash commitment Z and a signature (RK, sK) of the value Z.
  • One or more transaction outputs. The output format is described in §4.3.2.
  • Two offsets oK and oZ.

The two offsets and the values of si can be aggregated for the whole block, so the size of a 2-in 2-out transaction is 1714 bytes. A spent transaction will leave 320 bytes in the blockchain (128 bytes for the kernel and 96 bytes per input).

4.6 Validation rules

In order to verify the whole transaction history, a node needs:

  • All block headers.
  • The total money supply μ. This can be implicit from the block height or derived from the block headers.
  • The list of spent inputs (IDi, Ki, Ri) for each block.
  • All transaction kernels (K, Z, RK, sK)
  • All unspent outputs.

Compared to Bitcoin, spent outputs are not needed for validation.

The five validation rules are the following:

  1. The aggregated signature sagg in each block is valid for all the spent inputs in that block.
  2. All kernel signatures are valid.
  3. The supply balance holds: ΣUTXO Co ?= μ*H + ΣKERN Kk + ΣTXI Ri + oK*G
  4. The input-output balance holds: ΣKERN Zk ?= ΣTXI Hp(IDi, Ki) + ΣUTXO Hp(IDo, Ko) + oZ*G
  5. All unspent outputs have valid range proofs.

Similar rules apply for mempool transaction validation except the input signatures are not aggregated yet and the μ*H term is replaced with the sum of the spent input commitments ΣTXI Ci. The values of Ci can be looked up in the list of UTXOs based on IDi. Note that some values of IDi may refer to non-existing inputs, which indicates that off-chain cut-through has occured. Such inputs count as having a value of 0 (but they still must have valid signatures).

4.7 Payment proof

To prove that she sent money to Dave, Carol can provide an arbiter with:

  • Dave's address (A, B)
  • The transaction amount v
  • The one-time nonce n

The arbiter can verify Carol's claim by:

  1. Calculating the sending key s = Hs(Tsend, A, B, v, n)
  2. Deriving the shared secret t = Hd(Tderive, s*A).
  3. Constructing the corresponding one-time public key for Dave: Kd = Hs(Toutkey, t)*G + B

The arbiter can then check if an input or an output exists in the blockchain with the key Kd.

4.7.1 A spent input exists

If a spent input exists with this key, the arbiter is convinced that the payment has taken place. Dave's signature with the key Kd indisputably proves that Carol's output was correctly formed.

4.7.2 An unspent output exists

If an unspent output exists with the output key Kd, additional checks are needed to ensure that the output is correctly formed. The arbiter can follow the output construction procedure from §4.3.2 and validate the output. If the output is malformed (e.g. the amount commitment doesn't have the correct value, the view tag or the encrypted nonce are invalid etc.), blame can be assigned to Carol.

5. Security analysis

5.1 Linking addresses to a wallet

Given two addresses (Ai, Bi), (Aj, Bj) generated according to §4.3.1, deciding whether they belong to the same wallet requires solving the decisional Diffie-Hellman problem in 𝔾 [15]. This problem is considered intractable if the used elliptic curve has a large embedding degree, as is the case for most non-pairing-friendly curves such as ed25519.

An active attack may be attempted by Mallory if she suspects that (Ai, Bi), (Aj, Bj) are both owned by Dave. This requires her to deviate from the output construction process described in §4.3.2 by calculating the shared secret as t = Hd(Tderive, s*Ai), setting the key exchange public key to Ke=s*Bi and the output key to Kd = Hs(Toutkey, t)*G + Bj. However, Mallory won't be able to provide a nonce value n that passes the check in step 9 of §4.3.3, so Dave will be able to detect this attack and won't confirm the payment to Mallory. Mallory will not be able to provide a payment proof because in fact, she created an invalid output to address (Ai, Bj).

5.2 Linking outputs to an address

Linking an output to an address requires either solving the discrete logarithm problem by calculating the receiver's private view key a or by guessing the sender's nonce value n (assuming the amount is known to the attacker). Both of these problems are intractable.

5.3 Breaking supply security

There are three approaches to breaking the balance equation §4.6.3:

  1. Creating a valid Schnorr signature using H as the public key
  2. Creating a valid Schnorr signature using H as the challenge nonce
  3. Creating a valid range proof for a negative value.

Approaches 1. and 2. require solving the DLP in 𝔾, which is intractable. The third approach is intractable if the range proof is sound.

5.4 Breaking ownership security

There are three approaches for stealing an unspent output (IDo, Ko), assuming the attacker knows the blinding factor of the output amount Co:

  1. Forging a signature with the key Ko.
  2. Calculating the discrete logarithm of Hp(IDo, Ko), replacing the output with (IDattack, Kattack), adjusting the blockchain offset oZ and performing a 51% attack to reorganize the blockchain beyond the cut-through horizon.
  3. Generating an output (IDattack, Kattack) such that Hp(IDattack, Kattack) = Hp(IDo, Ko), replacing the original output and performing a 51% attack to reorganize the blockchain beyond the cut-through horizon.

All of these approaches involve solving intractable problems (the second and third are also impractical).

5.5 Replay attacks

MJ is susceptible to the following replay attack:

  1. Mallory sends an output (ID1, K1) to Dave.
  2. Some time later, Dave spends the output in transaction TX2.
  3. Mallory again sends an identical output (ID1, K1) to Dave. This requires Mallory to use the same sender nonce and amount as before.
  4. Anyone can replay the transaction TX2 to spend this output the same way Dave spent the first copy of the output.

A similar attack applies to Mimblewimble [16]. Solving this attack may not require consensus changes because Dave will be able to detect the duplicate output and won't provide any goods or services to Mallory for the payment. If Mallory tries to dispute the payment, the arbiter will clearly see the duplicate output keys in the blockchain, which can only occur in one of two cases (besides a negligible chance of a collision):

  1. Mallory deliberately sent the duplicate payment.
  2. Mallory is a victim of a longer chain of replayed transactions.

In the first case, the fault lies with Mallory. In the second case, someone else caused the replay attack, in which case Mallory actually didn't make the payment. In both cases, the arbiter may conclude that Dave is not obligated to provide any goods or services to Mallory.

5.6 Forging payment proofs

Given an output key Ko in the blockchain, Mallory can try to forge a payment proof with a value of v to the address (A,B):

  1. If Mallory knows the private keys for Ko and B, she can forge the proof by calculating a hash preimage of ko - b, another hash preimage to get the shared key, solving for the discrete logarithm of the Diffie-Hellman point with respect to A and finally, the last hash preimage to get the sending nonce n.
  2. If Mallory doesn't know the private key for Ko or B, forging the proof requires one additional discrete logarithm calculation.

In both cases, forging the proof involves solving intractable problems.

6. Efficiency and usability analysis

6.1 Initial blockchain download size

The table below lists the initial blockchain download (IBD) size of MJ in comparison with other protocols at the 128-bit security level. This comparison assumes a blockchain with 600 million transactions with 2 inputs and 2 outputs each and 70 million unspent transaction outputs. For simplicity, we only count bare transaction sizes without block headers, transaction headers, explicit fees, lock times and aggregatable fields.

protocol 2-2 TX size STXO size IBD size (GB) IBD size (visual)
Mimblewimble 1376 -640 100 **
Bitcoin 280 168 ***
Minglejingle 1714 -697 241 *****
Bitcoin with CT 1016 610 ************
Monero 1800 1080 **********************

6.2 Verification performance

The following table lists the number of Schnorr signatures and BP+ range proofs (in millions) that need to be verified for full IBD validation. Note that performance-wise, a range proof verification is equivalent to about 100 signatures. Protocols are ordered in ascending order of computational requirements.

protocol signatures rangeproofs total CPU (visual)
Bitcoin 600 0 .
Bitcoin with CT 600 <701 *****
Mimblewimble 600 70 ******
Minglejingle 1800 70 *******
Monero 480002 600 ***************************************************...***************

1 Range proofs are aggregated and it's only necessary to verify the range proof of transactions with at least one unspent output.

2 Assuming CLSAG with a ring size of 11 counting as 40 Schnorr signatures based on available benchmarks.

6.3 Usability and privacy

protocol NITX NICJ CT BLINK TLINK
Bitcoin X
Mimblewimble X X X
Minglejingle X X X X
Bitcoin with CT X X
Monero X X X X
  • NITX: non-interactive transactions
  • NICJ: non-interactive coinjoin
  • CT: hidden transaction amounts
  • BLINK: blocks don't link inputs to outputs
  • TLINK: transactions don't link inputs to outputs

References

[1] Mimblewimble: https://github.com/mimblewimble/docs/wiki/MimbleWimble-Origin

[2] MW one-sided transactions: https://github.com/mimblewimble/grin-rfcs/blob/a2d9f25cdccf29904ef241df5b608ca3fd16ed61/text/0000-onesided.md

[3] One-sided transactions in MW: https://github.com/DavidBurkett/lips/blob/master/lip-0004.mediawiki

[4] MW non-interactive transactions: https://eprint.iacr.org/2020/1064

[5] Bulletproofs+: https://eprint.iacr.org/2020/735

[6] Diffie-Hellman: https://www.cs.utexas.edu/~shmat/courses/cs380s/dh.pdf

[7] Schnorr signatures: https://eprint.iacr.org/2012/029.pdf

[8] Ristretto: https://ristretto.group/what_is_ristretto.html

[9] Monero subaddresses: https://monerodocs.org/public-address/subaddress/

[10] Monero view tags: https://github.com/monero-project/research-lab/issues/73

[11] Janus attack: https://web.getmonero.org/2019/10/18/subaddress-janus.html

[12] Multisignatures: https://cseweb.ucsd.edu/~mihir/papers/multisignatures-ccs.pdf

[13] Wagner's algorithm https://www.math.uni-frankfurt.de/~dmst/teaching/SS2017/wagner.pdf

[14] Elliptic Curve Multiset Hash https://arxiv.org/abs/1601.06502

[15] Decisional Diffie-Hellman: https://en.wikipedia.org/wiki/Decisional_Diffie%E2%80%93Hellman_assumption

[16] Grin replay attack: https://forum.grin.mw/t/enforcing-that-all-kernels-are-different-at-consensus-level/7368

@DavidBurkett

This comment has been minimized.

Copy link

@DavidBurkett DavidBurkett commented Jan 26, 2021

Great write-up! I'm still working through this in detail, but I noticed a minor issue: Commitments and public keys are usually represented with 33 bytes, not 32.

@tevador

This comment has been minimized.

Copy link
Owner Author

@tevador tevador commented Jan 26, 2021

Commitments and public keys are usually represented with 33 bytes, not 32.

That's true for the secp256k1 curve used by bitcoin. It uses a 256-bit field, so 257 bits are need to store a compressed point (256 bits for the x coordinate and 1-bit parity of the y coordinate). That requires 33 bytes as you mentioned.

However, the ed25519 elliptic curve uses a 255-bit field, so compressed points fit into 256 bits. See for example: RFC 8032.

@DavidBurkett

This comment has been minimized.

Copy link

@DavidBurkett DavidBurkett commented Jan 26, 2021

You're right, I forgot we were dealing in ed25519. False alarm :)

@DavidBurkett

This comment has been minimized.

Copy link

@DavidBurkett DavidBurkett commented Jan 26, 2021

In section 4.3.2, Carol derives the shared secret by calculating t = Hd(Tderive, s*A). You indicate that Carol should include Ke=s*B in the output. Then in section 4.3.3, for Dave to derive the nominal shared secret, you have: t' = Hd(Tderive, a*Ke).

But that doesn't line up. t' would be Hd(Tderive, a*Ke) == Hd(Tderive, a*s*B), whereas t was Hd(Tderive, s*A) == Hd(Tderive, s*a*G).

t != t', because Hd(Tderive, a*s*B) != Hd(Tderive, s*a*G). So did you mean for the key exchange public key to be Ke=s*G instead?

@tevador

This comment has been minimized.

Copy link
Owner Author

@tevador tevador commented Jan 26, 2021

Thanks for the review.

As per §4.3.1, Dave's address (A,B) consists of the following 2 public keys:

A = a*(mi + b)*G

B = (mi + b)*G

where a, b are Dave's private keys, mi = Hs(Taddr, a, i) and i is the index of the address in Dave's wallet.

Note that Carol doesn't know the values of i, mi, a*G or b*G. This is the trick that makes different Dave's addresses unlinkable to him off-chain, which is the same property that Mimblewimble has due to not having addresses. Dave can use the private view key a to recognize payments for all his addresses.

So the shared secret calculated by Carol is s*A = s*a*(mi + b)*G and Dave calculates a*Ke = a*s*B = a*s*(mi + b)*G. These elements are equal.

@mrekucci

This comment has been minimized.

Copy link

@mrekucci mrekucci commented Jan 26, 2021

Do you plan to do a reference implementation?

@tevador

This comment has been minimized.

Copy link
Owner Author

@tevador tevador commented Jan 26, 2021

Do you plan to do a reference implementation?

The protocol needs to be peer-reviewed first. But I think anyone should be able to implement it based on the specification.

@DavidBurkett

This comment has been minimized.

Copy link

@DavidBurkett DavidBurkett commented Jan 27, 2021

As per §4.3.1, Dave's address (A,B) consists of the following 2 public keys:

A = a*(mi + b)*G
B = (mi + b)*G

Indeed it does. I'm sorry, I was skipping around a bit when reviewing, and made faulty assumptions about what A and B were (I assumed each were just a public key generated from the seed with no other relationship between the two).

This is actually really clever since it eliminates the need to check all outputs once for each stealth address you've used. Is this how Monero implements stealth addresses?

@tevador

This comment has been minimized.

Copy link
Owner Author

@tevador tevador commented Jan 27, 2021

Is this how Monero implements stealth addresses?

Yes. Monero calls them "subaddresses" (they start with 8 in base58) and they are used together with the standard "main" address (a*G, b*G), which starts with 4 in base58. However, Monero's implementation is vulnerable to the Janus attack (ref. 11), so using subaddresses may affect privacy in some situations.

The difference is that Minglejingle is not vulnerable to the Janus attack and it doesn't use the "main" address (a*G, b*G) at all, which simplifies the implementation (avoids having to signal to the sender whether to use G or B as the base point for the key exchange) and is also better for UX (only one address type).

@garyyu

This comment has been minimized.

Copy link

@garyyu garyyu commented Jan 29, 2021

A nice write-up! 👍 Here

Here is part of my comments/questions :-)

  • In "6.1 Initial blockchain download size", you count a 2-2 TX size of MW as 1376, and MJ as 1714.
    But actually a MW 2-2 TX at least has: 2*33+2*(33+675)+105=1587 bytes, and plus 32 bytes if counting the offset, and plus 3*8=24 bytes if counting the vector size field, and plus several additional bytes if counting some type flags. How do you count this 1376?

  • And for your MJ, if I understand correctly, you add Z(33 bytes) in Kernel, add K(33 bytes) in each Output, add signature(64 bytes) in each Input, add K(33 bytes) (duplicated to spending Output K) in each Input, and add O_z (32 bytes) for each transaction. So, a 2-2 TX in MJ add total 33+2*(64+33)+2*33 = 293 bytes, plus 32 bytes if counting the additional offset O_z. How do you count the 1714? (1714-1376=338).

  • And regarding the "A spent transaction will leave 320 bytes in the blockchain (128 bytes for the kernel and 96 bytes per input)." in "4.5 Minglejingle transaction", 128 looks missing the count of transaction fee element?

  • About the "2.6.4 Non-interactive half-aggregation" and "The two offsets and the values of s_i can be aggregated for the whole block", Does the "half-aggregation" and the "aggregation" have the same security assumption? It's appreciated to give the reference if the half-aggregation has the related paper or proofs.

  • About the STXO size. You list -640 for MW and -697 for MJ. But in my view, it should be 708=33+675 for MW and 33 more for MJ. Could you please clarify these numbers?

@tevador

This comment has been minimized.

Copy link
Owner Author

@tevador tevador commented Jan 29, 2021

Thanks for the comments!

1

In "6.1 Initial blockchain download size", you count a 2-2 TX size of MW as 1376, and MJ as 1714.

1376 for MW consists of 2 inputs (2*32 bytes), 1 kernel (96 bytes) and 2 outputs (2*608 bytes). Note that I'm assuming the use of the ed25519 elliptic curve (see this comment), so group elements are serialized as 32 bytes. For the range proof, I'm assming the use of Bulletproofs+, which require 576 bytes for a single proof (see ref. 5).

As for fees, offsets, flags etc., these are not counted in the IBD calculations as mentioned in §6.1:

For simplicity, we only count bare transaction sizes without block headers, transaction headers, explicit fees, lock times and aggregatable fields.

The reason I'm not counting fees and headers is because the sizes can vary considerably depending on the implementation. For example, Grin currently encodes the transaction fee with 8 bytes in the kernel, but that could be reduced to just 2-4 bytes with a more efficient encoding since the fee value doesn't need to support all possible monetary values. In fact, allowing too many choices for the plaintext fee value can be detrimental for privacy.

In other words, the sizes in this article are the theoretical protocol-level minimums.

2

And for your MJ, if I understand correctly, you add Z(33 bytes) in Kernel, add K(33 bytes) in each Output, add signature(64 bytes) in each Input, add K(33 bytes) (duplicated to spending Output K) in each Input, and add O_z (32 bytes) for each transaction. So, a 2-2 TX in MJ add total 33+2*(64+33)+2*33 = 293 bytes, plus 32 bytes if counting the additional offset O_z. How do you count the 1714? (1714-1376=338).

1714 for MJ consists of 2 inputs (2*96 bytes), 1 kernel (128 bytes) and 2 outputs (2*697 bytes). The output fields are described in §4.3.2, which explains the size of 697 bytes. Offsets are not counted in the IBD size as they can be aggregated per block.

3

And regarding the "A spent transaction will leave 320 bytes in the blockchain (128 bytes for the kernel and 96 bytes per input)." in "4.5 Minglejingle transaction", 128 looks missing the count of transaction fee element?

Fees are excluded from the calculation (explained above). In practice, it would probably be around 330 bytes with fees and the support for time-locked outputs.

4

About the "2.6.4 Non-interactive half-aggregation" and "The two offsets and the values of s_i can be aggregated for the whole block", Does the "half-aggregation" and the "aggregation" have the same security assumption? It's appreciated to give the reference if the half-aggregation has the related paper or proofs.

I couldn't find a formal proof, but it's not a new idea. I'm fairly sure it can be proven secure in the random oracle model.

5

About the STXO size. You list -640 for MW and -697 for MJ. But in my view, it should be 708=33+675 for MW and 33 more for MJ. Could you please clarify these numbers?

640 for MW refers to the removal of the input (32 bytes) and the spent output (608 bytes). For MJ, only the output can be removed at 697 bytes. See above for the explanation of the sizes.

@DavidBurkett

This comment has been minimized.

Copy link

@DavidBurkett DavidBurkett commented Feb 4, 2021

I don't know if bulletproofs+ supports this, but for classic bulletproofs over secp256k1, we have the ability to hide some additional data (about 31 bytes) into the bulletproof[1][2]. Perhaps you could use that to hide the amount and nonce and save these 24 bytes:
image

Instead of generating the nonce mask and value mask with the stream cipher, you could use the cipher to generate the secret key used to hide the bulletproof data.

[1] https://github.com/mimblewimble/secp256k1-zkp/blob/5b39fbf7f1e16e126404791923511723c4c0876d/include/secp256k1_bulletproofs.h#L181
[2] https://github.com/mimblewimble/secp256k1-zkp/blob/5b39fbf7f1e16e126404791923511723c4c0876d/include/secp256k1_bulletproofs.h#L134

@kaidiren

This comment has been minimized.

Copy link

@kaidiren kaidiren commented Mar 14, 2021

wow,cool idea。Is there any new development?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment