georgwiese/chaos_theory.md Secret

## chaos_theory.md

      
    Raw
  

              chaos_theory.md
            
          
    The 3rd ZK HACK IV puzzle puzzle is introduced as follows:

Bob designed a new one time scheme, that's based on the tried and true method of encrypt + sign. He combined ElGamal encryption with BLS signatures in a clever way, such that you use pairings to verify the encrypted message was not tampered with. Alice, then, figured out a way to reveal the plaintexts...

Let's dive in!
ElGamal Encryption

Assume we are given a group $\mathbb{G}$ with generator $G \in \mathbb{G}$ and a scalar field $\mathbb{F}$.
Then, ElGamal Encryption (described using additive notation) works as follows:


Key generation: Each party samples a secret key $sk \in \mathbb{F}$ and publishes the public key $PK := sk \times G \in \mathbb{G}$.

Encryption: Given a message $M \in \mathbb{M} \subset \mathbb{G}$ and the receiver's public key $PK_r \in \mathbb{G}$, we sample a random $x \in \mathbb{F}$ and output:

$C_1 := x \times G \in \mathbb{G}$
$C_2 := x \times PK_r + M \in \mathbb{G}$


The key insight is that the term $x \times PK_r$ can be computed by the receiver as follows:
$$
x \times PK_r = (x + sk_r) \times G = sk_r \times C_1
$$
Therefore, the reciever can recover the message as $C_2 - sk_r \times C_1$!
Let's call $x \times PK_r = sk_r \times C_1$ the shared secret between the sender and receiver, because security of the scheme will depend on it being known only by these two parties.
Semantic Security and the DDH Assumption

ElGamal encryption is semantically secure under the Decisional Diffie-Hellman (DDH) Assumption.
Informally, semantic security means that even if an eavesdropper can guess the message, the ciphertext does not help him or her to decide whether this message was encrypted.
The DDH assumption states that given a tuple $(a \times G, b \times G, C) \in (\mathbb{G}, \mathbb{G}, \mathbb{G})$ (for random $a, b \in \mathbb{F}$), it is infeasible to decide whether $C = (a + b) \times G$ or whether it is just a random group element.
If it was, $(a \times G, b \times G, C)$ would be called a DDH triplet.
Let's look into why the scheme is broken if the DDH assumpion does not hold.
For simplicity, let's a assume that the message space $\mathbb{M}$ is small.
Then, the attacker can decrypt the message simply by testing for each possible message $M$:

Compute a candidate shared secret $SEC := C_2 - M$. If $M$ was indeed the message, this value would be equal to $(x + sk_r) \times G$; otherwise, it will be equal to some other value.
Test whether $(PK_r, C_1, SEC)$ is a a DDH triplet. If yes, $M$ is the encrypted message!

Pairings and the DDH assumption

A pairing is a function $e: \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T$ with the property that $e(a \times P, b \times Q) = (a + b) \times e(P, Q)$ for any $a, b \in \mathbb{F}, P \in \mathbb{G}_1, Q \in \mathbb{G}_2$.
Now, if $\mathbb{G}_1 = \mathbb{G}_2$, the existence of an efficiently computable pairing implies that the DDH assumption does not hold: $(a \times G, b \times G, C)$ will be a DDH tripple if and only if $e(a \times G, b \times G) = e(C, G)$.
If $\mathbb{G}_1 \ne \mathbb{G}_2$, breaking the DDH assumption in $\mathbb{G}_1$ is a bit tricker: In addition to $(a \times G_1, b \times G_1, C) \in (\mathbb{G}_1, \mathbb{G}_1, \mathbb{G}_1)$, you'd also need either $a \times P \in \mathbb{G}_2$ or $b \times P \in \mathbb{G}_2$ (for some $P \in \mathbb{G}_2$) to do the same trick. So let's see if Bob thought of that!
BLS signatures

BLS signatures require two groups $\mathbb{G}_1$ and $\mathbb{G}_2$ and a pairing function $e$ as defined above. Additionally, we need a hash function $h: \mathbb{M}_s \rightarrow \mathbb{G}_2$ which maps from the message space to $\mathbb{G}_2$.
The signature scheme works as follows:

The signer samples a secret key $sk \in \mathbb{F}$ and publishes the public key $PK := sk \times G_1 \in \mathbb{G}$

Given a message $m \in \mathbb{M}_s$, the signature is computed as $SIG := sk \times h(m) \in \mathbb{G}_2$

The verifier can check the signature by verifying that $e(PK, h(m)) = e(G_1, SIG)$


Bob's mistake

Bob wants to combine ElGamal encryption with BLS signatures, using the signature scheme to sign the ElGamal cyphertext.
Now, notice that if we're working over pairing friendly groups $\mathbb{G}_1$ and $\mathbb{G}_2$ and we do the ElGamal encryption in $\mathbb{G}_1$, the key generation is exactly the same as that of BLS signatures! Taking advantage of this, Bob implements his scheme as follows:

Each party generates a secret key $sk \in \mathbb{F}$ and public key $PK = sk \times G_1 \in \mathbb{G}_1$. This key pair will be used in both the encryption and signature scheme!
The message $M \in \mathbb{M} \subset \mathbb{G}_1$ is then ElGamal encrypted to $C = (C_1, C_2)$ using the sender's secret key as randomness:

$C_1 := sk_s \times G = PK_s \in \mathbb{G}_1$
$C_2 := sk_s \times PK_r + M \in \mathbb{G}_1$


The BLS signing algorithm is applied to the cyphertext (i.e. $\mathbb{M}_s = (\mathbb{G}_1, \mathbb{G}_1)$):

$SIG := sk_s \times h(C) \in \mathbb{G}_2$


Now, implemented like this, this let's us break the DDH assumption!
Suppose we have guessed the message and computed our candidate shared secret $SEC \in \mathbb{G}_1$, as described above. We can check whether $SEC = (sk_r + sk_s) \times \mathbb{G}_1$ by checking whether:
$$
e(SEC, h(C)) = e(PK_r, SIG)
$$
Note: Another problem with this scheme is that the randomness used in the ElGamal encryption is always the same. This means that if two messages $M^{(1)}$ and $M^{(2)}$ are sent between the same sender and receiver, the difference $C_2^{(1)} - C_2^{(2)}$ equals $M^{(1)} - M^{(2)}$, which leaks information! This is called a two-time pad attack. In this challenge, we are given only one cyphertext, so this doesn't apply.
The exploit

In code, the implementation of Bob's scheme looks like this:
impl Sender {
    pub fn send(&self, m: Message, r: &Receiver) -> ElGamal {
        let c_2: G1Affine = (r.pk.mul(&self.sk) + m.0).into_affine();
        ElGamal(self.pk, c_2)
    }

    pub fn authenticate(&self, c: &ElGamal) -> G2Affine {
        let hash_c = c.hash_to_curve();
        hash_c.mul(&self.sk).into_affine()
    }
}
The main() function loads an encrypted message into the Blob data structure:
#[derive(CanonicalSerialize, CanonicalDeserialize)]
pub struct Blob {
    pub sender_pk: G1Affine,
    pub c: ElGamal,
    pub s: G2Affine,
    pub rec_pk: G1Affine,
}

pub fn main() {
    welcome();
    puzzle(PUZZLE_DESCRIPTION);

    let messages = generate_message_space();

    let mut file = File::open("blob.bin").unwrap();
    let mut data = Vec::new();
    file.read_to_end(&mut data).unwrap();
    let blob = Blob::deserialize_uncompressed(data.as_slice()).unwrap();

    // ensure that blob is correct
    assert!(Auditor::check_auth(blob.sender_pk, &blob.c, blob.s));
    
    // Exploit goes here!
}
The Message space $\mathbb{M} \subset \mathbb{G}_1$ (fully materialized by generate_message_space()) only consists of 10 possible messages. That means we can just try all of them and test them one-by-one!
    for (i, m) in messages.iter().enumerate() {
        // `blob.c.1` was computed as `(r.pk.mul(&self.sk) + m.0).into_affine()`, so if m
        // was the message, the following recovers the shared secret
        let candidate_shared_secret: G1Projective = blob.c.1 - m.0;
        // Compute hash(C)
        let hashed_ciphertext: G2Affine = blob.c.hash_to_curve();

        // If m is the message, this will compute e((sk_s + sk_r) * G_1, hash(C))
        let lhs = Bls12_381::pairing(candidate_shared_secret, hashed_ciphertext);
        // This computes e(PK_s, SIG)
        let rhs = Bls12_381::pairing(blob.rec_pk, blob.s);

        // The two sides are equal iff and only if m is the message!
        if lhs == rhs {
            println!("Found the message index: {i}");
        }
    }
And, indeed, the check passes for one of the messages:
Found the message index: 3

How to fix it

To fix Bob's scheme, I would do two things:

Sample a random value $x$ to compute the secret key in the ElGamal encryption. This would prevent this specific attack if the attacker has neither $sk_r \times P \in \mathbb{G}_2$ or $x \times P \in \mathbb{G}_2$ (for any point $P \in \mathbb{G}_2$). As mentioned above, it would also prevent the two-time pad attack.
Use different keys for the encryption and signature scheme. Re-using keys has the risk of the eavesdropper obtaining $sk_r \times P \in \mathbb{G}_2$, for example if the receiver ever signs a message!