notnotraju/zkhackIVpuzzle2.md

## zkhackIVpuzzle2.md

      
    Raw
  

              zkhackIVpuzzle2.md
            
          
    Write-up for ZK Hack IV puzzle 2

1. BLS signatures

We briefly recall BLS signatures, following the summary given in section 1 of this article (from which we steal much of the notation, the organization, and also the LaTex). Our notation conforms to the notation in the puzzle
Let $$e\colon \mathbb G_1\times \mathbb G_2\rightarrow \boldsymbol{\mu}$$ be a non-degenerate bilinear map, with standard cryptographic assumptions. All groups in question have prime order $r$. In this note, group operations in $\mathbb G_i$ and $\boldsymbol{\mu}$ are written multiplicatively, while the group operation in the group $(\mathbb F_r,+)$ is writtenly additively.
Let $H\colon \mathcal M\rightarrow  \mathbb G_2$  be a hash function from the space of messages $\mathcal M$ to $\mathbb G_2$. Let $g_i$ be a generator of $\mathbb G_i$ for $i=1,2$. BLS works as follows.
Key generation

Sample a random $\alpha\in \mathbb F_r$. Then the private (signing) key is $sk:=\alpha$ and the public (verifying) key is $pk:=g_1^{\alpha}\in \mathbb G_1$. Note that this expression makes sense as $\mathbb G_1$ has size $r$.
Signing

Let $m\in \mathcal M$ be a message. Then the algorithm $\textbf{Sign}(sk, m)$ outputs a single element $\sigma:=H(m)^{sk}\in \mathbb G_2$
Verification

Given $\mathit{pk}$, $m$, and a signature $\sigma$, the verification algorithm $\textbf{Verify}(pk, m, \sigma)$ checks if the following equality in $\boldsymbol\mu$ holds: $$e(g_1,\sigma)=e(pk, H(m))$$(i.e., the algorithm returns 1 if the equality holds and 0 else). Note that this involves computing two pairings, which tend to be computationally expensive.
Note that this algorithm uses the following feature. While groups in which the discrete log problem is hard can be used to check linear constraints in the exponent, the existence of a bilinear form allows us to check quadratic (and hence all higher degree) constraints in the exponents.
Aggregation

Suppose we have $N$ secret (signing) keys $(sk_i)$ and a single message $m$, and we want an algorithm that verifies that each of the $N$ parties have signed $m$. Set $\sigma_i:=\textbf{Sign}(sk_i,m)$. The naive verification algorithm requires $2N$ pairings. Fortunately, there is a straightforward way to aggregate the signatures $\sigma_i$, namely set:

$\displaystyle \sigma_{\text{agg}}:=\prod_{i=1}^N \sigma_i\in \mathbb G_1$
$\displaystyle \mathit{pk}{\text{agg}}:=\prod{i=1}^N pk_i\in \mathbb G_1$
$\displaystyle \mathit{sk}{\text{agg}}:=\sum{i=1}^N sk_i\in \mathbb F_r$

Note that anyone who has access to the $pk$ and the $\sigma_i$ can indeed compute the first two aggregations. Moreover, $g_1^{sk_{\text{agg}}} = pk_{\text{agg}}$.
Then aggregate verification is given by: $$\textbf{Aggregate-Verify}((pk_i),m, (\sigma_i)):=\textbf{Verify}(pk_{\text{agg}},m,\sigma_{\text{agg}}).$$
In particular, aggregate verification will first compute $pk_{\text{agg}}$ and $\sigma_{\text{agg}}$ by multiplying in $\mathbb G_1$ and then run normal verification. It therefore seems as though given $(pk_i)$ and $m$, we only need a single signature $\sigma_{\text{agg}}$ to check that all parties have signed $m$.
2. Naive aggregation has a potential security flaw

This form of aggregation is amenable to a rogue key attack. Namely, suppose I am a crafty and malicious actor, and I want to convince someone that I and a set of parties with public (verification) keys ${pk_i}$ have all signed a given message $m\in \mathcal M$. I can pick any $\beta\in \mathbb F_r$ and declare my "public key" to be $$pk_{\text{fake}}:=g_1^{\beta}\prod pk_i^{-1}.$$ (Note that while this public key of course has a unique corresponding secret key, as $g_1\in \mathbb G_1$ is a generator, I do not know this secret key!)
Then I can "sign" (a.k.a. forge) our joint signature by sending off $\sigma_{\text{fake agg}}:=H(m)^{\beta}=\textbf{Sign}(\beta, m)$. The verifying algorithm above would compute the total aggregate public key (including my fake public key) is $g_1^{\beta}=pk_{\text{fake}}(\prod pk_i)$, and verification would pass.
Moreover, even if the verification algorithm needed to take in all of the individual claimed signatures, I can easily just send fake signatures $\sigma_{i,\text{fake}}\in \mathbb G_2$, with the property that:
$$H(m)^{\beta}=\prod\sigma_{i,\text{fake}},$$
which would mean that the verifier herself computes the aggregate of $\sigma_{i,\text{fake}}$ to be what I want, namely $H(m)^{\beta}$.
Proof of knowledge (or proofs of posession)

Note that my being able to forge a signature relied on me broadcasting a "public key" whose corresponding secret key is unknown to me. Therefore, one potential way of getting around the above security flaw is to force all of the parties to somehow prove that they "know" their corresponding secret keys (without revealing any bits of information about the keys themselves)! There are of course many ways of doing this. An interesting observation of this paper is that one can use the BLS scheme itself to include a proof-of-knowledge, namely: the prover has to broadcast $pk$ as well as the following signature
$$\textbf{Sign}(sk, \langle pk\rangle):=H(\langle pk\rangle)^{sk}\in \mathbb G_2,$$
where $\langle pk\rangle$ refers to a bit encoding of $pk$. (Note that in the definition of the $\textbf{Sign}(-,-)$ function, there was an implicit hash function $H\colon \mathcal M\rightarrow \mathbb G_2$. The hash function in the proof-of-knowledge part could of course be different than the original hash function; more honest notation would be $\textbf{Sign}_H(-,-)$.)
Therefore, there is some class of functions $F\colon \mathcal M\rightarrow \mathbb G_2$ such that the function $sk\mapsto (pk,F(\langle pk \rangle)^{sk})$ furnishes a computationally secure proof of knowledge.
3. The puzzle

The proof-of-knowledge the puzzle provides may be found in this code snippet
fn derive_point_for_pok(i: usize) -> G2Affine {
    let rng = &mut ark_std::rand::rngs::StdRng::seed_from_u64(20399u64);
    G2Affine::rand(rng).mul(Fr::from(i as u64 + 1)).into()
}

#[allow(dead_code)]
fn pok_prove(sk: Fr, i: usize) -> G2Affine {
    derive_point_for_pok(i).mul(sk).into()
}

fn pok_verify(pk: G1Affine, i: usize, proof: G2Affine) {
    assert!(Bls12_381::multi_pairing(
        &[pk, G1Affine::generator()],
        &[derive_point_for_pok(i).neg(), proof]
    )
    .is_zero());
}


In particular, given $sk_i$, the proof of knowledge is $pok_i:=g_2^{sk_i(i+1)}$. The verifier uses a pairing to verify that this is correct, namely, she checks that:
$$e(pk_i, g_2^{i+1})=e(g_1,(g_2^{i+1})^{sk_i})\in \boldsymbol{\mu}.$$
In sum, suppose parties $1,\ldots, N$ have secret keys $sk_i$ and public keys $pk_i$. To sign a message $m$, they each compute their individual signatures $\sigma_i:=\textbf{Sign}(sk_i,m)$ and then also send the element $pok_i:=g_2^{sk(i+1)}$ to prove they indeed know their secret key. Someone computes $\sigma_{\text{agg}}$ (this could be the verifier!) and $pk_{\text{agg}}$ and then the verifier computes $\textbf{Verify}(pk_{\text{agg}},m,\sigma_{\text{agg}})$.
The Attack

Once again, I am crafty and malicious. I pick some $\beta\in \mathbb F_r$ and I claim my public key is $$pk_{fake}:=g_1^{\beta}\prod pk_i^{-1}.$$
Due to the poor choice of proof-of-knowledge function, I can compute the proof of knowledge associated to the (unique but unknown) secret key corresponding to my claimed public key. Indeed: $g_2^{sk_i} = pok_i^{\frac{1}{i+1}}$, which means that
$$g_2^{(-N-1)sk_i} = pok_i^{\frac{-N-1}{i+1}}.$$ (The computations in the exponent are happening in $\mathbb F_r$: I can invert $i+1$ in $\mathbb F_r$ and then multiply by $-N-1$.) Then:
$$pok_{\text{fake}} = g_2^{\beta * (N+1)}\prod pok_i^{\frac{-N-1}{i+1}}.$$
It is a straightforward computation to see this will pass the verification. The key fact is that $pok_{\text{fake}}=g_2^{(N+1)(\beta-\sum{sk_i})}$ and $sk_{\text{fake}}=\beta-\sum{sk_i}$, so $pok_{\text{fake}}=g_2^{(N+1)sk_{\text{fake}}}$.