Skip to content

Instantly share code, notes, and snippets.

@bkomuves
Created November 23, 2022 13:18
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bkomuves/d4e773af723b1a3752445d759c3cb1d3 to your computer and use it in GitHub Desktop.
Save bkomuves/d4e773af723b1a3752445d759c3cb1d3 to your computer and use it in GitHub Desktop.
ZK Hack 3 sumcheck puzzle writeup

ZK Hack III, first puzzle (univariate sum-check protocol)

Introduction

The puzzle description was pretty minimal. We can infer the following pieces of information from it:

  • we are given a secret polynomial f, and the ZK argument should prove that the sum of its values on some specific subset is zero
  • we should find a way to cheat, and "prove" this while the sum is actually nonzero; and also keep the actual sum hidden from the verifier

Looking at the source code, we can see the general parameters:

  • it's a univariate polynomial (over the BLS12-381 scalar field, but this won't matter);
  • it has degree 30, and the evaluation domain, on which we calculate the sum, has 16 elements: it's a multiplicative subgroup H of size N = 16 = 24 (so a standard FFT friendly domain);
  • the polynomial commitment scheme is Marlin's batch version of the KZG scheme, but this won't matter either.

So we are looking at a univariate sum-check protocol (as opposed to the more popular multilinear sum-check protocol, as described for example in Thaler's book).

Looking more deeply, the prover is not implemented at all (that's our task!), but the verifier is there, and we shall find an issue in that which we can exploit with a malicious prover.

The verifier looks pretty simple:

  • it receives the statement, which consists of the evaluation domain, a commitment to our polynomial f(x), and the expected sum (which will be just zero in this task)
  • it also receives the proof, which consists of four openings of four polynomials: f, s, h, g, together with commitments to the latter three (f was already commited in the statement), and a batch evaluation proof for the openings.
  • and then it checks the evaluation proofs, and some equation

The sum-check protocol

Ok, time to figure out what exactly this protocol is supposed to be! The linked reference material is not really helpful, but after a few minutes of google-ing, I found this review paper:

which describes (on page 25, section 4.1.1) a univariate sum-check protocol which now looks eerily familiar! Even most of the symbols are the same, though for some extra confusion, what is our g, he calls p, and as a bonus he also uses g for something else.

So how does this protocol work? The simpler, non-ZK version proves that the sum of f(x) over x∈H is a given field element γ ∈ 𝔽. To do this, first divide (using polynomial long division, or in practice, a specialized version of it) our polynomial f(x) by the vanishing polynomial ZH, which is defined as the smallest (nonzero) polynomial which takes zero on all x∈H:

    ZH(x) := ∏h∈H (x-h) = xN - 1

The arkworks function for this division, is helpfully implemented as divide_by_vanishing_poly(), and it computes the quotient h(x) and the remainder r(x):

    f(x) = h(x) ⋅ ZH(x) + r(x)

with deg(r) < N. Now, the first observation is that of course the first term, by definition, vanishes on H, so it doesn't contribute to the sum:

    γ := ∑h∈H f(h) = ∑h∈H r(h)

More interestingly, the higher degree terms of r(x) do not contribute either! That's because of the following fact:

    ∑h∈H hk = 0    for 0 < k < N

Why is this true? Well, H is a multiplicative subgroup, that is, it is generated by some generator u∈𝔽 (the letter g is unfortunately already used):

    H = {   ui   |   i∈{0..N-1}   }

Rewriting the sum, we get

    ∑h∈H hk   =   ∑i=0N-1 (ui)k   =   ∑i=0N-1 (uk)i

but u∈H, hence also all its powers uk too, are N-th roots of unity, whose powers sum to zero (to have some intuition about this, imagine the complex plane instead of finite fields: they just go around the unit circle symmetrically, and sum to zero because of this symmetry). The only exception is when k is divisible by N=16 (which in our case means k=0), in which case each term of the sum is uN = 1, hence the final sum is just N (but interpreted as a field element).

After this little detour, we can get back to our protocol! What we just convinced ourselves about means that only the constant term of the remainder polynomial r(x) actually contributes to the sum! So let's rewrite

    r(x) = c + x ⋅ g(x)

separating the constant term c and the rest. Now our equation looks like this:

    f(x) = h(x) ⋅ ZH(x) + x ⋅ g(x) + c

and the sum is simply:

    γ   =   ∑h∈H f(h)   =   ∑h∈H ( h(x) ⋅ ZH(x) + x ⋅ g(x) + c )   =   0 + 0 + N ⋅ c   =   N ⋅ c

In other words, c = γ/N. Now we are ready to put together the sum-check protocol:

  • the prover computes, using polynomial division, h(x) and g(x) as above
  • commits to these polynomials
  • then the verifier checks the equation f(x) = h(x) ⋅ ZH(x) + x ⋅ g(x) + γ/N
  • by evaluating both sides at a random pont xi∈𝔽 (this works because of the Schwartz–Zippel lemma)

Looking at the verifier source code, it implements almost exactly this, but there is also an additional polynomial s(x), and the equation is modified to

    f(x) + s(x) = h(x) ⋅ ZH(x) + x ⋅ g(x) + γ/N

Cheating using the masking polynomial

The problem description vaguely hints that we should keep the actual sum secret, or else bad things may happen in real life. Presumably the role of the extra polynomial s(x) is to hide information; it's even referred (implicitly) as the "masking polynomial" in the problem description.

However, there is actually no check whatsoever about s(x) in the code! In fact, the only check apart from the above equation is a degree bound check on g(x):

    deg(g) ≤ N - 2

This is a strong hint that we can abuse s(x) to cheat, and indeed we can! The first idea to simply cancel out the real sum γreal by defining s(x) to be the constant polynomial:

    s(x) := - γreal / N

However, with this definition, the verifier can deduce the actual sum, as it evaluates s(xi) = -γreal/N. So this does not yet fully satisfy the puzzle description.

No worries though, we can choose s(x) to be whatever we want, as there are absolutely no extra checks about it in the verifier! For example we can just add some random higher-order terms, masking this information:

    s(x) := - γreal / N + x ⋅ t(x)

    t(x) := ∑i=0N-2 (ti ⋅ xi)

where ti∈𝔽 are random field elements. Of course we have to fix the equation too, but that's easy: just roll the freshly introduced "error terms" into g(x) on the right side, too:

    g'(x) := g(x) + t(x)

Now the equation looks like this:

    f(x) + s(x)   =   f(x) - γreal / N + x ⋅ t(x)   =   h(x) ⋅ ZH(x) + x ⋅ ( g(x) + t(x) ) + γfake / N

with γfake = 0 in our case (but we could fake any other sum, too). Cancelling out t(x) from the two sides, and moving γreal to the other side, we get back the original equation, that is, we were able to cheat:

    f(x) =   h(x) ⋅ ZH(x) + x ⋅ g(x) + γreal / N

We could also introduce even higher order masking terms, and carefully roll them into h(x), but unless s(x) is evaluated at as many points as the size of domain N, this should be enough to hide the information. The problem description hints at s(x) being evaluated twice (as part of a larger of a larger, unspecified protocol), hence this is already more than enough.

Corrected ZK protocol

After solving the problem, one wonders how the correct protocol with masking polynomial would look like? Well, it is very easy to eliminate the above cheat: Just ensure that the constant term of s(x) is zero. The simplest way to do this is to modify the verifier's equation to

    f(x) + x ⋅ s'(x) = h(x) ⋅ ZH(x) + x ⋅ g(x) + γ/N

where s'(x) is the new, "corrected" masking polynomial (replacing the old s). We should also check the degree bounds:

  • deg(g) ≤ N - 2
  • deg(s') ≤ N - 2
  • deg(h) ≤ deg(f) - N

Note that if for example s'(x) would have a nonzero term of degree N-1, then after the multiplication by x it would become a term of degree N, which would contribute a nonzero amount to the sum, again enabling the prover to cheat! This is also the reason why we need all this division by ZH business in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment