RobinLinus/sats4files.md Secret

## sats4files.md

      
    Raw
  

              sats4files.md
            
          
    Sats4Files: Decentralized File Hosting based on Lightning Payments

This protocol aims to align the monetary incentives for decentralized file hosting, such as for Nostr relays; or for p2p file sharing networks like BitTorrent; or a decentralized content delivery network (CDN), in the context of concepts to decentralize the web such as web5.
Given a file with identifier file_id. The following scheme allows a client to request file_id from a random untrusted server, which responds with file for a payment in coins. The trade is atomic.
The server encrypts the file such that the client can verify that they can decrypt the file if they learn the secret of key S. The client purchases this decryption key in a smart contract, e.g., a PTLC on the Lightning Network, which makes the trade atomic. A malicious client, that already knows some chunks of the file, should not be able to break the scheme and decrypt the rest of the file without knowing the secret of S. For each request of any client, a unique decryption key S must be chosen for Lightning Payments to be secure.
The scheme described here requires about 𝒪( file_size / 32 bytes ) many curve point operations. Unfortunately, that is still too slow to serve lots of clients at once. However, further optimizations might be possible.
The original scheme is much faster, but it was broken in the sense that a malicious client can decrypt the file if they know any chunk of it.
Intro

This scheme is based on the idea that in the prover-verifier setting we can perform operations on an elliptic curve 𝙃 "in the scalar" of another elliptic curve 𝙂.
This is possible because the prover can act as Diffie-Hellman oracle by proving the required multiplications to the verifier in the scalar.
The Implicit Representation

We choose some secure curve 𝙃, defined over 𝔽_n, which is the scalar field of the curve 𝙂. Any point P = (x,y) ∈ 𝙃 can be represented "in the implicit representation" using ❬P❭ := (xG, yG). The crux of being able to perform curve point operations in the implicit representation is to prove multiplications in the scalar. This is possible using a proof of multiplication, as described in the appendix.
When 𝙂 is secp256k1 then an obvious choice for 𝙃 would be secq256k1 (sic!). The equation for secq256k1 is simply
y^2 = x^3 + 7 (mod n)

wheras n is the order of secp256k1. Surprisingly, the order of secq256k1 is p, the order of the base field of secp256k1.
Setup


The author shares the file with the server
The author splits up the file into 32-byte chunks: m_1, ..., m_n
The author computes file_id = Hash( m_1 G | ... | m_n G ) and shares it with the client
Choose random generators of group 𝙃: H_1, ..., H_n known to server and client

Encryption

The server performs the following steps to convince the client

Send adaptor point S = sG
Send public nonces T_1 = t_1 G, ..., T_n = t_n G
Prove equality of scalar for S = sG and T_1 = ❬s H_1❭, ..., T_n = ❬s H_n❭ in the implicit representation. So t_1, ..., t_n are x-coordinates of points on 𝙃, which are hidden in scalars of points on 𝙂
Send c_1 = m_1 + t_1, ..., c_n = m_n + t_n

Verification

Given T_1, ..., T_n, and c_1, ..., c_n, and S. The client performs the following steps to verify the server's claim

Compute m_1 G = c_1 G - T_1, ..., m_n G = c_n G - T_n
Verify file_id == Hash( m_1 G | ... | m_n G )
Verify the proof of Equal Scalars for S = sG, and T_1 = ❬s H_1❭, ..., T_n = ❬s H_n❭ in the implicit representation

Decryption

The client performs the following steps to purchase and decrypt the file

Buy s in a PTLC using S
Compute the points s H_1, ..., s H_n. These points' x-coordinates are the t_1, ..., t_n
Compute m_1 = c_1 - t_1, ..., m_n = c_n - t_n

Appendix

Proof of Multiplication

Given G, xG, yG and xyG. The prover wants to prove to the verifier that x and y have been multiplied indeed.
Prove


Chose r randomly.
Compute R_1 = rG and R_2 = r (yG)
Compute c = Hash( R_1 | R_2 )
Compute s = r + c * x

Verify

Given R_1, R_2, and s

Compute c = Hash( R_1 | R_2 )
Verify sG == R_1 + c (xG)
Verify s (yG) == R_2 + c (xyG)

Proof of Point on Curve in the Implicit Representation

Given a curve point in the implicit representation ❬P❭ = (xG, yG). The prover wants to convince the verifier that the point is on the curve 𝙃 indeed.
y^2 G == x^3 G + a xG + bG

This requires one proof of multiplication for y*y == y^2 and two to compute x*x*x == x^3.
Proof of Curve Point Addition in the Implicit Representation

Given three curve points in the implicit representation
❬P❭ = (x_1G, y_1G)
❬Q❭ = (x_2G, y_2G)
❬R❭ = (x_3G, y_3G)

The prover wants to prove to the verifier in the implicit representation that for these three curve points the equation
❬P❭ + ❬Q❭ == ❬R❭

holds.
For two distinct points we can naively apply the point addition formula:
l = (y_2 - y_1)/(x_2 - x_1)
x_3 = l**2 - x_1 - x_2
y_3 = l*(x_1-x_3) - y_1

This requires one proof of multiplication for each of the three equations. All additions in the equation can be easily performed in the scalar by the verifier.
Proving the inversion in the first equation can be done cheaply. The prover provides the result l as a hint
l = (y_2 - y_1)/(x_2 - x_1)

and then proves the multiplication
l * (x_2 - x_1) == (y_2 - y_1)

It might be possible to batch all three proofs of multiplication into one, because they all multiply with l.
Proof of Equal Scalars

Given generators G and H of two different groups, 𝙂 and 𝙃, and some point xG and xH on each curve. The prover wants to prove to the verifier that the scalar of xG and xH is equal.
Prove


Chose r randomly.
Compute R_1 = rG and R_2 = rH
Compute c = Hash( R_1 | R_2 )
Compute s = r + c * x

Verify

Given R_1, R_2, and s.

Compute c = Hash( R_1 | R_2 )
Verify sG == R_1 + c (xG)
Verify sH == R_2 + c (xH)

Proof of Equal Scalars in the Implicit Representation

Given generators G and H of two different groups, 𝙂 and 𝙃, and some point xG on one curve and a point in implicit representation, ❬xH❭, on the other curve. The prover wants to prove to the verifier that the scalar of xG and ❬xH❭ is equal. We slightly modify the conventional protocol.
Prove


Chose r randomly.
Compute R_1 = rG and ❬R_2❭ = ❬rH❭
Compute c = Hash( R_1 | ❬R_2❭ | xG | ❬xH❭ )
Compute s = r + c * x
Prove implicitly ❬s H❭ = ❬R_2❭ + c * ❬xH❭ ( "prove implicitly" means "prove in the implicit representation" )

Verify

Given R_1, ❬R_2❭, and s.

Compute c = Hash( R_1 | ❬R_2❭ | xG | ❬xH❭ )
Verify implicitly sG == R_1 + c * (xG)
Verify implicitly ❬s H❭ == ❬R_2❭ + c * ❬xH❭

This requires the prover to prove a point addition and a scalar multiplication in the implicit representation.
Optimizations


Using a twisted Edwards curve for 𝙃 might be slightly more efficient.
We can batch multiple such proofs into a single s and public nonces ❬R_1❭, ..., ❬R_n❭
Use a digest size smaller than 32 bytes for c. This way the scalar multiplication becomes cheaper.

Batched Proofs of Equal Scalars in the Implicit Representation

Given generator G of group 𝙂 and a set of independently random generators H_1, ..., H_n of group 𝙃. The input is some point xG and a set of points in implicit representation, ❬xH_1❭, ..., ❬xH_n❭, on the other curve. The prover wants to prove to the verifier that the scalar x of xG and all ❬xH_i❭ are equal. We can batch that to make it more efficient for the prover.
Setup


Compute the sum of the generators H = H_1 + ... + H_n known to both prover and verifier

Prove


Chose r randomly.
Compute R_1 = rG and ❬R_2❭ = ❬rH❭
Compute c = Hash( R_1 | ❬R_2❭ | xG | ❬xH❭ )
Compute s = r + c * x
Prove implicitly ❬xH❭ == ❬xH_1❭ + ... + ❬xH_n❭
Prove implicitly ❬s H❭ == ❬R_2❭ + c * ❬xH❭

Verify

Given R_1, ❬R_2❭, and s.

Compute c = Hash( R_1 | ❬R_2❭ | xG | ❬xH❭ )
Verify sG == R_1 + c * (xG)
Verify implicitly ❬xH❭ == ❬xH_1❭ + ... + ❬xH_n❭
Verify implicitly ❬s H❭ == ❬R_2❭ + c * ❬xH❭

The batched proving time is linear in the number of points n. In total, the scheme requires n+1 implicit additions, but only a single implicit multiplication.
If it is true that one cannot compute any xH_i from xH, then xH can be public and ❬s H❭ == ❬R_2❭ + c * ❬xH❭ can be proven in the plain setting simply with s H == R_2 + c * xH. This saves the implicit multiplication.
Ideas for Optimizations


Can the server precompute a generic encryption for each file once and then derive a specific encryption in a faster way for specific requests?
We can provably map discrete logarithms between groups. Can we map the problem to a smaller group to make it faster? Can we make security tradeoffs for low-value files?