Sats4Files is a protocol for decentralized file hosting. It allows users to request data from untrusted servers and, upon receiving the encrypted data, to pay for the decryption key via Lightning. The exchange is atomic: the client obtains the data only if they pay, and the seller gets paid only if they deliver the data. The protocol is an efficient form of verifiable encryption, similar in spirit to verifiable secret sharing with Shamir's scheme.
This scheme is simplified and does not fully solve the problem, as discussed in the Limitations section. This writeup is intended to spark interest in solving the remaining issues beyond the fixes we suggest.
The client wants to buy from the server the file corresponding to a particular `file_id`.
Here, we assume we have PTLCs on Lightning instead of HTLCs. That means we can buy a discrete logarithm over Lightning: a payment atomically reveals the scalar `t` behind a public point `tG` (a toy sketch follows).
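To make that assumption concrete, here is a minimal sketch of "buying a discrete log": the seller's secret scalar plays the role of the preimage, and the point it generates plays the role of the payment hash. The group below is a small multiplicative stand-in for secp256k1; all names and parameters are illustrative, not part of the protocol above.

```python
import secrets

# Toy stand-in group: "points" are powers of g mod p, "scalars" live mod q.
# A real deployment would use secp256k1; these parameters are toy values.
q = 1019           # group order (toy size!)
p = 2 * q + 1      # 2039, a safe prime
g = 4              # generator of the order-q subgroup of Z_p*

def G(x: int) -> int:
    """Stand-in for the elliptic-curve operation x*G."""
    return pow(g, x % q, p)

# The seller picks a secret scalar t and publishes the payment point T = tG.
t = secrets.randbelow(q)
T = G(t)

# The buyer locks a payment to T. Settling the PTLC forces the seller to
# reveal t, and the buyer can check that they really learned the dlog of T.
revealed_t = t                 # what PTLC settlement hands to the buyer
assert G(revealed_t) == T
```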
- Chunk up the `file` into `n` 32-byte chunks `chunk_1`, `chunk_2`, ..., `chunk_n`.
- Compute the Lagrange polynomial `f(x)` using `f(1) = chunk_1`, `f(2) = chunk_2`, ..., `f(n) = chunk_n`.
- Evaluate `f` at `n` out-of-domain (OOD) points to obtain `f(n+1)`, `f(n+2)`, ..., `f(n+n)`.
- Send to the client `n-1` of those points: `f(n+1)`, `f(n+2)`, ..., `f(n+n-1)`. The client learns nothing about `f` yet, because they need one more point to reconstruct the polynomial.
- Send the last OOD point encrypted: `f(n+n)G`. This is the 'preimage' used in our PTLC.
- The client computes the other `n-1` encrypted OOD points `f(n+1)G`, `f(n+2)G`, ..., `f(n+n-1)G` from the unencrypted points.
- Use Lagrange interpolation with the `n` encrypted OOD points to compute the `n` encrypted chunks `f(1)G`, ..., `f(n)G`. This works because interpolation is a linear combination with publicly computable coefficients, `f(i)G = sum_j lambda_j(i) * (f(n+j)G)`, so it can be carried out directly on the encrypted points (a minimal sketch follows this list).
- Hash the encrypted chunks and verify that the result is the `file_id`.
- The client buys the remaining unencrypted value `f(n+n)` of the encrypted point `f(n+n)G` via Lightning in a PTLC.
- Use Lagrange interpolation with the `n` unencrypted OOD points to compute the `n` unencrypted chunks `f(1)`, ..., `f(n)`.
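Here is a minimal sketch of the interpolation-on-encrypted-points step for `n = 2`, using the same small multiplicative group as above as a stand-in for the elliptic curve (all parameters are toy values; `G` mimics `x*G`). For `n = 2`, the Lagrange coefficients for recovering `f(1)` from `f(3)` and `f(4)` are `3` and `-2`:

```python
import secrets

# Toy group: "points" are powers of g mod p; a stand-in for x*G on secp256k1.
q = 1019; p = 2 * q + 1; g = 4
def G(x): return pow(g, x % q, p)

# A 2-chunk file: f is the line through f(1) = c1 and f(2) = c2.
c1, c2 = secrets.randbelow(q), secrets.randbelow(q)
y3 = (2 * c2 - c1) % q           # f(3), first OOD point
y4 = (3 * c2 - 2 * c1) % q       # f(4), second OOD point

# Interpolating back to x = 1: f(1) = 3*f(3) - 2*f(4)
assert (3 * y3 - 2 * y4) % q == c1

# The same linear combination works on *encrypted* points, in the group:
enc_y3, enc_y4 = G(y3), G(y4)
# exponent q - 2 is -2 mod q, so this computes G(3*y3 - 2*y4)
enc_c1 = pow(enc_y3, 3, p) * pow(enc_y4, q - 2, p) % p
assert enc_c1 == G(c1)           # client recovers f(1)G without learning f(1)
```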
A malicious client who knows even a single chunk can decrypt the file without paying: one known chunk plus the `n-1` unencrypted OOD points gives them `n` points, enough to reconstruct the polynomial. The following modifications fix this and related issues.
- In Lightning, a preimage can safely be used only once. So the server has to reblind the polynomial for each client and each request. Therefore, they always add another random point `f(n+1)` to the domain. (This increases the degree of the polynomial by 1.)
- The chunks `chunk_1`, ..., `chunk_n` are not uniformly random, because they represent, e.g., text files. We can solve that by adding upfront another point to the domain, `f(0) = chunk_0`, which is used as a seed to obfuscate the chunks as `chunk_1 xor H(chunk_0 || 1)`, ..., `chunk_n xor H(chunk_0 || n)`, hiding all structure and making them look uniformly random (see the sketch after this list). The new `file_id` then becomes the hash of `chunk_0`, ..., `chunk_n`.
- The connection between the PTLC and the polynomial should be one-way. That means the client can buy a dlog via the PTLC, but if they can decrypt the polynomial (for example, because they already know the file), they should learn nothing about the PTLC. That ensures they can't tamper with the payment route.
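The obfuscation fix lends itself to a short sketch. A minimal version, assuming SHA-256 as `H`, 32-byte chunks, and an 8-byte big-endian encoding of the index `i` (the index encoding is an illustrative choice, not specified above):

```python
import hashlib
import secrets

CHUNK = 32  # bytes

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def obfuscate(chunks: list[bytes]) -> tuple[bytes, list[bytes]]:
    """Prepend a random seed chunk_0 and mask chunk_i with H(chunk_0 || i)."""
    chunk_0 = secrets.token_bytes(CHUNK)
    masked = [xor(c, H(chunk_0 + i.to_bytes(8, "big")))  # index encoding is illustrative
              for i, c in enumerate(chunks, start=1)]
    return chunk_0, masked

def deobfuscate(chunk_0: bytes, masked: list[bytes]) -> list[bytes]:
    return [xor(m, H(chunk_0 + i.to_bytes(8, "big")))
            for i, m in enumerate(masked, start=1)]

chunks = [b"A" * CHUNK, b"B" * CHUNK]            # highly structured plaintext
chunk_0, masked = obfuscate(chunks)
assert deobfuscate(chunk_0, masked) == chunks    # round-trips
# The new file_id would now commit to chunk_0, ..., chunk_n.
```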
- In general, it makes sense to publish the files in a compressed format to minimize the amount of data that has to be processed by the verifiable encryption.
- It is possible to encrypt multiple files with the same key, such that they can be batched into a single Lightning payment buying them all at once. However, if a malicious client already knows any one of the files, they can reconstruct that file's polynomial and thereby compute the shared decryption key for all of them.
- Larger files can be split into smaller pieces such that the maximum polynomial degree stays bounded and fits consumer hardware easily.
Here's a basic implementation of the scheme.
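Note that the following is a minimal, self-contained Python sketch rather than a production implementation: it uses a tiny multiplicative group as a stand-in for secp256k1, treats chunks as field elements directly (real chunks would be 32-byte values reduced mod the curve order), and simulates the Lightning payment by handing over `f(2n)`. All helper names (`G`, `interp`, `interp_encrypted`) are illustrative assumptions.

```python
import hashlib
import secrets

# --- toy parameters (a real deployment would use secp256k1) ---------------
q = 1019               # scalar field / group order (toy size!)
p = 2 * q + 1          # safe prime: Z_p* has a subgroup of order q
g = 4                  # generator of that subgroup

def G(x: int) -> int:
    """Stand-in for x*G on an elliptic curve."""
    return pow(g, x % q, p)

def lagrange_coeffs(xs, x):
    """Lagrange coefficients lambda_j(x) over the scalar field."""
    out = []
    for j, xj in enumerate(xs):
        num, den = 1, 1
        for k, xk in enumerate(xs):
            if k != j:
                num = num * (x - xk) % q
                den = den * (xj - xk) % q
        out.append(num * pow(den, -1, q) % q)
    return out

def interp(xs, ys, x):
    """Evaluate the polynomial defined by the points (xs, ys) at x."""
    return sum(l * y for l, y in zip(lagrange_coeffs(xs, x), ys)) % q

def interp_encrypted(xs, enc_ys, x):
    """The same interpolation on encrypted points; works because it is linear."""
    r = 1
    for l, ey in zip(lagrange_coeffs(xs, x), enc_ys):
        r = r * pow(ey, l, p) % p
    return r

# --- server ----------------------------------------------------------------
n = 4
chunks = [secrets.randbelow(q) for _ in range(n)]   # the file, as field elements
dom = list(range(1, n + 1))                         # f(i) = chunk_i
ood_xs = list(range(n + 1, 2 * n + 1))              # out-of-domain points
ood_ys = [interp(dom, chunks, x) for x in ood_xs]

# file_id = hash of the encrypted chunks f(1)G, ..., f(n)G
file_id = hashlib.sha256(
    b"".join(G(c).to_bytes(2, "big") for c in chunks)).hexdigest()

to_client_plain = ood_ys[:-1]                       # f(n+1) .. f(2n-1)
to_client_enc = G(ood_ys[-1])                       # f(2n)G, the PTLC payment point

# --- client: verify before paying --------------------------------------------
enc_ood = [G(y) for y in to_client_plain] + [to_client_enc]
enc_chunks = [interp_encrypted(ood_xs, enc_ood, i) for i in dom]
check = hashlib.sha256(
    b"".join(e.to_bytes(2, "big") for e in enc_chunks)).hexdigest()
assert check == file_id                             # ciphertext matches file_id

# --- payment (simulated): settling the PTLC reveals f(2n) --------------------
f_2n = ood_ys[-1]
assert G(f_2n) == to_client_enc                     # client got what they paid for

# --- client: decrypt ----------------------------------------------------------
recovered = [interp(ood_xs, to_client_plain + [f_2n], i) for i in dom]
assert recovered == chunks
```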