
# RobinLinus/sats4files.md

Last active October 9, 2023 21:00

# Sats4Files: Decentralized File Hosting based on Lightning

Sats4Files is a protocol for decentralized file hosting. It allows users to request data from untrusted servers and, upon receiving the encrypted data, pay for the decryption key via Lightning. The exchange is atomic: the user receives the data only if they pay for it, and the seller gets paid only if they deliver the data. The protocol is an efficient form of verifiable encryption, similar to verifiable secret sharing based on Shamir's scheme.

This scheme is simplified and does not fully solve the problem, as discussed in the Limitations section. This writeup is intended to spark interest in solving the remaining issues beyond the fixes suggested here.

## Sats4Files Problem

The client wants to buy from the server the `file` corresponding to a particular `file_id`.

Here, we assume that Lightning supports PTLCs instead of HTLCs, which means we can buy the discrete logarithm of a curve point over Lightning.

## Polynomial Encoding

1. Chunk up the `file` into `n` 32-byte chunks `chunk_1`, `chunk_2`, ... , `chunk_n`
2. Compute the degree-`n-1` interpolation polynomial `f(x)` satisfying `f(1)=chunk_1`, `f(2)=chunk_2`, ... , `f(n)=chunk_n`
3. Evaluate `f` at `n` out-of-domain (OOD) points `f(n+1)`, `f(n+2)`, ... , `f(n+n)`
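The encoding step above can be sketched in a few lines of Python. This is a toy: a real deployment would work in the ~256-bit scalar field of a curve like secp256k1 with 32-byte chunks, whereas the field size `Q` and the example chunks here are illustrative assumptions.

```python
Q = 89  # toy prime field for the chunks; real chunks would live in a ~256-bit field

def lagrange_eval(points, x, q=Q):
    """Evaluate the unique polynomial through `points` at `x`, mod q."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        total = (total + yi * num * pow(den, -1, q)) % q
    return total

chunks = [12, 34, 7]                                   # chunk_1 ... chunk_n
n = len(chunks)
in_domain = [(i, c) for i, c in enumerate(chunks, 1)]  # f(1)=chunk_1, ...
ood = [(x, lagrange_eval(in_domain, x)) for x in range(n + 1, 2 * n + 1)]
print(ood)  # -> [(4, 20), (5, 73), (6, 77)]
```

Note that evaluating `f` directly from the in-domain points avoids ever materializing the polynomial's coefficients.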

## Encryption

1. Send `n-1` of those points, `f(n+1)`, `f(n+2)`, ... , `f(n+n-1)`, to the client in the clear. The client learns nothing about `f` yet, because they need one more point to reconstruct the polynomial.
2. Send the last OOD point in encrypted form: `f(n+n)G`. Its discrete logarithm `f(n+n)` plays the role of the 'preimage' in our PTLC.
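A minimal sketch of these two messages, using a toy multiplicative group `g^x mod p` in place of the curve notation `xG` (the group parameters and the OOD point values below are illustrative assumptions):

```python
# Toy group standing in for elliptic-curve points: g^x plays the role of xG.
# Illustrative parameters: p = 2q + 1, g generates the prime-order-q subgroup.
P, Q, G = 179, 89, 4

# Hypothetical OOD points f(n+1) ... f(n+n) of some chunk polynomial
ood = [(4, 20), (5, 73), (6, 77)]

clear_points = ood[:-1]           # f(n+1) ... f(n+n-1), sent in the clear
x_last, y_last = ood[-1]
blinded_last = pow(G, y_last, P)  # f(n+n)G: the PTLC's adaptor point
```

The server transmits `clear_points` and `blinded_last`; the scalar `y_last` is withheld until the payment.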

## Verification

1. The client computes the other `n-1` encrypted OOD points `f(n+1)G`, `f(n+2)G`, ... , `f(n+n-1)G` from the unencrypted points
2. Using Lagrange interpolation "in the exponent" with the `n` encrypted OOD points, compute the `n` encrypted chunks `f(1)G`, ..., `f(n)G`
3. Hash the encrypted chunks and verify that the hash equals the `file_id`
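The homomorphic step works because `f(x) = Σ f(x_i)·L_i(x)`, so the client can compute `f(x)G = Σ L_i(x)·(f(x_i)G)` without knowing any `f(x_i)`. A toy sketch, with a multiplicative group standing in for curve points (all parameters and example values are illustrative assumptions):

```python
import hashlib

P, Q, G = 179, 89, 4  # toy group: p = 2q + 1, g of prime order q (assumed)

def lagrange_coeffs(xs, x, q=Q):
    """Lagrange basis values L_i(x) mod q for the interpolation points xs."""
    coeffs = []
    for xi in xs:
        num = den = 1
        for xj in xs:
            if xj != xi:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        coeffs.append(num * pow(den, -1, q) % q)
    return coeffs

def exp_interpolate(enc_points, x):
    """Compute g^f(x) from encrypted evaluations (x_i, g^f(x_i))."""
    xs = [xi for xi, _ in enc_points]
    acc = 1
    for (_, gi), li in zip(enc_points, lagrange_coeffs(xs, x)):
        acc = acc * pow(gi, li, P) % P
    return acc

# The client lifts the n-1 clear OOD points into the group and appends the
# server's blinded point (example values for a hypothetical polynomial with
# f(1)=12, f(2)=34, f(3)=7)
enc_ood = [(4, pow(G, 20, P)), (5, pow(G, 73, P)), (6, pow(G, 77, P))]
enc_chunks = [exp_interpolate(enc_ood, i) for i in (1, 2, 3)]
file_id = hashlib.sha256(repr(enc_chunks).encode()).hexdigest()
```

Comparing `file_id` against the advertised one convinces the client that the blinded point really belongs to the requested file.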

## Decryption

1. The client buys, via a Lightning PTLC, the discrete logarithm `f(n+n)` of the encrypted point `f(n+n)G`
2. Using Lagrange interpolation with the `n` unencrypted OOD points, compute the `n` unencrypted chunks `f(1)`, ..., `f(n)`
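Once the last OOD value has been bought via the PTLC, decryption is plain Lagrange interpolation over the now-complete OOD set (toy field and example values are illustrative assumptions):

```python
Q = 89  # toy prime field (a real deployment would use a ~256-bit curve order)

def lagrange_eval(points, x, q=Q):
    """Evaluate the unique polynomial through `points` at `x`, mod q."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        total = (total + yi * num * pow(den, -1, q)) % q
    return total

clear_points = [(4, 20), (5, 73)]  # received up front, hypothetical values
bought_point = (6, 77)             # dlog revealed by the PTLC payment
full_ood = clear_points + [bought_point]
chunks = [lagrange_eval(full_ood, i) for i in (1, 2, 3)]
print(chunks)  # -> [12, 34, 7]
```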

## Limitations and Optimizations

A malicious client who already knows even a single chunk can combine it with the `n-1` clear OOD points to reconstruct the polynomial and decrypt the file without paying. The following modifications address this and related issues.

1. In Lightning, a preimage can safely be used only once. So the server has to reblind the polynomial for each client and each request, which they do by always adding another uniformly random point to the domain. (This increases the degree of the polynomial by 1 and also defeats the single-chunk attack above.)
2. The chunks `chunk_1`, ..., `chunk_n` are not uniformly random, because they represent, e.g., text files. We can solve that by adding another point `f(0)=chunk_0` to the domain up front, where `chunk_0` is a random seed used to mask the chunks as `chunk_1 xor H(chunk_0 || 1)`, ... , `chunk_n xor H(chunk_0 || n)`, hiding all structure and making them look uniformly random. The new `file_id` then becomes the hash of `chunk_0`, ... , `chunk_n`.
3. The connection between the PTLC and the polynomial should be one-way. That means the client can buy a dlog via the PTLC, but even if they can decrypt the polynomial (for example, because they already know the file), they should learn nothing about the PTLC secret. This ensures they can't tamper with the payment route.
4. In general, it makes sense to publish files in a compressed format to minimize the amount of data that has to be processed by the verifiable encryption.
5. It is possible to encrypt multiple files with the same key, such that they can be batched into a single Lightning payment buying them all at once. However, if a malicious client already knows any one of the files, they can compute the decryption key for all of them.
6. Larger files can be split into smaller chunks so that the maximum polynomial degree is bounded and fits consumer hardware easily.
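The masking step from item 2 above can be sketched as follows. The concrete construction (SHA-256 as `H`, a 4-byte big-endian counter, a 32-byte random seed) is an assumption about encoding details the writeup leaves open:

```python
import hashlib
import secrets

def mask(chunks):
    """Mask 32-byte chunks with H(chunk_0 || i) so they look uniformly random."""
    chunk_0 = secrets.token_bytes(32)  # random seed, later encoded as f(0)
    masked = []
    for i, c in enumerate(chunks, 1):
        pad = hashlib.sha256(chunk_0 + i.to_bytes(4, "big")).digest()
        masked.append(bytes(a ^ b for a, b in zip(c, pad)))
    # the new file_id commits to the seed and the masked chunks
    file_id = hashlib.sha256(b"".join([chunk_0] + masked)).hexdigest()
    return chunk_0, masked, file_id

def unmask(chunk_0, masked):
    """Invert the masking once chunk_0 has been recovered from f(0)."""
    out = []
    for i, c in enumerate(masked, 1):
        pad = hashlib.sha256(chunk_0 + i.to_bytes(4, "big")).digest()
        out.append(bytes(a ^ b for a, b in zip(c, pad)))
    return out
```

Because the pads are derived from a fresh random seed, the masked chunks reveal no structure to anyone who has not recovered `chunk_0`.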

Here's a basic implementation of the scheme.
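The following is a minimal end-to-end sketch under toy assumptions: a multiplicative group `g^x mod p` stands in for the curve notation `xG`, the parameters are illustratively small, and the PTLC payment is simulated by simply revealing the last discrete logarithm.

```python
import hashlib

P, Q, G = 179, 89, 4  # toy group: p = 2q + 1, g of prime order q (assumed)

def lagrange_coeffs(xs, x):
    """Lagrange basis values L_i(x) mod Q."""
    coeffs = []
    for xi in xs:
        num = den = 1
        for xj in xs:
            if xj != xi:
                num = num * (x - xj) % Q
                den = den * (xi - xj) % Q
        coeffs.append(num * pow(den, -1, Q) % Q)
    return coeffs

def interpolate(points, x):
    """f(x) from clear evaluations (x_i, f(x_i))."""
    ls = lagrange_coeffs([xi for xi, _ in points], x)
    return sum(yi * li for (_, yi), li in zip(points, ls)) % Q

def exp_interpolate(points, x):
    """g^f(x) from encrypted evaluations (x_i, g^f(x_i))."""
    ls = lagrange_coeffs([xi for xi, _ in points], x)
    acc = 1
    for (_, gi), li in zip(points, ls):
        acc = acc * pow(gi, li, P) % P
    return acc

# --- Server: encode and encrypt ---
chunks = [12, 34, 7]
n = len(chunks)
domain = [(i, c) for i, c in enumerate(chunks, 1)]
ood = [(x, interpolate(domain, x)) for x in range(n + 1, 2 * n + 1)]
clear, (x_l, y_l) = ood[:-1], ood[-1]
adaptor = pow(G, y_l, P)  # PTLC adaptor point f(n+n)G
file_id = hashlib.sha256(repr([pow(G, c, P) for c in chunks]).encode()).hexdigest()

# --- Client: verify before paying ---
enc_ood = [(x, pow(G, y, P)) for x, y in clear] + [(x_l, adaptor)]
enc_chunks = [exp_interpolate(enc_ood, i) for i in range(1, n + 1)]
assert hashlib.sha256(repr(enc_chunks).encode()).hexdigest() == file_id

# --- Client: pay via PTLC (simulated), learn y_l, and decrypt ---
recovered = [interpolate(clear + [(x_l, y_l)], i) for i in range(1, n + 1)]
assert recovered == chunks
```

The reblinding, chunk masking, and one-wayness fixes from the Limitations section are deliberately omitted to keep the sketch short.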