sr-gi/towers-accumulators.md

## towers-accumulators.md

      
    Raw
  

              towers-accumulators.md
            
          
    Watchtowers

At a high level, the gist of watchtowers is pretty straightforward: every single time you update your channel, you share revocation information with the tower such that, if you happen to be offline sometime in the future, the tower will have you covered. This is done literally for every payment (actually twice per payment, but that is sort of irrelevant for now), so it is easy to see how storage will easily be built up on the tower side.
If you start digging a bit deeper into how towers actually work, a question is likely to arise sooner than later: how can I be sure the watchtower will act on my behalf? And that is, indeed, a pretty damn good question which answers is likely to disappoint you: you cannot. To sheer some light into why it is worth checking what is the tower's "job":
In a nutshell, the tower should simply look for channel breaches onchain and respond to them if seen. In order to do so, the tower receives data from the user in the form of a locator and the revocation data that is supposed to be used if a channel gets closed with an old commitment. Do not worry about what this data looks like for now, it is enough to understand that the tower will be able to identify a channel breach using the locator, and it will be able to react to the breach using the revocation data.
Therefore, the tower has three main jobs (or a job composed of three tasks):

Store the data provided by the user
Watch the Bitcoin network for breaches
Respond to a breach using the corresponding revocation data

For the two first tasks, it would be fairly trivial to design a protocol that pools or tests the tower to check if it is holding the data the user has sent to it and if it is detecting triggering conditions (i.e. channel breaches) to respond with that data. However, these tests would be futile, given they are not giving us any actual guarantee that the tower will ever respond to a breach in a real case scenario.
The reasoning behind this issue is that there is no incentive for the tower to be honest, or seeing this from the opposite perspective, there is no punishment for the tower for misbehaving. In fact, the user does not even have proof that the data was ever sent to the tower.
So, what do we do? Is this it? Are we hopeless at this point? Fortunately, it is not.
The main problem with the aforementioned system is that the tower is not being held accountable for any of its actions, so the user is mainly blindly trusting in the good faith of the service. This may be fine as long as you are running this for and by yourself, or maybe for some friends.  However, it is easy to see how this won't be of much use for third-party services which you may (and should) not trust. So what is the solution to that? Accountable towers (** here comes the magic of cryptography **)
Plugging in accountability

Following one of the main mantras of Bitcoin, it would be great if you could not just trust the towers by verifying that they are doing their job, or be able to call a tower off if it is not so others do not trust the service anymore. Turns out we do have the means to do that by re-using some of the tools that we may be already familiar with: digital signatures and public key cryptography.
If instead of just sending data to a tower from a user we do exchange signed proof that this exchange has happened we can prove later on that:

The user did indeed send some data to be watched
The tower agreed on watching that data

This means that both parties will need to have a public key known by the counterparty which will be used as identity, and a secret private key that will be used to sign the requests. The generated signature will be sent alongside the data and kept by each party.
The Eye of Satoshi

For the Eye of Satoshi, we are currently following the aforementioned accountable mode in were signatures are exchanged and kept by both ends on every interaction. Now is time to actually show how the data exchanged between the user and the tower looks like:
                                ┌── commitment_txid[:16]
                                │
                   ┌────────────┼─────┐
                   │            │     │  ┌─── encrypt(penalty_tx, sha256(commitment_txid))
                   │  locator ──┘     │  │
                   │                  │  │
appointment ─────► │  encrypted_blob ─┼──┘
                   │                  │
                   │  to_self_delay   │
                   │                  │
                   └──────────────────┘

The user sends an appointment for every channel updated, containing, most importantly, a locator and an encrypted_blob. The locator is derived from the just revoked commitment transaction id, while the encrypted blob is the penalty transaction spending from the revoked commitment, encrypted using the commitment transaction id as key. In this way, a tower is able to identify a breach by its locator by just looking at the transaction that the underlying bitcoin node is receiving, and it is only able to decrypt a blob if the corresponding commitment transaction is seen.
The signature scheme used by the user and the tower works as follows:
appointment_hash = sha256(locator|encrypted_blob|to_self_delay)
user_signature = sign(appointment_hash, user_sk)
tower_signature = sign(user_signature, tower_sk)

The user simply signs the hash of the serialization of the appointment and sends that to the tower alongside it, committing in this way to the data. If the tower agrees on watching this piece of data, it will simply return a signature of the user's signature, which already committed to the data it has received.
This, however, has an easy-to-see consequence on both ends: the required storage will increase. For the tower, which was already storing a considerable amount of data on behalf of the user this may be less of an issue, but for the user, this is undesirable, given we now need to store signed proof of interaction for every single payment we perform in our channel. The current storage for both sides looks as follows:
UUID = ripemd160(locator|user_id)

tower
-----
UUID:(appointment, start_block, user_id, user_signature)

user
----
(locator, tower_id):(start_block, user_signature, tower_signature)

Data deletion

Accountability also comes with an interesting challenge, how do we manage data deletion?
Under general constraints (not using accountability), data deletion will be pretty straightforward: the user requests something to be deleted and the tower potentially does so.
However, in accountable mode, it is not that easy. If data deletion is requested, the tower needs proof of that in order to delete the data, otherwise, the user could claim it misbehaved by not having some data that it has agreed on having. However, that means that in order to delete data, the tower needs to keep other data. This makes aligning the incentives for users and towers to delete data way more difficult.
So we currently have two issues when using accountable towers: the storage of the state by both parties and the deletion of parts (or all) of it on request. An excellent solution would be able to accumulate all that state in a single item that both parties can agree on, and just kept that single accumulated piece of data alongside a signature on the client side, so storage is not just low, but constant.
Accumulators

The goal of using an accumulator is twofold:

Being able to prove data deletion without having to store a record of every deletion
Try to reduce the current storage requirements, especially on the client-side

Both sides will need to store the accumulator root, and its' signature by the counterparty.
Accumulate locators

Pros:


The client may not need to keep a list of locators. This could, potentially, be computed by the node on a channel close and be provided to the watchtower client, then, it will decide how to request data deletion for it to the tower (e.g. delayed deletion, deletion in batches, ...)

Even if we do keep a list of locators, we can just keep the funding_tx:locators pair, there's no need to keep a signature for everyone


The tower does not need to store any signature by the user, just the signature of the accumulated state. If a user wants the tower to prove a certain locator is in the accumulator it can easily do so, and given the user has signed the root of the accumulator it means he had knowledge of what the state was.


When a tower misbehaves (by not sending the penalty to a corresponding breach) it is trivial to get the corresponding locator (and prove where it came from). It is derived from the breach transaction id found on chain.


Cons:


Locators do not fully commit to the data provided by the user to the tower. They only commit to a revoked commitment_tx, but not the provided penalty. This means that the tower could claim the encrypted blob contained junk, and given signatures of every appointment are not kept anymore, the user won't be able to prove the tower is lying.

Accumulate appointment hashes

Pros:


The client doesn't need to store the locators nor the tower signatures from any state but the accumulator root, given here the accumulated data actually commits to the data bundle sent to the tower (i.e. to both the locator and the encrypted_blob).

Cons:


Asking for data deletion means the client needs to regenerate every single state on channel close, because appointment deletion requires the appointment_hash. This is something a node implementation may not be OK with doing.

A workaround for this is for the tower client to store the locator:appointment_hash pair for every submitted state, this way updates can also easily be handled: if an item is updated, the tower can provide the user proof that an old item has been deleted, alongside the bits to recreate the old item (that will hash to appointment_hash) and a proof that the new item has been added.
Another alternative (given LN nodes do not use updates whatsoever) is to store a funding_tx:appointment_hashes pair, where the latest is a list of all appointment hashes. This way, on channel close, the user knows what it needs to request the tower to delete without needing to query the backend whatsoever.


Proving misbehavior doesn't seem to be trivial though. The user cannot regenerate an appointment hash from data on chain, given the penalty is missing. Also, even if appointment hashes are kept (alongside the corresponding locators) a user cannot prove without the help of the tower that the locator:appointment_hash pair actually holds**.

** You may be asking yourself why does not the node simply re-recreate the penalty transaction itself, well it may not be as easy:
the penalty transaction generated when the data was sent to the tower may have different fees than the ones obtained when querying the backend to generate the proof (assuming this can be done), therefore the encrypted_blob will be different and the signatures won't match. If we assume a penalty can be gotten from the backend on demand,  given a commitment_txid (or a shachain depth) and a feerate, then we could just also keep that data around (or brute-force it until we find a match).
Accumulate both

A solution with two accumulators, one for locators and one for appointment_hashes may also work. For this, every time an item is added to the locators accumulator another has to be added to the appointment_hashes accumulator. Both roots are signed.
Pros:

If a breach is not responded to, a user can request a tower to prove it had some data by just using the locator (which can be obtained from a non-responded penalty). Given both accumulators are linked, the only valid outcomes are:

The tower cannot (or refuses to) provide the appointment data, in which case it is misbehaving
The tower can provide the requested data, but it cannot prove the data is in the appointment_hashes accumulator, in which case it is misbehaving
The tower can provide the requested data and can prove it is part of the appointment_hashes accumulator, in which case the user was wrong.


Cons:

We may need to store, for every channel, a locator, appointment_hash pair for every updated state so we are able to request deletions. Otherwise, we need to rely on the backend to regenerate all this on a channel close, which is unlikely to be implemented.

The storage requirements for this are not that bad though, this is 48 bytes per appointment, so ~46MiB per 1 million updates. This is a ~30% reduction on the current storage: 68 bytes per appointment, storing the locator:signature pair.


Accumulator assumptions

We have been assuming that our accumulator can:

Append/delete data given only the accumulated state
Generate proofs of membership
Generate proof of non-membership

The latest turns out not to be true, to the best of my knowledge, for hash-based additive accumulators (like utreexo), we may need to either keep two accumulators, one for additions and one for deletions or use something more generic (and storage intensive), like a sparse Merkle tree
Considerations


We currently delete appointments that are triggered but yield a wrong data decryption. This covers both decrypting junk, and a false positive trigger (a locator collision). We should not do so, otherwise, we won't have the data we'll be challenged with, so that would mean misbehavior
This protocol only allows users to challenge the tower during their subscription period, given after that the tower will wipe all data. This is something to have in mind.