@dominictarr
Last active April 26, 2023 04:28
Writing out part of the Tahoe LAFS paper in my own words to check my understanding.

Tahoe LAFS

Tahoe LAFS is a distributed file system with an interesting permissions model. (whitepaper) Both immutable and mutable files are supported; mutable files are the more complex and interesting case. There are three levels of permission: Write, Read, and Verify. Each permission is granted by giving a user a special key called a "capability". With the Write capability you can update the file. With the Read capability you can retrieve the plain text. With only the Verify capability you can validate the file's integrity but not read its contents.

The lower-level capabilities are generated deterministically from the higher-level capabilities. So someone who has the Write capability can generate the Read capability and give it to someone else, who will then be able to read the plain text of that file.

The Tahoe LAFS paper describes two methods of implementing this 3-layer capability model. I'll just discuss the first one.

Methods

Here are some methods I'll use to describe the system. The specific cryptographic algorithms used for each are given in the paper.

// hash a text (or binary blob)

   sum = hash(text)

// encrypt a text/blob

   ciphertext = encrypt(key, plaintext)

// sign a text/blob.

   signature = sign(private_key, text)

// generate a random salt

   salt = random()

// generate the public key to a given private key

   public_key = public(private_key)

// verify that a blob is signed

   signed = verify(sig, public_key, blob)

// upload a file + metadata

   upload(id, tuple)

Write capability

Each file is associated with a private key (called the "signing key" in the paper), and each update to that file must be signed with that private key. (Note: the private key represents write access to the file, not the user.) The private key is encrypted and stored with the file metadata.

Given a public-private key pair, the various capabilities are generated like this:

write_cap = hash(private_key)

verify_cap = hash(public_key)

//the read key is comprised of the hash(write_cap), and the verify_cap

read_cap = {hash(write_cap), verify_cap}
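The one-way derivation can be sketched in Python. This is a minimal illustration under my own assumptions: SHA-256 stands in for the paper's hash, random bytes stand in for a real key pair, and the helper names `derive_caps` and `attenuate` are made up for this example.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    """Stand-in for the paper's hash function (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()

def derive_caps(private_key: bytes, public_key: bytes) -> dict:
    """Derive all three capabilities from a key pair, as in the pseudocode above."""
    write_cap = h(private_key)
    verify_cap = h(public_key)
    read_cap = (h(write_cap), verify_cap)
    return {"write": write_cap, "read": read_cap, "verify": verify_cap}

def attenuate(write_cap: bytes, verify_cap: bytes) -> tuple:
    """What a writer hands to a reader: computed from write_cap, never the reverse."""
    return (h(write_cap), verify_cap)

# Placeholder key material: random bytes, not a real signature key pair.
private_key = secrets.token_bytes(32)
public_key = secrets.token_bytes(32)
caps = derive_caps(private_key, public_key)
```

Because a hash is one-way, holding `caps["read"]` gives no practical route back to `caps["write"]`, while anyone holding the write cap can recompute the read cap with `attenuate`.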

writing

To write a new file, the writer generates a key pair, encrypts the file contents and the private key, signs the encrypted file + metadata, and then uploads everything.

{public_key, private_key} = generate_pair()

write_cap = hash(private_key)

verify_cap = hash(public_key)

read_cap = {hash(write_cap), verify_cap}

salt = random()

cipherkey = encrypt(hash(write_cap + salt), private_key)

ciphertext = encrypt(hash(read_cap + salt), plaintext)

sig = sign(private_key, hash(ciphertext + salt))

cipher_public_key = encrypt(hash(verify_cap + salt), public_key)

upload(file_id, {cipherkey, ciphertext, salt, sig, cipher_public_key})

The purpose of encrypting with hash(read_cap + salt) is that each version of the file is encrypted with a different key, which protects against certain attacks. (This isn't mentioned specifically in the paper; I'm reading between the lines.)
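Here is a small Python sketch of the per-version key idea. The cipher is a toy (SHA-256 run in counter mode and XORed with the data), chosen only so the example runs with the standard library; it is not the cipher from the paper, and `read_cap` is just random placeholder bytes.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only, not a vetted cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += h(key + counter.to_bytes(8, "big"))
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt  # XOR with the same keystream undoes itself

read_cap = secrets.token_bytes(32)  # placeholder capability
plaintext = b"same contents, two versions"

# Two versions of the file, each with a fresh salt and therefore a fresh key.
salt1 = secrets.token_bytes(16)
salt2 = secrets.token_bytes(16)
ciphertext1 = encrypt(h(read_cap + salt1), plaintext)
ciphertext2 = encrypt(h(read_cap + salt2), plaintext)
```

With this toy cipher, reusing one key for both versions would produce identical ciphertexts for identical contents, and XORing two versions together would cancel the keystream. A fresh salt per version avoids both. Whether these are the attacks the paper's authors had in mind is my guess.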

In the paper, the ciphertext is hashed with a Merkle tree, so that it is possible to spread the file across multiple servers. However, it's not necessary to consider this in order to understand the capability/permissions model.

verify

Verification is also the first step in reading, so I'll explain it first.

The verifier downloads the encrypted file + metadata and then verifies the public key and the signature.

{cipherkey, ciphertext, salt, sig, cipher_public_key} = download(file_id)

//reconstruct the public key

public_key = decrypt(hash(verify_cap + salt), cipher_public_key)

if(verify_cap != hash(public_key))
  throw INVALID

if(!verify(sig, public_key, hash(ciphertext + salt)))
  throw INVALID

If no exceptions were thrown, the file is valid.
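The two checks can be tried out in Python. This is a sketch under loud assumptions: the signature scheme is a toy Lamport one-time signature built from SHA-256 (picked because it needs only the standard library, and it must not be reused across versions the way a real signing key would be), the public key is stored unencrypted for brevity, and `serialize` and `check` are names I made up.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Toy Lamport one-time signature: a stand-in for the paper's signature scheme.
def generate_pair():
    private_key = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public_key = [(h(a), h(b)) for a, b in private_key]
    return public_key, private_key

def message_bits(message: bytes):
    digest = h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(private_key, message: bytes):
    # Reveal one secret per bit of the message digest.
    return [private_key[i][bit] for i, bit in enumerate(message_bits(message))]

def verify(sig, public_key, message: bytes) -> bool:
    bits = message_bits(message)
    if len(sig) != len(bits):
        return False
    return all(h(sig[i]) == public_key[i][bit] for i, bit in enumerate(bits))

def serialize(public_key) -> bytes:
    return b"".join(a + b for a, b in public_key)

def check(verify_cap, public_key, sig, ciphertext, salt) -> bool:
    """The two checks from the verify step: key matches the cap, signature matches the data."""
    if verify_cap != h(serialize(public_key)):
        return False
    return verify(sig, public_key, h(ciphertext + salt))

public_key, private_key = generate_pair()
verify_cap = h(serialize(public_key))
salt = secrets.token_bytes(16)
ciphertext = b"opaque encrypted bytes"
sig = sign(private_key, h(ciphertext + salt))
```

Note that `check` never touches the plaintext or any decryption key for the contents, which is the point of the Verify capability.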

reading

Continue from the end of the verify step: reconstruct the key and then decrypt the file.

plaintext = decrypt(hash(read_cap + salt), ciphertext)

update

Finally, to update a file, the updater only needs to know the write_cap, as they can use it to decrypt the private_key.

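//we only need the cipherkey and the salt from the current version.
{cipherkey, _, salt, _, _} = download(file_id)

//reconstruct the private_key, then re-derive the other keys and capabilities from it
private_key = decrypt(hash(write_cap + salt), cipherkey)
public_key = public(private_key)
verify_cap = hash(public_key)
read_cap = {hash(write_cap), verify_cap}

salt2 = random()

//everything keyed on the salt must be re-encrypted under salt2
cipherkey2 = encrypt(hash(write_cap + salt2), private_key)
ciphertext2 = encrypt(hash(read_cap + salt2), plaintext2)
cipher_public_key2 = encrypt(hash(verify_cap + salt2), public_key)

sig2 = sign(private_key, hash(ciphertext2 + salt2))

upload(file_id, {cipherkey2, ciphertext2, salt2, sig2, cipher_public_key2})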
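The key-recovery part of an update can be sketched in Python. Same caveats as before: the cipher is a toy SHA-256 keystream, the private key is random placeholder bytes, and `wrap_key` / `unwrap_key` are hypothetical helper names. In this sketch the private key is also re-wrapped under the new salt, since the next updater will only see the new salt.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += h(key + counter.to_bytes(8, "big"))
        counter += 1
    return out[:length]

def encrypt(key: bytes, data: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

decrypt = encrypt  # XOR with the same keystream undoes itself

def wrap_key(write_cap: bytes, salt: bytes, private_key: bytes) -> bytes:
    return encrypt(h(write_cap + salt), private_key)

def unwrap_key(write_cap: bytes, salt: bytes, cipherkey: bytes) -> bytes:
    return decrypt(h(write_cap + salt), cipherkey)

# State left behind by the original writer.
private_key = secrets.token_bytes(32)  # placeholder for a real signing key
write_cap = h(private_key)
salt = secrets.token_bytes(16)
cipherkey = wrap_key(write_cap, salt, private_key)

# The updater holds only write_cap and what the server returns.
recovered = unwrap_key(write_cap, salt, cipherkey)

# New version: fresh salt, key re-wrapped so the next updater can unwrap it.
salt2 = secrets.token_bytes(16)
cipherkey2 = wrap_key(write_cap, salt2, recovered)
```

A toy XOR cipher does not raise on a wrong key; it just yields garbage, so a real implementation would want to check the recovered key against the verify_cap before signing with it.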

further reading

The paper describes how Tahoe LAFS splits each file across multiple servers. This improves robustness, and it also means that a server can't get away with returning a stale version of a mutable file, because it won't agree with the other servers (assuming most of the servers are not in collusion).
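To make the "won't agree with the other servers" point concrete, here is a tiny Python sketch of a client taking a majority vote over what the servers report. This is only my reading of the idea, not the mechanism from the paper; `pick_version` and the digest strings are invented for the example.

```python
from collections import Counter

def pick_version(responses):
    """Return the version reported by a strict majority of servers, else None."""
    if not responses:
        return None
    version, count = Counter(responses).most_common(1)[0]
    return version if count > len(responses) / 2 else None

# Five servers; one of them answers with a stale version.
responses = ["v2-digest", "v2-digest", "v1-digest", "v2-digest", "v2-digest"]
current = pick_version(responses)
```

A single stale or lying server is outvoted here. If no version has a strict majority, the function returns `None` rather than guessing.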

Conclusion

Tahoe has a clever model for implementing permissions where the server is not actually in a position of authority. Clients can always verify that what the server gave them was correct, and clients can also delegate permissions to other clients without ever having to rely on the servers as referee, other than to run the protocol correctly!

@scttnlsn

The verify_cap should equal the hash of the hash of the public key, right?

if (verify_cap != hash(hash(public_key)))
  throw INVALID
