@warner
Created December 16, 2014 19:55
Some text from an etherpad about the "servers of happiness" docs, for ticket #1382
pasted to IRC last week:
Suppose your upload achieves a happiness count of C, and your encoding is configured to need a minimum of K shares (aka "shares.needed"). Then immediately after a successful upload, we know we can tolerate the loss of any (C - K) servers.
Another way to state this property is that for *any* subset of shares.happy servers that have shares after a successful upload, only K need to retain them.
We provide shares.happy as a parameter to the upload process so that it can tell whether it succeeded. "Successful" means the uploader was able to place enough shares, in the right places, to reach a happiness count C >= shares.happy. If it cannot meet this definition of success, it aborts the upload rather than finishing with an insufficiently robust placement.
There is another parameter, shares.total (aka "N"), which provides a goal: the uploader will attempt to place N unique shares. But as long as it can achieve shares.happy, it will indicate success.
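The success rule and the loss tolerance described above can be sketched in a few lines. This is an illustration only: `upload_result` and its argument names are hypothetical and are not part of Tahoe's API.

```python
def upload_result(happiness_count, needed, happy):
    """Return (success, tolerated_losses) for a finished placement.

    happiness_count -- C, the happiness count the upload achieved
    needed          -- K, aka shares.needed
    happy           -- shares.happy

    Hypothetical helper for illustration, not Tahoe code.
    """
    # The upload only counts as successful if C >= shares.happy.
    success = happiness_count >= happy
    # Right after a successful upload we can lose any C - K servers.
    tolerated_losses = happiness_count - needed if success else None
    return success, tolerated_losses


print(upload_result(8, needed=3, happy=7))
print(upload_result(5, needed=3, happy=7))
```

With K=3 and shares.happy=7, a placement with C=8 succeeds and tolerates the loss of 5 servers, while a placement with C=5 is reported as a failure.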
tolerance to missing servers during upload: S-H (S = number of servers, H = shares.happy)
tolerance to missing servers during download: at most H-K
max storage used: N/K * filesize
min storage used: H/K * filesize
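As a worked example of the four figures above, here is a small sketch. It assumes S is the number of servers available at upload time and H is shares.happy, and it ignores any per-share overhead. `placement_bounds` is a hypothetical name used only for this illustration.

```python
from fractions import Fraction


def placement_bounds(servers, needed, happy, total, filesize):
    """Compute the tolerance and storage bounds listed above.

    servers  -- S, assumed here to be the number of available servers
    needed   -- K (shares.needed)
    happy    -- H (shares.happy)
    total    -- N (shares.total)
    filesize -- size of the file in bytes
    """
    return {
        "upload_tolerance": servers - happy,        # S - H
        "download_tolerance_max": happy - needed,   # at most H - K
        "max_storage": Fraction(total, needed) * filesize,  # N/K * filesize
        "min_storage": Fraction(happy, needed) * filesize,  # H/K * filesize
    }


bounds = placement_bounds(servers=12, needed=3, happy=7, total=10, filesize=3000)
for name, value in bounds.items():
    print(name, value)
```

For 12 servers, K=3, H=7, N=10 and a 3000-byte file, this gives an upload tolerance of 5 servers, a download tolerance of at most 4 servers, and between 7000 and 10000 bytes of storage.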
THE SIMPLIFIED MODEL:
Tahoe nominally stores exactly one share on each server. In this case, ...
There is a parameter, `shares.needed`, which is how many shares of the file are needed to reconstruct your file. The next parameter, `shares.total`, is how many servers the upload will try to send a share to. The next parameter, `shares.happy`, is how many servers the upload will be satisfied with — if it can't send a different share to each of at least `shares.happy` servers then it will abort the upload.
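A minimal sketch of this simplified model, assuming every server in the list accepts exactly one share. The function and exception names are hypothetical and the real uploader does not look like this.

```python
class UploadUnhappyError(Exception):
    """Raised when fewer than shares.happy servers received a share."""


def simple_upload(servers, happy, total):
    """Place one distinct share on each server, up to `total` shares.

    servers -- iterable of server ids willing to accept a share
    happy   -- shares.happy
    total   -- shares.total (N)

    Returns a dict mapping server id -> share number.
    """
    unique_servers = list(dict.fromkeys(servers))  # drop repeated ids, keep order
    # zip stops at whichever runs out first: the N shares or the servers.
    placements = dict(zip(unique_servers, range(total)))
    if len(placements) < happy:
        raise UploadUnhappyError(
            "placed shares on %d servers, need at least %d" % (len(placements), happy)
        )
    return placements


print(simple_upload(["a", "b", "c", "d"], happy=3, total=10))
```

With four servers and shares.happy=3 the upload succeeds even though only 4 of the 10 shares were placed. With two servers it would raise `UploadUnhappyError`.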
However, sometimes you will wind up with multiple shares on a server (perhaps because an earlier upload put them there, or the repair algorithm found some pre-existing shares, or because you don't have very many servers). In this case, the simple metric of "how many servers have shares" is not sophisticated enough.
The rest of this document explains a more sophisticated metric called "servers of happiness".
The real upload algorithm is more complicated than that, because while it is uploading it may discover that some servers already hold shares (or even that some servers hold more than one share). However, it is still trying to accomplish essentially the same goal as the simplified model above.
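To illustrate why counting "servers that have shares" is not enough when a server holds several shares, here is a sketch of one way to compute a happiness count. It assumes the metric is the size of a maximum matching between servers and distinct shares, which is our understanding of how the Tahoe docs define it. The code is a plain augmenting-path matching written for this note, not the uploader's actual implementation, and `servers_of_happiness` is a hypothetical name.

```python
def servers_of_happiness(shares_by_server):
    """Size of a maximum matching between servers and share numbers.

    shares_by_server -- dict mapping server id -> set of share numbers it holds
    """
    server_for_share = {}  # share number -> server currently matched to it

    def try_assign(server, visited):
        # Try to match `server` to one of its shares, re-homing other
        # servers along an augmenting path if necessary.
        for share in shares_by_server[server]:
            if share in visited:
                continue
            visited.add(share)
            if share not in server_for_share or try_assign(
                server_for_share[share], visited
            ):
                server_for_share[share] = server
                return True
        return False

    return sum(1 for server in shares_by_server if try_assign(server, set()))


# Two servers hold shares, but they can only vouch for one distinct share.
print(servers_of_happiness({"A": {0}, "B": {0}}))
# A holds three shares and B holds one; both can be matched to different shares.
print(servers_of_happiness({"A": {0, 1, 2}, "B": {0}}))
```

In the first case two servers hold shares but the count is 1, because both hold the same share. In the second case the count is 2, even though four share copies exist, because only two servers are involved.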