Created
January 6, 2011 21:04
Accounting notes
* Ostrom Accounting | |
** apply Elinor Ostrom's eight "design principles" (of stable local | |
common pool resource [CPR] management) from | |
http://en.wikipedia.org/wiki/Elinor_Ostrom | |
- 1. Clearly defined boundaries (effective exclusion of external | |
unentitled parties); | |
- 2. Rules regarding the appropriation and provision of common | |
resources are adapted to local conditions; | |
- 3. Collective-choice arrangements allow most resource appropriators | |
to participate in the decision-making process; | |
- 4. Effective monitoring by monitors who are part of or accountable | |
to the appropriators; | |
- 5. There is a scale of graduated sanctions for resource | |
appropriators who violate community rules; | |
- 6. Mechanisms of conflict resolution are cheap and of easy access; | |
- 7. The self-determination of the community is recognized by | |
higher-level authorities; | |
- 8. In the case of larger common-pool resources: organization in the | |
form of multiple layers of nested enterprises, with small local | |
CPRs at the base level. | |
** starting point: know who (self-described) is using your storage | |
space. Know whose storage space you are using. Make this information | |
clearly visible to everyone involved (i.e. everyone knows that | |
everyone else knows, etc). | |
*** tahoe.cfg:storage gets some new flags: | |
- accounting=enabled | |
- this turns on the lease-owner DB. Existing shares are marked | |
'anonymous'. New shares that arrive through the old | |
RIStorageServer interface are labeled according to the TubID of | |
the other end of the connection. New shares that arrive through | |
the new RIAccountableStorageServer interface are labeled | |
according to the account under which that interface object was | |
created (see below). | |
- accounting=required | |
- this reads "storage-accounts.txt" for a list of accounts. Each | |
contains a pubkey, a petname, and maybe some additional | |
information (either local notes, or self-describing data sent by | |
the privkey holder) | |
- the RIStorageServer interface no longer accepts shares. Only | |
RIAccountableStorageServer accepts them. | |
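A minimal sketch of how the two flag values above could decide the owner label for an arriving share. The function name `owner_label`, its arguments, and the `"tubid:"` label prefix are hypothetical, made up for illustration; nothing here is an existing Tahoe API.

```python
# Label used for shares that existed before accounting was turned on.
ANONYMOUS = "anonymous"

def owner_label(mode, interface, tubid=None, account=None):
    """Return the owner label for a newly arriving share.

    mode:      value of the accounting= flag, "enabled" or "required"
    interface: "RIStorageServer" (old) or "RIAccountableStorageServer" (new)
    Raises PermissionError when the share must be refused.
    """
    if interface == "RIAccountableStorageServer":
        if account is None:
            raise ValueError("accountable interface needs an account")
        # Labeled by the account under which the interface object was created.
        return account
    if interface != "RIStorageServer":
        raise ValueError("unknown interface: %r" % (interface,))
    if mode == "required":
        # accounting=required: the old interface no longer accepts shares.
        raise PermissionError("shares must arrive through an account")
    # accounting=enabled: label by the TubID of the other end of the connection.
    return "tubid:" + tubid
```

With `accounting=enabled`, `owner_label("enabled", "RIStorageServer", tubid="abc")` gives a TubID-based label, while the same call under `"required"` is refused.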
*** tahoe.cfg:client gets some new flags | |
- actually it needs to be in private/ somewhere | |
- add a privkey. If present, clients will connect to | |
RIStorageServer, then attempt to upgrade to | |
RIAccountableStorageServer by sending a signed upgrade request | |
- clients do all their storage ops through the | |
RIAccountableStorageServer, which causes their shares to be | |
labeled | |
- RIAccountableStorageServer also includes get-my-total-usage | |
methods | |
*** the welcome page gets a new control panel | |
- not sure if it needs to be user-private or not | |
- storage-server panel: | |
- contains lists of accounts that are consuming your storage | |
- if accounting=required, add buttons to freeze/thaw the account, | |
cautious button to delete all shares | |
- client panel: | |
- contains lists of servers that are holding your shares | |
- combo "grid" panel: | |
- contains both, correlated | |
*** maybe broadcast channel of activity | |
- daily (maybe hourly at first) digest of aggregate usage
- "Bob uploaded 62MB of data". "Alice downloaded 146MB of data" | |
- "Bob is currently using 3.5GB of storage space" | |
- "Alice is currently hosting 4.2GB of shares and has 0.8GB free" | |
- also include new-server, new-client events | |
- "Carol joined the grid, offering 3.0GB of storage space" | |
- "Dave invited Edgar to join the grid" | |
- and server-admin actions | |
- "Carol froze Bob's shares: dude, you're using too much" | |
- "David deleted Alice's shares: you unfriended me on facebook so | |
I'm deleting all your data" | |
- also generalized chat | |
- "Bob says: anyone up for pizza tonight?" | |
*** storage server needs a new crawler | |
- or the existing LeaseCrawler needs some new features | |
- shares contain canonical lease info, but local | |
who-is-consuming-what and remote get-my-total-usage methods need | |
pre-generated totals | |
- once usage DB is complete, new shares are added at time of upload | |
- but we must be able to generate/regenerate usage DB from just the | |
shares (er, just shares plus table of ownerid->account data, since | |
share.lease.ownerid field is too small) | |
- should I punt and go to SQLite for this? hard, given that the | |
share files are canonical: you could have a crawler that updates | |
the SQLite DB, then get usage info by doing a SUM(), but both feel | |
expensive. | |
- usage doesn't need to be super accurate. | |
- crawler can keep a separate table for each prefixdir | |
- that is 1024 prefixdirs * numusers counters
- tell crawler when a lease is added or removed, it +/- the number | |
from that table | |
- when the crawler cycles around, the count can be made accurate | |
- bouncing the server will lose the counting work done on the | |
current bucket, so it will need to restart. | |
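The per-prefixdir counting idea can be sketched roughly like this. The class and method names (`PrefixUsage`, `lease_added`, `crawler_finished_prefix`, ...) are invented for the sketch, and it keeps everything in memory; persistence is left out.

```python
from collections import defaultdict

class PrefixUsage:
    """Approximate usage totals, one counter per (prefixdir, ownerid)."""

    def __init__(self):
        self.totals = defaultdict(int)  # (prefix, ownerid) -> bytes

    def lease_added(self, prefix, ownerid, size):
        # Called at upload time, so totals stay roughly right between cycles.
        self.totals[(prefix, ownerid)] += size

    def lease_removed(self, prefix, ownerid, size):
        self.totals[(prefix, ownerid)] -= size

    def crawler_finished_prefix(self, prefix, accurate):
        """Replace the running counts for one prefix with exact ones.

        accurate: dict of ownerid -> bytes, computed by walking the shares.
        """
        for key in [k for k in self.totals if k[0] == prefix]:
            del self.totals[key]
        for ownerid, size in accurate.items():
            self.totals[(prefix, ownerid)] = size

    def usage(self, ownerid):
        # Sum this owner's counters across every prefix.
        return sum(size for (_, owner), size in self.totals.items()
                   if owner == ownerid)
```

Between crawler cycles the totals drift only by whatever happened out-of-band; each finished prefix snaps its counters back to the on-disk truth.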
*** RIStorageServer gets new upgrade method | |
- accepts a signed request, returns RIAccountableStorageServer facet | |
- request needs to be scoped correctly: server1 should not be able | |
to get Alice's facet on server2. Request should include serverid. | |
- if #466 lands, we can add new keys to the "storage" service | |
announcement. Redefine "FURL" to mean "anonymous | |
RIStorageServer", and add maybe.. "login.furl" to be the login | |
desk (no, I don't like "login" either). | |
- .login(request) -> RIAccountableStorageServer or error | |
- "request" is [msg, sig, pubkey] | |
- "msg" is JSON-encoded dict of: {serverid=base32serverid, | |
clientid=base32keyid} | |
- servers only accept requests that contain their own serverid, | |
and for which clientid matches the pubkey | |
- We'll add other fields to this later, for certchains or | |
transitive introductions or whatever. | |
- for transitive introductions, request may also contain | |
recommendations / certchains / introduction path | |
- upgrade method may fail when server doesn't like the client | |
- might be a temporary failure: the upgrade request might get | |
elevated to the storage server admin for approval. Might want "try | |
again later (at time=T)" response code. | |
- storage requests to RIAccountableStorageServer might fail if | |
server-admin freezes or cancels the account. get-my-total-usage | |
should keep working in many cases. | |
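A rough sketch of the server-side scoping check on a login request. Signature verification and the pubkey-to-clientid mapping are passed in as callables (`verify_sig`, `keyid_of`) because these notes don't pin them down; both names, and `LoginRejected`, are hypothetical.

```python
import json

class LoginRejected(Exception):
    """Raised when the server refuses a login request."""

def check_login_request(request, my_serverid, verify_sig, keyid_of):
    """Validate a [msg, sig, pubkey] request and return the clientid.

    verify_sig(pubkey, sig, msg) -> bool   (assumed helper)
    keyid_of(pubkey) -> base32 keyid str   (assumed helper)
    """
    msg, sig, pubkey = request
    if not verify_sig(pubkey, sig, msg):
        raise LoginRejected("bad signature")
    fields = json.loads(msg.decode("utf-8"))
    # Scoping: server1 must not be able to replay this against server2.
    if fields.get("serverid") != my_serverid:
        raise LoginRejected("request is scoped to a different server")
    if fields.get("clientid") != keyid_of(pubkey):
        raise LoginRejected("clientid does not match pubkey")
    return fields["clientid"]
```

A "try again later" response could be modeled as another exception type carrying the retry time; that is left out here.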
** step two is to make this easier to configure | |
- Invitations | |
- transitive introductions | |
- account managers | |
- pay-for-storage | |
- tit-for-tat | |
** step three is to resolve the issues that blocked us in the past | |
- repairer: who pays for the new share? | |
- sub-accounts, delegation, allmydata partners | |
- public webapi node: extending accounting beyond node and through | |
webapi/WUI: when Bob uses a public WUI, how can his shares be | |
counted against his quota instead of the webapi operator's? | |
** details | |
*** RIStorageServer upgrade method | |
- it gets you an AccountDesk object (need better name) | |
- ideally we'd expose this object directly in the announcement, | |
rather than going through the legacy RIStorageServer, but having | |
an upgrade method allows new-client/new-server/old-introducer | |
*** AccountDesk (now called "Accountant") | |
- stern accountant type: cold, uncaring, strictly follows the rules | |
- you must beg for access to your safe deposit box | |
- maybe a magic share-moving wand, but tagged with a color that | |
you can't scrape off | |
**** "please give me access to my account", maybe .login() | |
- returns Account object | |
- requires proof of ownership of an account | |
- input is ECDSA-signed(rxFURL). We expect rxFURL to be on the | |
client's tub, rather than being a gift, but that's not necessary. | |
getReference(rxFURL)->granted(account_object) | |
- account is based on ECDSA signing key, so login requires a data | |
transfer, which requires the FURL back-reference | |
- actual method is: | |
msg = JSON.encode({"please-give-me-access-to-my-account-v1": rxFURL}).encode("utf-8") | |
account = login(msg, sig, pubkey) | |
- (not safe to make it a raw signature: make a distinct purpose) | |
- error is either returned to login() as exception, or to rxFURL | |
as rejected() method | |
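To illustrate the "distinct purpose" point: the thing that gets signed is a JSON dict whose only key names the purpose, so the signature can't be reused for anything else. A small sketch of building and checking that message; the helper names are made up, and the actual ECDSA signing step is not shown.

```python
import json

PURPOSE = "please-give-me-access-to-my-account-v1"

def make_login_msg(rx_furl):
    """Build the bytes that the client signs."""
    return json.dumps({PURPOSE: rx_furl}).encode("utf-8")

def parse_login_msg(msg):
    """Return rxFURL from a login message, or raise ValueError."""
    fields = json.loads(msg.decode("utf-8"))
    # Exactly one key, and it must be the purpose string.
    if not isinstance(fields, dict) or list(fields) != [PURPOSE]:
        raise ValueError("not a v1 login message")
    return fields[PURPOSE]
```

The server would verify `sig` over `msg` with `pubkey` first, then call something like `parse_login_msg(msg)` to recover the rxFURL.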
**** Account object | |
- includes RIStorageServer methods, but scoped to one account | |
- also includes additional methods | |
*** lease crawler | |
- we want efficient updates of a table mapping from ownerid to | |
(allshares, sizeof_allshares) | |
- but the canonical data for that table lies in the (flat) share | |
files. The shares can be changed externally. It must accommodate
startup (shares but no table) and spontaneous loss of the table. | |
- so table should be regenerated/refreshed periodically. we can | |
tolerate inaccuracy as long as the time is bounded. | |
**** sqlite lease tables with generation numbers | |
- CREATE TABLE leases (prefix, si, ownerid, size, generation) | |
- 'size' is denormalized but probably helpful | |
- maybe include more data about the lease: sharing factor, | |
expiration time | |
- CREATE TABLE usage (prefix, ownerid, totalsize) | |
- CREATE TABLE lease_generations (prefix, complete_gen, new_gen) | |
- when quiescent, new_gen=NULL | |
- queries work against (leases where generation = | |
lease_generations[prefix].complete_gen) | |
- new/updated leases are added/changed in both gen=.complete_gen | |
and gen=.new_gen . Deleted leases are removed from both. If | |
.new_gen=NULL, only use .complete_gen . Figure size delta against | |
.complete_gen and inc/dec usage[prefix,ownerid].totalsize | |
- when the crawler starts on prefix "aa": | |
- lease_generations[prefix].new_gen = .complete_gen+1 | |
- walk a chunk of shares, add lease data to .new_gen | |
- when prefix is done: | |
- update lease_generations[prefix].complete_gen = .new_gen | |
- lease_generations[prefix].new_gen = NULL | |
- DELETE leases[prefix, generation < complete_gen]
- build set of ownerids used in this prefix | |
- foreach ownerid, sum usage across all leases in prefix, | |
update usage[prefix,ownerid] | |
- when Account wants usage: SUM usage[prefix=*,ownerid] | |
- when Account wants list of all shares, SELECT si FROM leases
WHERE ownerid=OWNERID AND generation = (SELECT complete_gen FROM
lease_generations WHERE lease_generations.prefix=leases.prefix)
- or something like that. Expensive, yeah. Cheaper if we allow
deletions to lag and just use SELECT DISTINCT si FROM leases
WHERE ownerid=OWNERID : if they've deleted a share but the
crawler hasn't noticed yet..
- oh, yeah, just do that. If the crawler has caught up, any share | |
deletions will also be removed from leases[] (both | |
generations), so we'll be good. If a share has been deleted | |
out-of-band (i.e. admin does 'rm SHARE'), we'll be wrong until | |
the next cycle. | |
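Here is a sketch of the generation-number scheme using Python's sqlite3 module. It follows the three tables above, with two liberties that are assumptions of the sketch: `lease_generations.prefix` is made a PRIMARY KEY so the row can be upserted, and the crawler walks a whole prefix in one call instead of in chunks. Function names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE leases (prefix, si, ownerid, size, generation);
CREATE TABLE usage (prefix, ownerid, totalsize);
CREATE TABLE lease_generations (prefix PRIMARY KEY, complete_gen, new_gen);
"""

def crawl_prefix(db, prefix, shares_on_disk):
    """Rescan one prefix. shares_on_disk: iterable of (si, ownerid, size)."""
    row = db.execute("SELECT complete_gen FROM lease_generations"
                     " WHERE prefix=?", (prefix,)).fetchone()
    complete = row[0] if row else 0
    new_gen = complete + 1
    db.execute("INSERT OR REPLACE INTO lease_generations VALUES (?,?,?)",
               (prefix, complete, new_gen))
    for si, ownerid, size in shares_on_disk:
        db.execute("INSERT INTO leases VALUES (?,?,?,?,?)",
                   (prefix, si, ownerid, size, new_gen))
    # Prefix done: promote the new generation and drop the old rows.
    db.execute("UPDATE lease_generations SET complete_gen=?, new_gen=NULL"
               " WHERE prefix=?", (new_gen, prefix))
    db.execute("DELETE FROM leases WHERE prefix=? AND generation<?",
               (prefix, new_gen))
    # Rebuild the per-owner totals for this prefix from the lease rows.
    db.execute("DELETE FROM usage WHERE prefix=?", (prefix,))
    db.execute("INSERT INTO usage SELECT prefix, ownerid, SUM(size)"
               " FROM leases WHERE prefix=? GROUP BY prefix, ownerid",
               (prefix,))
    db.commit()

def total_usage(db, ownerid):
    """SUM usage[prefix=*, ownerid]; 0 if the owner has no rows."""
    return db.execute("SELECT COALESCE(SUM(totalsize), 0) FROM usage"
                      " WHERE ownerid=?", (ownerid,)).fetchone()[0]

def shares_of(db, ownerid):
    """The lagging-deletions query: every si this owner has a lease on."""
    rows = db.execute("SELECT DISTINCT si FROM leases WHERE ownerid=?",
                      (ownerid,))
    return sorted(r[0] for r in rows)
```

Usage would be something like `db = sqlite3.connect(":memory:"); db.executescript(SCHEMA)` followed by one `crawl_prefix()` call per prefixdir. Live lease adds and removes between crawls (touching both generations) are not shown.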
** trying to accommodate future modes
- bitcoin, other payment schemes, reciprocity | |
- when accounting is enabled but permissive (measure-not-prohibit), | |
accounts are created on-demand. | |
*** use RIStorageServer.get_version() to advertise accounting support | |
- and if accounting/v1 is present, advertise specific modes | |
- Ostrom-accounting is a required part of accounting/v1 | |
- bitcoin is an optional feature | |
- accounting/v1 means do RIStorageServer.get_accountant() and then | |
forget about the initial (anonymous) RIStorageServer rref. Then | |
use the RIAccountant to get the RIAccount. Do everything else with | |
the account | |
- you can always get an account on demand, but it may not be able | |
to do anything. The server is not obligated to allocate anything | |
or remember your account until you e.g. add a lease. | |
*** RIAccount has some basic methods, maybe more if features are enabled | |
**** get_messages() -> dict | |
- these are messages that should be displayed to the user | |
- ["message"] should always be displayed: human-readable welcome or | |
warning message | |
- unrecognized keys are displayed, recognized keys are *not* | |
- e.g. {"bitcoin": "We accept bitcoin! See URL for details"} | |
- server's opportunity to teach client's user about new features | |
- if client knows about bitcoin, message is unnecessary | |
- message needs to be short (fits in small UI space). message | |
needs to be safe (no arbitrary HTML). maybe let each message | |
contain a small summary and a larger explanation, or a summary | |
and a URL (and hard-render the URL as a "learn more" link). | |
- maybe both ["warning"] and ["message"], display warning in red | |
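The display rule for get_messages() could look roughly like this on the client side. `messages_to_display` and its `recognized` argument are hypothetical names; the "warning" key is only a maybe in these notes, and is treated here like "message".

```python
ALWAYS_SHOWN = ("warning", "message")

def messages_to_display(messages, recognized=()):
    """Pick which entries of a get_messages() dict the UI should show.

    recognized: keys for features this client already understands;
    their explanatory messages are redundant and get suppressed.
    Returns a list of (key, text) pairs in display order.
    """
    shown = []
    for key in ALWAYS_SHOWN:
        if key in messages:
            shown.append((key, messages[key]))
    for key in sorted(messages):
        if key in ALWAYS_SHOWN or key in recognized:
            continue
        # Unrecognized key: the server is teaching us about a new feature.
        shown.append((key, messages[key]))
    return shown
```

A bitcoin-aware client would pass `recognized={"bitcoin"}` and see only the welcome message; an older client would see both.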
**** get_status() -> dict {write:bool, read:bool, save:bool} | |
- both Ostrom-mode and BTC-mode share notion of account-status | |
- when all is well, clients can read and write as much as they | |
like | |
- if server admin gets annoyed, or they don't pay, account is | |
frozen: uploads are rejected but downloads are still allowed | |
- if they're really annoyed, downloads are rejected too, but | |
shares are retained | |
- ultimate punishment is to delete the shares | |
- (WRS) goes from TTT -> FTT -> FFT -> FFF | |
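The write/read/save progression is a small ladder of graduated sanctions. A sketch, with `Status`, `LADDER` and `escalate` as made-up names:

```python
from collections import namedtuple

Status = namedtuple("Status", ["write", "read", "save"])

# TTT -> FTT -> FFT -> FFF
LADDER = [
    Status(True, True, True),     # all is well
    Status(False, True, True),    # frozen: uploads rejected
    Status(False, False, True),   # downloads rejected, shares retained
    Status(False, False, False),  # shares deleted
]

def escalate(status):
    """Return the next-harsher status; the last rung stays where it is."""
    i = LADDER.index(status)
    return LADDER[min(i + 1, len(LADDER) - 1)]
```

`dict(status._asdict())` gives the `{write, read, save}` dict shape that get_status() is described as returning.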
**** get_usage() -> dict {stored:int} | |
- sum of sizes of all shares on which this account has a lease | |
- should decide who pays for overhead, how it's recorded.. maybe | |
have two numbers | |
- leave room for other forms of usage | |
- in particular bandwidth: bytes in/out over last month (need way | |
to express time units) | |
**** get_bitcoin_data -> dict (only if bitcoin_v1 is advertised) | |
- price (BTC per byte-second, AWS $0.10/GB-month is 1fBTCpBS) | |
- current pre-paid balance | |
- lifetime of current usage (balance/(usage*price)=time)
- insert-coin address | |
- actually, maybe provide get_bitcoin_address() for this. If | |
bitcoin had a sort of "PO Number" label in the transactions | |
(which can be done by jamming an unused string into the | |
scriptSig, which would be tolerated by all clients, but it's | |
not standardized so clients don't provide APIs in or out), then | |
the sender could put their clientid in it, and receivers would | |
watch for transactions. | |
- but they don't. Easiest approach I can think of is to tell | |
each client a different bitcoin address, used only for them. | |
- Having a get_bitcoin_address() would let servers create | |
them lazily, on demand, rather than as soon as client | |
connects. Hm, on second thought, this doesn't win much. | |
- the Accountant needs to know how much BTC has arrived. It | |
only needs this when checking the books, so maybe once a | |
week. It could ask the bitcoin client for how much BTC is in | |
the owning BC-account, but eventually that BTC will be | |
transferred elsewhere. So really it needs a transaction log. | |
"bitcoind listreceivedbyaccount" might do it, but there is no | |
txnid field to distinguish between subsequent payments. | |
- anyways, there should be a payment dance, initiated by a | |
pay_bitcoin() method, which returns some information that needs | |
to be passed to the bitcoin client. Ideally the bitcoin txn is | |
included in the foolscap message, so the receiver can validate | |
it right away. If not, the receiver makes a persistent note to | |
expect the txn, and starts polling the bitcoin client for the | |
money. | |
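For the "lifetime of current usage" number: usage times price is the burn rate in BTC per second, and the prepaid balance divided by that rate is how long the money lasts. A tiny sketch; the function and parameter names are hypothetical.

```python
def lifetime_seconds(balance, stored_bytes, price_per_byte_second):
    """How long the prepaid balance covers the current usage, in seconds.

    balance and price_per_byte_second must be in the same currency unit.
    """
    burn_rate = stored_bytes * price_per_byte_second  # currency per second
    if burn_rate == 0:
        return float("inf")  # nothing stored, balance never runs out
    return balance / burn_rate
```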
**** set_nickname | |
- provide Ostrom-mode data about ourselves to the server | |
- meant for a human to see and consider | |
- not trusted beyond the Ostrom sense | |
**** usual share methods: allocate_bucket, get_bucket, add_lease | |
** known-storage-servers UI page | |
- this is on the client, showing the servers it knows about | |
*** each storage server row has a field for the common properties | |
- server message | |
- write/read/save status | |
- current usage | |
*** specific accounting modes provide additional fields/columns | |
**** when bitcoin is present on both sides: | |
- show price, current balance, lifetime | |
- show "pay for storage" field, with suggested BTC amount and | |
"Spend!" button | |
- if sending BTC doesn't increase quota right away, the field needs | |
to provide a pacifier message | |
- show previous payment history | |
**** reciprocity: show space their client is using on us | |
- blurs line between them-as-servers and them-as-clients. The | |
read/write/save mode we enforce on them should probably be | |
displayed next to the space they're using, and then the | |
freeze/thaw/delete buttons should go there too. And the bandwidth | |
they've consumed. Tricky. | |
*** show clientid here, so you can copy-and-paste it to servers | |
- like when you send them an email saying "please let me store | |
shares on your server" | |
** reciprocity | |
*** storage servers advertise the clientid that they benefit | |
- Bob's storage server will advertise Bob's clientid | |
- the rent-a-friend paid server will advertise a clientid for | |
whoever hired them: when Bob pays for the server, Bob's client | |
gets the benefits. | |
- not sure how to share a server between multiple friends: maybe | |
the advertisement should say 30% Bob, 40% Alice, etc. | |
*** client-side StorageFarmBroker tracks client usage | |
- it remembers that it has stored 2GB on server A | |
*** Accountant asks StorageFarmBroker for reciprocity benefit | |
- when considering how to treat client A, it asks broker about | |
clientid A | |
- broker checks all the servers it knows it is using, finds ones | |
that benefit clientid A, adds our usage on them, reports total to | |
Accountant | |
- Accountant deducts reciprocity benefit from client's total when | |
deciding if they're overquota or not | |
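The reciprocity bookkeeping above, as a sketch. The two dicts stand in for what StorageFarmBroker would remember and what servers would advertise; the function names and the `quota` parameter are assumptions of this example.

```python
def reciprocity_benefit(clientid, our_usage_by_server, beneficiary_of):
    """Total bytes we store on servers that benefit the given clientid.

    our_usage_by_server: serverid -> bytes we have stored there
    beneficiary_of:      serverid -> clientid that server advertises
    """
    return sum(size for serverid, size in our_usage_by_server.items()
               if beneficiary_of.get(serverid) == clientid)

def is_over_quota(client_usage, quota, benefit):
    """Accountant's check: deduct the benefit before comparing to quota."""
    return client_usage - benefit > quota
```

The shared-server case (30% Bob, 40% Alice) would need `beneficiary_of` to map to weighted lists instead of a single clientid; that is not handled here.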
** DEV PLAN | |
*** DONE clientkey generation | |
*** DONE connection upgrade, signed rxFURL | |
*** DONE Account wrapper | |
*** DONE real Accountant.get_account(): need table of accountid->ownernum | |
- must be persistent | |
*** DONE send nickname to Account, other identifying data | |
*** lease crawler, db update, space totaler | |
*** DONE fake space-totaler numbers | |
*** DONE server-side space-consumed-per-client-account status display [5/5] | |
**** DONE nickname | |
**** DONE clientid | |
**** DONE current connection status (from address) | |
**** DONE last-heard-from time | |
**** DONE first created | |
*** DONE RIAccount get-space-i'm-using methods | |
*** DONE client-side retrieve-space-i'm-using, add to webui status display | |
*** DONE client-side show-status webui | |
*** DONE client-side show-server-message webui | |
*** client-side Account object, push message to it, instead of polling | |
when rendering status WUI page | |
*** server-side accounting controls: accounting=required, list of pubkeys | |
*** server-side status/control webui: "freeze" button | |
*** clean up client-side which-servers-i'm-using webui display | |
- add last-share-refused notice, and size of the request that was | |
rejected. This gives you an idea of which servers are full. Greg | |
Troxel's suggestion from the list, 04-Jan-2011. | |
*** then start playing with fake bitcoin controls | |
*** settle on format for client keys | |
- currently pub-v0-bo7uxkjfuu4zpqpfmzsknogorrvnyfxgs5nc776ddsbaoxswqt47owmnyof75jbzi6zr74cb4hoos | |
- might want to trim "pub-v0-" | |
- might want to add "client-pub-v0-" instead? | |
*** client-side usage tracker: hard | |
*** rearrange patches to move util.keyutil into #466 | |
*** decide about client.key and server.key (same? separate?) | |
**** change to use NIST256p, not 192p | |
** Tahoe leasecrawler | |
[2010-12-29 Wed 20:59] | |
- take prodnet share catalog, turn into sqlite db, check performance | |
- sum size per ownernum | |
- total size. Guessing 1M shares: 40MB plus index | |
- pretend 1k owners. Assign each share 1.11 owners (all get 1, 10% | |
get 2, 1% get 3) | |
- (SI, ownernum, size) | |
- index on SI, index on ownernum | |
- Crawler: at start of each prefix, remove any db rows for which | |
there are no shares | |
- for each on-disk bucket, add/remove rows to match disk | |
- do all space-used-per-ownernum calculations on demand, from db | |
- db is derived from shares. Manually deleting a share will cause | |
wrong space usage number until crawler comes around. Same for | |
adding a share. | |
- for friendnets (few ownernums), getting size of all accounts is quick | |
- for prodnet (lots of ownernums), this could be expensive. Consider | |
using a separate thread, or separate process (account-manager | |
process, interacts through db and fs) | |
- move lease-expiration duties out of the crawler over to the db. Do
the expiration check on each prefix just after it finishes. Maybe | |
add "expires-at" column to db, add a status display showing | |
histogram of share ages. | |
- do lease expiration manually via control panel. Panel shows | |
histogram of shares, ages, cutoff threshold as vertical line, | |
button to delete everything left of line. This would be a better | |
way to explain the current summary text on the lease-crawler page. | |
- db coherence: all client-triggered share ops touch both file and db | |
in the same turn. Crawler: at start of prefix, compute | |
set(db)-set(os.listdir) and remove those from db in the same turn.
Then over subsequent turns, scan each bucket and make db match. | |
That should maintain coherence. | |
- include flag to say whether all prefixes have been scanned at least | |
once. Status display should clearly say "incomplete" until this | |
flag is set. |
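A sketch of the experiment described above: a (SI, ownernum, size) table with both indexes, the start-of-prefix cleanup, and the on-demand per-owner sum. The table and function names are made up, and `on_disk` stands in for the result of listing the prefix directory.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE shares (si, ownernum, size)")
    db.execute("CREATE INDEX by_si ON shares (si)")
    db.execute("CREATE INDEX by_owner ON shares (ownernum)")
    return db

def start_prefix(db, prefix, on_disk):
    """Remove db rows under this prefix whose share is gone from disk.

    on_disk: storage indexes found by listing the prefix directory.
    """
    rows = db.execute("SELECT DISTINCT si FROM shares WHERE si LIKE ?",
                      (prefix + "%",))
    in_db = {r[0] for r in rows}
    for si in in_db - set(on_disk):  # in the db but no longer on disk
        db.execute("DELETE FROM shares WHERE si=?", (si,))
    db.commit()

def space_per_owner(db):
    """Compute space used per ownernum on demand, straight from the db."""
    return dict(db.execute(
        "SELECT ownernum, SUM(size) FROM shares GROUP BY ownernum"))
```

Timing `space_per_owner()` against a table loaded from the real share catalog would answer the "is this too expensive for prodnet" question.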