[2:20pm] felixge: ^--- would love if you could explain ideas on how to do it here
[2:20pm] felixge: but ideally it'd be an optional part of the protocol
[2:21pm] felixge: Baughn: <3 thanks!
[2:22pm] Baughn: felixge: The problem is, TCP uses a 16-bit checksum. The algorithm is resilient against single-bit errors - the checksum should always change - but for multiple-bit errors, there is a 2^-16 chance you get the /same/ checksum.
[2:22pm] felixge: yes, we need to figure it out
[2:22pm] Baughn: And you're aiming this protocol squarely for noisy networks.
[2:22pm] felixge: a few thoughts: clients may not always be able to generate a checksum, and doing this for large files can be problematic (especially in JS)
[2:22pm] Baughn: Right, I'd mark that part of the protocol with SHOULD.
[2:22pm] felixge: so a good proposal should allow checksums for each PUT, rather than just the entire file
[2:23pm] Baughn: "Entire file" is easy to implement, and covers most use cases. Per-put has some simplicity advantages, but I don't think it's very useful overall..
[2:23pm] felixge: another use case that was brough up before is de-duplication: A server should be able to say he already has a certain chunk if the checksum matches
[2:23pm] Baughn: There's also the checksum tree (aka. Bittorrent) option, which is /great/
[2:23pm] felixge: Baughn: fair enough, I'm looking forward to the comments
[2:23pm] felixge: not aware of checksum trees
[2:23pm] Baughn: Ah yes, this would also allow for de-duplication.
[2:23pm] felixge: so I'd <3 your help coming up with a good solution
[2:24pm] felixge: and I agree that it's probably important enough to be a SHOULD in the protocol
[2:24pm] Baughn: I'd recommend CRC32 or CRC64. Preferably the latter.
[2:24pm] Baughn: There's no point in specifying a cryptographic checksum.
[2:25pm] Baughn: Those two are fast enough to work nicely in JS, too.
[2:25pm] Baughn: (Or js.asm, heh)
[2:25pm] sandfox: what about content-md5 header?
[2:25pm] Baughn: Oh, huh. That exists.
[2:25pm] sandfox: clients can send that with an upload and then server can verify on completion
[2:25pm] sandfox: i think amazon s3 uses it
[2:25pm] Baughn: Content-MD5 sounds good
[2:26pm] Baughn: Checksum trees are a potential future enhancement, then..
[2:27pm] Baughn: felixge: Also, have a look at how Bittorrent works. Obviously we don't want to clone it wholesale, but its checksumming features are nice.
[2:27pm] sandfox: dammit you edit issues quickly Baughn:
[2:27pm] Baughn: It's that or finish this privacy training.
[2:28pm] sandfox: understandable…
[2:30pm] sandfox: Baughn: were you thinking some kind scenario where once an 'upload' is created, multiple (distributed) clients could begin uploading segments at the same time or at least in a fairly un-coordinated fashion?
[2:30pm] Baughn: No
[2:30pm] Baughn: I just meant to point at bittorrent's implementation of checksum trees
[2:30pm] Baughn: http://en.wikipedia.org/wiki/Hash_tree <- Though it's pretty simple, really.
[2:31pm] sandfox: Just me then… already have some mad science ideas brewing
[2:31pm] Baughn: Well, I'd rather extend bittorrent for that purpose than make anything new.
[2:32pm] sandfox: thats probably more sensible, but I have never let that be barrier
[2:33pm] • Baughn has seen some *hilarious* postmortems caused by random corruption, including on-wire corruption. TCP and all.
[2:33pm] Baughn: felixge: To be quite blunt, including checksumming makes this protocol far more likely to be accepted.
[2:39pm] sandfox: As far as I can tell content-MD5 is only correct in the context of a individual http request, so it would make sense for PUTs but we'd need some other header else for initial POST. thoughts?
[2:39pm] felixge: Baughn: sure, lets do it
[2:41pm] Baughn: Make up our own header. 'Resource-MD5' sounds about right.
[2:41pm] tim-kos:
[2:41pm] Baughn: Which is added by the client on POST, and the server on HEAD/GET
[2:42pm] Baughn: ..thus allowing for de-duplication both ways
[2:43pm] Baughn: Oh, right. Default implementations should set ETags to the same value.
[2:43pm] Baughn: ..but this isn't really a file /server/, so that's a separate issue.
[2:44pm] Baughn: ..tim-kos. Where did I see that name before...
[2:44pm] tim-kos: not sure, nodecopter?
[2:45pm] Baughn: Nope
[2:45pm] Baughn: Ever been to #bay12games, by any chance?
[2:45pm] sandfox: ETags always seem a little wishy-washy and vague but that just imho. Also - good reading about content-md5 :: http://trac.tools.ietf.org/wg/httpbis/trac/ticket/178
[2:46pm] Baughn: Right, Resource-MD5 it is.
[2:47pm] felixge: Baughn: Resource-MD5: Should we use an X- Header?
[2:47pm] felixge: or is that no longer recommended
[2:47pm] felixge: I remember reading sth about that
[2:48pm] Baughn: I'm not terribly familiar with the HTTP process, sorry..
[2:48pm] Baughn: I'd prefer to not use an X- prefix, just so if it gets to be an official RFC we don't have to change anything
[2:49pm] Baughn: ..and yeah, Resource-MD5 should probably be a separate RFC from tus.
[2:49pm] Baughn: ' On June 2011, the first IETF draft was posted to deprecate the use of the "X-" prefix for non-standard headers. The reason is that when non-standard headers prefixed with "X-" become standard, removing the "X-" prefix breaks backwards compatibility, forcing application protocols to support both names (E.g, x-gzip & gzip are now equivalent). So, the recommendation is to just name them sensibly without the "X-" prefix.'
[2:49pm] Baughn: IETF agrees.
[2:49pm] Baughn: http://tools.ietf.org/html/rfc6648
[2:50pm] felixge: Baughn: awesome
[2:50pm] felixge: I think we need to restructure the doc a little
[2:50pm] felixge: into the "Core" protocol
[2:50pm] felixge: and the various optional parts
[2:50pm] felixge: so that implementors can get a compatible implementation quickly
[2:50pm] Baughn: Have you done this before?
[2:50pm] felixge: and add the other features as needed
[2:50pm] felixge: Baughn: done what?
[2:50pm] Baughn: Written an rfc.
[2:50pm] felixge: no
[2:51pm] felixge: this isn't an RFC so
[2:51pm] Baughn: Eh, yes it is.
[2:51pm] felixge: but maybe we'll submit it one day
[2:51pm] Baughn: Or it should be.
[2:51pm] felixge: but for now that'd be a distraction
[2:51pm] felixge: for now we need to figure out what we want/need
[2:51pm] Baughn: It just means "request for comment", and you're doing /that/ fine.
[2:51pm] felixge: in order to bring resumable file uploading the web
[2:51pm] felixge: Baughn: there are actually a lot of editorial rules for writing RFCs
[2:51pm] felixge: I've researched it
[2:51pm] Baughn: Point is, there's a tendency for first-time RFC writers to try following the form, rather than the spirit, of existing ones. The only thing that /really/ matters is that the document is clear and easy to follow.
[2:52pm] Baughn: ..I am, of course, talking about golden-age RFCs. Not the mess we have todya.
[2:52pm] felixge: : )
[2:55pm] sandfox: Is there any mileage in supporting 'Range' header for HEAD requests so partial chunks can verified by a client before attempting a re-upload
[2:56pm] Baughn: Define 'verified'
[2:58pm] sandfox: good point, client sends HEAD request with byte range in "Range" header, server responds with content-MD5 for that range so client can see if it matches what it thinks should be there.
[2:58pm] Baughn: I'd call that a kluge. Hash trees are the correct solution..
[2:59pm] Baughn: Oh, it'd /work/, but it'd be a lot more expensive
[2:59pm] Baughn: Although..
[3:00pm] Baughn: If it's just for resuming then, yeah, it's reasonable. The question is what to do if the range doesn't match.
[3:00pm] Baughn: And with hash trees, you'd get all of this for free including how to fix damage.
[3:01pm] sandfox: How easy/difficult are hash tree's to do from a client side perspective
[3:01pm] Baughn: (Plus the server would need a hash tree anyway, to efficiently compute the range MD5. You do not want to actually compute an MD5 from scratch on every HEAD, that'd be hilariously DoS-vulnerable.)
[3:01pm] Baughn: Hash trees? Easy-peasy.
[3:03pm] Baughn: You can even do things like leave out the topmost levels (compute only the bottom N levels of the tree), which lets you stream data to the server without computing the hash in advance
[3:03pm] Baughn: Which would, in turn, allow for live video uploads and whatnot
[3:04pm] Baughn: Start with 64KiB chunks, compute 4 levels, and the client only needs to buffer 1MiB at a time
[3:04pm] sandfox: I suppose the only thing in my mind is that "Range" + HEAD supports arbitrary byte ranges (I admit it is overhear in lots of ways), what block size can/does hash trees sensibly do?
[3:04pm] Baughn: While the server can fill in the higher tree at will
[3:05pm] Baughn: Well, 64KiB chunks are pretty common - that gives you ~0.3% overhead.
[3:06pm] sandfox: Hmmm, I'm feeling that hash trees represent a much more solid approach overall and that at best Range + HEAD could be an optional thing, if not left out altogether for people to implement on top if they so choose
[3:06pm] Baughn: You'd use Range+HEAD to implement it, though..
[3:07pm] Baughn: I mean, okay. Scenario:
[3:07pm] sandfox: is it possible to restrict Range requests to block sizes?
[3:07pm] Baughn: It's not necessary. So. Scenario:
[3:07pm] Baughn: 1MiB file. 64KiB chunk size.
[3:08pm] Baughn: The client starts by transferring the entire hash tree to the server. As I said, it /could/ stream, but this is a simple scenario.
[3:08pm] Baughn: Then it transfers, say, 700 bytes before the connection fails.
[3:08pm] Baughn: ..er, 700000.
[3:09pm] Baughn: That's 10 full chunks, plus a bit.
[3:09pm] Baughn: The server can immediately verify those chunks, and see that they're correct. The "plus a bit", it can either discard or keep.
[3:10pm] Baughn: When the client reconnects, it gets a range request from the server, asking it to start at byte 700,001. (If we keep everything)
[3:10pm] Baughn: Assuming the connection doesn't break a second time, the server can then go ahead and compute the hash for chunk 11 through 16
[3:11pm] sandfox: ok, that make sense
[3:12pm] Baughn: The server *may* need to finish the conversation by returning a 4XX Transmission Corrupted, in which case the client would re-attempt its upload until it /isn't/ corrupted, sending different chunks each time
[3:12pm] sandfox: ah-ha, I was thinking in my examples of the client sending a range header. but this isn't needed because the client gets the necessary info on the HEAD request
[3:12pm] Baughn: Now, there's nothing in that that required a *tree*; it's all just 64KiB chunks
[3:13pm] Baughn: The 'tree' part of the specification is there to make it easy to change the chunk size, essentially
[3:13pm] sandfox: sounds legit
[3:14pm] Baughn: There are all kinds of clever things you /can/ do, like starting by just transferring the top-most hash level (of the entire file), and requesting more levels if corruption is noticed so you can find the exact spot, but those all make things more complex
[3:15pm] Baughn: Still, if you start with the 'hash tree' abstraction, all of the things you /might/ want to do suddenly have simple descriptions in terms of the hash tree structure
[3:15pm] sandfox: would there need to be negation between client and server over the tree details?
[3:15pm] Baughn: Not if the protocol is specified correctly
[3:16pm] Baughn: It's not hard to write a client and server that just work with every possible tree
[3:16pm] Baughn: ..though of course there should be limits, i.e. 1B chunk sizes probably represent a DoS attempt
[3:18pm] Baughn: My main worry is that hash trees are best serialized as heaps, and /not/ ascii
[3:18pm] Baughn: The text blowup factor for doing this the recommended HTTP way could be hilariously large
[3:19pm] sandfox: does HTTP have any views at all on that?
[3:19pm] Baughn: It's traditional to make everything human-readable. There are good reasons to do so.
[3:20pm] Baughn: But it also increases data usage, under already tight circumstances
[3:21pm] Baughn: That seems to be changing, though..
[3:21pm] Baughn: http://tools.ietf.org/id/draft-snell-httpbis-bohe-01.html
[3:23pm] Baughn: Most uploads will not be corrupted..
[3:23pm] Baughn: I think it'd make sense to (when not streaming) transfer just the root node, then negotiate if corruption happens
[3:23pm] Baughn: There isn't really any reason the server needs to know the location of corruption before the entire file is done.
[3:23pm] Baughn: ...
[3:24pm] Baughn: Well, except one - it'd reduce roundtrips, we could pipeline patch requests.
[3:24pm] Baughn: And, oh yeah.
[3:24pm] Baughn: itu should use the PATCH verb instead of PUT.
[3:25pm] sandfox: whats browser support like for PATCH?
[3:25pm] Baughn: Supported in everything newer than IE8
[3:25pm] Baughn: (And lots of things older, but eh)
[3:26pm] Baughn: It's a bit pedantic, given that things would work fine either way.
[3:27pm] sandfox: make it optional to support PUT as a fallback?
[3:27pm] Baughn: I was thinking the same.
[3:27pm] Baughn: I'm not 100% sure what the practical consequences of PUT vs. PATCH are
[3:28pm] Baughn: PATCH has the right /semantic/ meaning, but there might be things that assume more than we'd like. About either method, really.
[3:30pm] sandfox: do you want to put the summary of this in the github issue?
[4:00pm] felixge: Baughn: bah, got to finish some slides right now, but will read up on all the stuff you wrote later!
[4:00pm] felixge: thx for thinking about this!
Created
April 17, 2013 15:19
-
-
Save sandfox/5405175 to your computer and use it in GitHub Desktop.
tus.io IRC conversation about content checksums and verifying transmitted data across.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment