Created
November 22, 2011 00:08
10:11 <seancribbs> aphyr_: the internal dilemma - to encode or not to encode
10:11 <aphyr_> Seems like Riak should be content-agnostic for keys
10:12 <aphyr_> The encoding should be protocol-dependent and reversed once past the HTTP/protobufs interface
10:12 <aphyr_> I'm running out of special characters to use as separators, haha
10:13 justinsheehy joined
10:13 amerine joined
10:14 <aphyr_> Dashes appear in datetimes...
10:14 <seancribbs> aphyr: yeah I'm on the fence about that.
10:15 <seancribbs> in one sense, you want keys sent via pbc to be available on http
10:15 <seancribbs> on the other hand, http severely restricts what "valid" keys are
10:15 <aphyr_> Naw, keys sent over HTTP should be url-encoded
10:16 <aphyr_> Problem solved!
10:16 <aphyr_> Real key: "1,2". Protobufs: "1,2". HTTP: "1%3A2"
10:16 <seancribbs> O.o
10:16 russelldb joined
10:17 <aphyr_> No seriously, putting arbitrary strings into HTTP URIs is a solved problem.
10:17 <seancribbs> yes, the question is really about how much Riak should be concerned with that
10:17 <seancribbs> or whether to make client libs solve it
10:17 <seancribbs> there are convincing arguments both ways
10:18 <aphyr_> Riak presents an HTTP interface. It should un-url-encode strings in URI fragments and treat them as binary internally.
10:18 <aphyr_> Clients are responsible for encoding data for HTTP as necessary.
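
[Editor's sketch] The division of labor proposed above (clients percent-encode keys for the URL, the server decodes exactly once at the HTTP boundary and stores raw bytes) can be illustrated roughly as follows. This is a minimal Python sketch using only the standard library, not Riak or riak-client code; the helper names `key_to_path_segment` and `path_segment_to_key` are hypothetical. Note that Python encodes a comma as `%2C`; the `%3A` quoted in the log is the percent-encoding of a colon.

```python
from urllib.parse import quote, unquote


def key_to_path_segment(key: str) -> str:
    """Client side (hypothetical helper): percent-encode a raw key for a URL path.

    safe="" means even "/" is encoded, so the key stays a single path segment.
    """
    return quote(key, safe="")


def path_segment_to_key(segment: str) -> str:
    """Server side (hypothetical helper): decode once at the HTTP boundary."""
    return unquote(segment)


raw_key = "1,2"
on_the_wire = key_to_path_segment(raw_key)    # what appears in the URL
stored_key = path_segment_to_key(on_the_wire)  # what the store would keep

print(on_the_wire)  # 1%2C2
print(stored_key)   # 1,2
```

Under this scheme the stored key is the same string a protobufs client would send, so both interfaces address the same object.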
10:19 zerosanity joined
10:19 mattrepl joined
10:19 <aphyr_> I dunno, has any other HTTP API ever done something different?
10:22 <aphyr_> Hell, I presume riak is already doing this in other places in its HTTP API. Inline mapreduce, for example, is transmitted over the wire URI-encoded.
10:23 <aphyr_> Sorry, just freaking out because this, erm, behavior caused some major data corruption last night.
10:23 <strmpnk> aphyr_: agreed. not decoding creates some incompatible cases like making it easy to create keys that can't be accessed over HTTP.
10:23 <aphyr_> I'm happy to submit a patch if you guys will consider it.
10:28 <seancribbs> aphyr_: yes, we just need a clear story of how the problem occurred and why the fix is appropriate (and not too far-reaching)
10:29 <aphyr_> Sure. I used commas as separators for my keys because _ and - are used in some of the key components already.
10:30 Kenstigator joined
10:30 <aphyr_> Ripple was perfectly happy to write and read these keys as "1,2", but internally stored values like "1%3A2"
10:31 <aphyr_> My erlang and JS mapreduce jobs, meanwhile, were producing results like "1%3A2", which were then used as input to riak-client to fetch/store items.
10:31 <seancribbs> it might also be a matter of using URI instead of CGI
10:31 <seancribbs> that debate I'm also on the fence about
10:31 sfalcon joined
10:32 <aphyr_> As you can imagine, ripple happily re-uri-encoded "1%3A2" to "1%253A2"
10:32 <seancribbs> ah, right. you don't want to double-encode
10:32 <aphyr_> And shit proceeded to hit the fan
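
[Editor's sketch] The double-encoding failure described above is easy to reproduce: if the server stores the encoded form verbatim and a client later treats that stored string as a raw key, the `%` itself gets encoded to `%25`. The snippet below is a minimal Python illustration with the standard library only; `encode_for_http` is a hypothetical stand-in for whatever the client library does, not Ripple's actual code.

```python
from urllib.parse import quote


def encode_for_http(key: str) -> str:
    """Hypothetical client-side encoder: percent-encode every reserved character."""
    return quote(key, safe="")


# A key that came back from a mapreduce job in its stored (still encoded) form.
already_encoded = "1%3A2"

# The client has no way to know it is already encoded, so it encodes again.
double_encoded = encode_for_http(already_encoded)

print(double_encoded)  # 1%253A2
```

Each further round trip adds another layer (`%25` becomes `%2525`, and so on), which is why decoding at the HTTP boundary matters more than which side does the encoding.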
10:32 <aphyr_> There is no conceivable case I can envision where a user would want their input on the HTTP wire to be treated as literally encoded.
10:33 <aphyr_> Just adding url:decode around the key would a.) make every key accessible and b.) prevent confusion over key names.
10:34 <aphyr_> Also, I think this URI-encoding strategy might break links.
10:34 <aphyr_> There's also the fact that every single HTTP API I have ever encountered unencodes its input. :)
10:34 moonpolysoft joined
10:35 <aphyr_> This *would* cause backwards incompatibility for users who are currently using un-url-safe strings over the HTTP interface.
10:36 siculars joined
10:36 <aphyr_> But really, if you're using un-url-safe strings as keys over HTTP right now, you probably need to reconsider anyway. :)
10:37 <aphyr_> Probably best to make the switch sooner rather than later, to minimize disruption.