Created
July 2, 2010 13:51
kstt # hi, reading the riak mailing list is really interesting, and the "riak recap"
took me here. we definitely need a resilient key/value store, but we need it for
opaque file blobs of 100 MB or so. I must admit I didn't dig into the source code
of riak, but I'm surprised by the memory requirement during replication. Could you point
me to some design paper where I can understand the reason behind it, please?
justinsheehy # kstt: right now, use of "just" Riak for 100MB blobs won't work very well.
however, there will be a "large files" extension available soon.
justinsheehy # most systems similar to riak aren't optimized for larger single units of
data, but there are lots of ways of making up for that. we've got one coming. np
kstt # will this extension take care of splitting big blobs into small blobs and
joining them back?
seancribbs # kstt: no, but you will be able to select ranges of bytes
(sorry if I jumped in there too soon, justinsheehy)
justinsheehy # no worries, sean
the basic idea is a layer atop riak, using riak as a block store
appearing to the client like you can just stream up or down whole blobs
kstt # how "secret" is the planned release date for this component? :)
btw the paper on dynamo is a good read
seancribbs # kstt: not really secret, just not assigned a release date
kstt # After some more reading, I still don't understand why, technically,
big blobs cause trouble
seancribbs # kstt: it's primarily buffer sizes, and internal limitations of the Erlang VM
kstt # seancribbs: you mean a coordinator node can't stream the data due to some
limitation in the Erlang VM?
justinsheehy # it's not that you couldn't possibly do that
seancribbs # we use Erlang messages to send riak objects back and forth around
the cluster
* seancribbs # lets justinsheehy take this one
justinsheehy # heh, sure. it's rather that a number of elements of riak assume that you
can assemble the whole riak object as a complete term (for checksumming,
versioning, etc.), send that term around, and treat it as a unit
kstt # ok
justinsheehy # so, things get ugly if you try to do that with large values.
we could have done streaming, but then a number of the internal protocols would be
much more complicated, managing partial failure of individual updates and so on.
it turned out to work better to use riak objects as blocks in a higher
abstraction for large blobs. a hash tree of those works out quite nicely and is
how we'll be providing large file support. you won't be exposed to the internals,
but since you're asking.
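The idea justinsheehy describes (fixed-size blocks stored as ordinary objects, tied together by a hash tree) can be sketched roughly as follows. This is only an illustration and not the Riak large-file implementation: the plain `dict` standing in for the block store, the SHA-256 hashing, the odd-level duplication rule, and the names `split_blocks`, `merkle_root` and `store_blob` are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Tuple


def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a blob into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def merkle_root(leaf_hashes: List[bytes]) -> bytes:
    """Fold a list of leaf hashes into one root hash, pairing neighbours level by level."""
    if not leaf_hashes:
        return hashlib.sha256(b"").digest()
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: an odd level is padded by repeating its last hash.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def store_blob(store: Dict[bytes, bytes], data: bytes, block_size: int) -> Tuple[List[bytes], bytes]:
    """Store each block under its own hash; return the ordered block keys and the tree root."""
    keys = []
    for block in split_blocks(data, block_size):
        key = hashlib.sha256(block).digest()
        store[key] = block  # each block is small enough to be handled as one unit
        keys.append(key)
    return keys, merkle_root(keys)
```

Each block stays small enough to be checksummed and moved as a complete unit, while the root hash changes whenever any single block changes, so the blob as a whole can still be verified.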
kstt # I see, thank you for the explanation
justinsheehy # from a user point of view, it will be streaming uploads and downloads.
internally, it'll be blocks of data.
kstt # so, how big?
justinsheehy # configurable
kstt # you probably know better than the user how big it should be
justinsheehy # configurable block size. effectively "unlimited" object size.
that doesn't mean really unlimited, of course, but gigabytes should be no problem.
yes, there will be sensible defaults. we're experimenting with the best settings for that.
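To make the "streaming outside, blocks inside" picture concrete, here is a minimal sketch of serving a byte range from a blob that is held as fixed-size blocks, touching only the blocks that overlap the range. It is not Riak's API: `read_range` is a hypothetical name, the blocks are a plain in-memory list, and every block except the last is assumed to be exactly `block_size` bytes long.

```python
from typing import Iterator, List


def read_range(blocks: List[bytes], block_size: int, start: int, end: int) -> Iterator[bytes]:
    """Yield the bytes in [start, end) of a blob stored as fixed-size blocks.

    Assumes every block except possibly the last holds exactly block_size bytes.
    A range that runs past the end of the blob is silently truncated.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    if end == start:
        return
    first = start // block_size      # index of the block holding byte `start`
    last = (end - 1) // block_size   # index of the block holding byte `end - 1`
    for index in range(first, min(last, len(blocks) - 1) + 1):
        block = blocks[index]
        lo = start - index * block_size if index == first else 0
        hi = end - index * block_size if index == last else len(block)
        yield block[lo:hi]
```

Because the function is a generator, a caller can forward each piece as soon as it is produced, so memory use is bounded by the block size rather than by the size of the blob.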
kstt # I was asking about the size of the blocks themselves, just out of curiosity
regarding the internals of riak (I'm late, I typed slowly)
justinsheehy # I've played with it at a few different sizes. not sure yet what
the defaults will end up at.
kstt # Ok, so I'll just wait and see
I must say I'm really glad to see the momentum of riak and its team
kstt # We have been wanting such a smart data storage for a while at our company
seancribbs # kstt: that's great! contact us anytime, we're happy to help
kstt # and, hum, the others did not inspire enough confidence
I noticed, thank you very much for the detailed explanation