@PharkMillups
Created July 2, 2010 13:51
kstt # hi, reading the riak mailing list is really interesting, and the "riak recap"
took me here. we definitely need a resilient key/value store, but we need it for
opaque blobs of 100 MB or so. I must admit I didn't dig into the source code
of riak, but I'm surprised by the memory requirement during replication. Could you
point me to some design paper where I can understand the reason behind it, please?
justinsheehy # kstt: right now, use of "just" Riak for 100MB blobs won't work very well.
however, there will be a "large files" extension available soon.
justinsheehy # most systems similar to riak aren't optimized for larger single units of
data, but there are lots of ways of making up for that. we've got one coming. np
kstt # will this extension take care of splitting big-blobs in small-blobs and
joining them back ?
seancribbs # kstt: no, but you will be able to select ranges of bytes
(sorry if I jumped in there too soon, justinsheehy)
justinsheehy # no worries, sean
the basic idea is a layer atop riak, using riak as a block store
appearing to the client like you can just stream up or down whole blobs
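(The "layer atop riak as a block store" idea can be sketched roughly like this. This is not Riak's actual implementation; a plain dict stands in for the key/value store, and all names, the manifest scheme, and the block size are illustrative assumptions.)

```python
import io

BLOCK_SIZE = 1024 * 1024  # assumed block size for the sketch; the real one is configurable

# A plain dict stands in for the key/value store's API.
store = {}

def put_blob(name, stream, block_size=BLOCK_SIZE):
    """Split an incoming stream into fixed-size blocks, each stored as its own object."""
    count = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        store[f"{name}/block/{count}"] = block
        count += 1
    store[f"{name}/manifest"] = count  # records how many blocks to reassemble
    return count

def get_blob(name):
    """Yield the blocks back in order, so the client sees one continuous blob."""
    for i in range(store[f"{name}/manifest"]):
        yield store[f"{name}/block/{i}"]

data = b"x" * (3 * BLOCK_SIZE + 17)  # a blob spanning four blocks
put_blob("demo", io.BytesIO(data))
assert b"".join(get_blob("demo")) == data
```

To the client it looks like streaming a whole blob up or down; internally every block is an ordinary small object.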
kstt # how secret is the planned release date for this component? :)
btw the paper on Dynamo is a good read
seancribbs # kstt: not really secret, just not assigned a release date
kstt # After some more reading, I still don't understand why, technically,
big blobs cause trouble
seancribbs # kstt: it's primarily buffer sizes, and internal limitations of the Erlang VM
kstt # seancribbs: you mean a coordinator node can't stream the data due to some
limitation in the Erlang VM?
justinsheehy # it's not that you couldn't possibly do that
seancribbs # we use Erlang messages to send riak objects back and forth around
the cluster
* seancribbs # lets justinsheehy take this one
justinsheehy # heh, sure. but a number of elements of riak assume that you
can assemble the whole riak object as a complete term (for checksumming,
versioning, etc.), send that term around, and treat it as a unit
kstt # ok
justinsheehy # so, things get ugly if you try to do that with large values.
we could have done streaming, but then a number of the internal protocols would be
much more complicated, managing partial failure of individual updates and so on.
it turned out to work better to use riak objects as blocks in a higher
abstraction for large blobs. a hash tree of those works out quite nicely, and that's
how we'll be providing large file support. you won't be exposed to the internals,
but since you're asking...
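(The hash tree mentioned above can be sketched as follows: hash each block, then fold pairs of hashes upward to a single root, so replicas can compare roots and subtrees instead of whole blobs. This is a generic Merkle-tree sketch, not Riak's internal scheme; the odd-level duplication rule is an assumption.)

```python
import hashlib

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Hash each block, then combine pairs of hashes level by level up to one root."""
    level = [sha(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash when a level has an odd count
            level.append(level[-1])
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block0", b"block1", b"block2"]
tampered = [b"block0", b"blockX", b"block2"]
assert merkle_root(blocks) != merkle_root(tampered)  # one changed block changes the root
```

Comparing roots makes "are these two copies of a multi-gigabyte blob identical?" a constant-size check.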
kstt # I see, thank you for the explanation
justinsheehy # from a user point of view, it will be streaming uploads and downloads.
internally, it'll be blocks of data.
kstt # how big, then?
justinsheehy # configurable
kstt # you probably know better than the user how big it should be
justinsheehy # configurable block size. effectively "unlimited" object size.
that doesn't mean really unlimited, of course, but gigabytes should be no problem.
yes, there will be sensible defaults. we're experimenting with the best settings for that.
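(With a fixed block size, serving the byte ranges seancribbs mentioned earlier reduces to simple arithmetic over block indices. A sketch under that assumption; the function name and return shape are illustrative.)

```python
def blocks_for_range(offset, length, block_size):
    """Return (first_block, last_block, skip): which blocks to fetch for a byte
    range, and how many bytes to discard from the front of the first block."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    skip = offset % block_size
    return first, last, skip

# With 1 MiB blocks, bytes 1,500,000 .. 2,500,000 span blocks 1 and 2.
first, last, skip = blocks_for_range(1_500_000, 1_000_001, 1024 * 1024)
assert (first, last) == (1, 2)
```

Only the covering blocks are fetched, so a small range read from a huge blob stays cheap regardless of total blob size.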
kstt # I was asking about the size of the blocks themselves, just out of curiosity
regarding the internals of riak (I'm late, I typed slowly)
justinsheehy # I've played with it at a few different sizes. not sure yet what
the defaults will end up at.
kstt # Ok, so I'll just wait and see
I must say I'm really glad to see the momentum of riak and its team
kstt # We have been wanting such smart data storage for a while at our company
seancribbs # kstt: that's great! contact us anytime, we're happy to help
kstt # and, um, others didn't inspire enough confidence.
I noticed, thank you very much for the detailed explanation