@PharkMillups
Created February 9, 2011 17:14
17:15 <grourk> hi, we are considering various options for a block storage
system. basically, we want to store variable sized but largeish blobs
(100KB to 5MB). our application is such that blobs are written once and
infrequently read. in fact, we have about 10:1 writes to reads. the nearest
analog of what we're looking for is something similar to S3.. is riak geared
well for this kind of use case?
17:20 <pharkmillups> grourk: how are said blobs accessed?
17:22 <grourk> the blobs are chunks of video streams, each representing
about 1 minute.. the keys would essentially be time stamps into the stream.
not sure if that answers your question..
17:23 <grourk> so, when a given stream is accessed, the blobs would be accessed
sequentially. though, given that each blob is a minute of video, the
contiguity is probably not really important
17:23 <argv0> and you can precompute the keys?
17:23 <grourk> yes
17:24 <argv0> yeah, that sounds like it would work fine. for the larger
blobs you might want to benchmark straight K/V access against Luwak
(large file support on top of K/V)
17:25 <grourk> i've briefly looked at Luwak, and it seems that it's
essentially adding support for byte-range access.. i may be mistaken.
but for our purposes i think we can assume we'll read a whole block all or nothing
17:27 <argv0> actually if there's not a lot of write contention, KV should
be fine (didn't fully read your original question)
17:27 <grourk> ok.. so the fact that the blobs are relatively large won't be a problem?
17:28 <argv0> shouldn't be at 5MB or so - underneath there isn't any
streaming going on - so the value is loaded at the storage backend,
then sent to the request coordinator process, then sent to the client
17:29 <argv0> so at larger values this can put memory pressure on the
app since the whole value has to be materialized in RAM
17:31 <grourk> ok, that makes sense. thank you!
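
Editor's note: the two points agreed on above (keys that can be precomputed from timestamps, and whole values being materialized in RAM) can be sketched roughly as follows. This is only an illustration, not code from the conversation or from Riak: the "stream_id/epoch_seconds" key format, the function names, and the copies_per_request parameter are all assumptions made up for the example, and no Riak client calls are shown.

```python
from datetime import datetime, timezone

CHUNK_SECONDS = 60  # one blob per minute of video, as described in the chat


def chunk_key(stream_id: str, ts: datetime) -> str:
    """Key of the one-minute chunk containing ts.

    Hypothetical key format: "<stream_id>/<chunk start in epoch seconds>".
    Pass timezone-aware datetimes so the result does not depend on local time.
    """
    epoch = int(ts.timestamp())
    return f"{stream_id}/{epoch - epoch % CHUNK_SECONDS}"


def chunk_keys(stream_id: str, start: datetime, end: datetime) -> list[str]:
    """Keys of every chunk overlapping [start, end), in playback order."""
    first = int(start.timestamp())
    first -= first % CHUNK_SECONDS  # round down to the chunk boundary
    last = int(end.timestamp())
    return [f"{stream_id}/{t}" for t in range(first, last, CHUNK_SECONDS)]


def peak_value_bytes(value_bytes: int, concurrent_requests: int,
                     copies_per_request: int = 1) -> int:
    """Rough upper bound on RAM held by in-flight values.

    Assumes each request holds copies_per_request full copies of the value,
    since nothing is streamed. The real copy count is not stated in the chat.
    """
    return value_bytes * concurrent_requests * copies_per_request


if __name__ == "__main__":
    start = datetime(2011, 2, 9, 17, 15, 30, tzinfo=timezone.utc)
    end = datetime(2011, 2, 9, 17, 18, 0, tzinfo=timezone.utc)
    print(chunk_keys("cam1", start, end))
    print(peak_value_bytes(5 * 1024 * 1024, concurrent_requests=100))
```

Because the keys are a pure function of the stream id and time range, a reader can fetch each chunk directly by key without listing or querying. The memory estimate is a back-of-envelope bound only; benchmarking with real 5MB values, as argv0 suggests, is the way to find the actual footprint.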