johne # I have some questions about limits and was hoping someone could help
me out. If I understand correctly, riak does not impose a limit on the number
of buckets, but the chosen backstore may impose some practical limits. Is this correct?
arg # pretty much, yes
johne # There appears to be a similar situation with the number of keys.
arg # innostore currently creates a file per bucket/partition combo,
but all other backends use one file per partition.
unless you really want innostore, we recommend you use bitcask.
one other thing with buckets: buckets don't consume any resources as long as
they use the bucket defaults, either the stock riak defaults or ones you set
in your app.config.
buckets that change some of those defaults take up a small amount of space in
the ring data structure that's gossiped around.
johne # ok, good to know
arg # number of keys, not so much
johne # does bitcask use only a single active data file for all buckets?
arg # bitcask does keep a small amount of metadata for each key in RAM
it uses a single data file for each partition
many bucket/key pairs can reside in a single partition
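To illustrate how many bucket/key pairs can end up in one partition, here is a minimal Python sketch. It assumes a 160-bit SHA-1 hash ring cut into equal partitions; the `partition_for` helper, the `bucket/key` string encoding, and the default of 64 partitions are illustrative assumptions, not riak's actual code.

```python
import hashlib


def partition_for(bucket: str, key: str, num_partitions: int = 64) -> int:
    """Map a bucket/key pair to a partition index on a 160-bit hash ring.

    Assumption: the pair is encoded as "bucket/key" before hashing. The real
    encoding used by riak may differ; only the idea is shown here.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")  # 0 .. 2**160 - 1
    # Scale the ring position down to a partition index in 0 .. num_partitions - 1.
    return (position * num_partitions) >> 160


if __name__ == "__main__":
    pairs = [(f"bucket{b}", f"key{k}") for b in range(10) for k in range(100)]
    used = {partition_for(b, k) for b, k in pairs}
    print(len(pairs), "bucket/key pairs landed in", len(used), "partitions")
```

Because the bucket name is only part of the hash input, adding buckets does not add partitions or files in this model; the pairs are simply spread across the same fixed set of partitions.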
johne # It looks like bitcask limits the number of keys by the size of the
keydir, which must be memory resident? That would be key hash + key metadata?
arg # yeah it keeps in memory just key hash + file id + offset
we have some ideas on how to compress that further
johne # How big would that be? what is the size of each field there
arg # i think it's around 32 bytes per key: 20 bytes hash + 4 bytes fileid + 8 bytes offset
johne # ok, thanks that is very helpful
arg # could be 4 bytes offset, not sure
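The per-key figures quoted here can be turned into a quick estimate. This is a minimal Python sketch using the field sizes arg gives from memory (20-byte hash, 4-byte file id, 8-byte or possibly 4-byte offset). The function names and defaults are illustrative and not part of bitcask, and the fixed overhead of the hash container is ignored.

```python
def keydir_bytes_per_key(hash_bytes=20, file_id_bytes=4, offset_bytes=8):
    """Approximate size of one in-RAM keydir entry, per the figures in the chat."""
    return hash_bytes + file_id_bytes + offset_bytes


def keydir_total_bytes(num_keys, bytes_per_key=None):
    """Total keydir size for num_keys entries, ignoring container overhead."""
    if bytes_per_key is None:
        bytes_per_key = keydir_bytes_per_key()
    return num_keys * bytes_per_key


if __name__ == "__main__":
    print(keydir_bytes_per_key())                # 32
    print(keydir_bytes_per_key(offset_bytes=4))  # 28
    print(keydir_total_bytes(1_000_000))         # 32000000
```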
johne # Any additional overhead for keydir?
arg # some small fixed overhead for the hash data structure
but nothing to worry about if you're doing back-of-the-envelope calculations
all that stuff is in relatively optimized C code
johne # I am hoping to go beyond back-of-the-envelope....
arg # IOW the overhead of the container for that metadata won't push you over the edge if you do calculations based strictly on the per-key overhead
johne # I am looking at migrating an app that is currently using an RDBMS. Current limits because of the application are around 64 billion stored objects.
Currently investigating another solution because we are starting to get worried about that limit
roidrage # arg: may i propose putting this discussion in the next recap? very valuable.
arg # roidrage: absolutely
johne: if you spread that over several nodes you can probably fit the bitcask metadata in ram
or you could use innostore
but we're also actively working on a compressed in-memory keydir format that will be much smaller than 32 bytes / key
arg # using burst tries : http://portal.acm.org/citation.cfm?id=506312
benblack # http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499
johne # When you say spread that over several nodes, are you meaning split the keys?
benblack # for those without acm logins
arg # riak will do that for you
benblack # you should clarify what "that" is
arg # the more nodes you have the less per-node bitcask overhead there is
benblack: replica placement
benblack # i know
johne # got it... I misunderstood.
benblack # see :P
johne # So, if I understand correctly, riak could handle a larger number of
keys than what the backend supports. Is that correct?
arg # yeah, by spreading the keys over a number of hosts
johne # yes, very good. Any known limits there?
arg # as far as number of hosts?
johne # No, max keys. Is it just a factor of the number of nodes?
arg # for bitcask yes
johne # great
arg # if you use innostore there's no per-key overhead but it's a bit slower than bitcask
johne # Aside from the performance aspect, is the only other limit of
innostore the open file handles?
arg # it caches file handles in an LRU cache, and you can configure the max open handles
innostore can also take much longer to recover from a crash than bitcask
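As a rough picture of what an LRU cache of file handles with a configurable maximum looks like, here is a small Python sketch. The `HandleCache` class and its `max_open` parameter are hypothetical names for illustration only; this is not innostore's implementation or its configuration setting.

```python
from collections import OrderedDict


class HandleCache:
    """LRU cache of open handles with a configurable cap (illustrative only)."""

    def __init__(self, max_open, opener=open):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self.max_open = max_open
        self.opener = opener
        self._handles = OrderedDict()  # path -> handle, least recently used first

    def get(self, path):
        """Return an open handle for path, opening it and evicting if needed."""
        if path in self._handles:
            self._handles.move_to_end(path)  # mark as most recently used
            return self._handles[path]
        handle = self.opener(path)
        self._handles[path] = handle
        if len(self._handles) > self.max_open:
            # Close and drop the least recently used handle to stay under the cap.
            _, evicted = self._handles.popitem(last=False)
            evicted.close()
        return handle
```

With one file per bucket/partition combination, a cap like this keeps the number of open descriptors bounded, at the cost of reopening files that were evicted.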
johne # If I take the rough numbers you have provided me here, could I reliably predict the number of nodes needed for the desired number of keys?
arg # i think so, yes. also remember that each item is stored on 3 physical nodes (by default), so take that into account
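A back-of-the-envelope version of that prediction, as a Python sketch. It uses the rough 32 bytes per key and the default of 3 replicas mentioned in this conversation, and assumes keys are spread evenly across nodes. The `nodes_needed` function and the 48 GiB per-node RAM budget are hypothetical, chosen only to make the arithmetic concrete.

```python
def nodes_needed(num_objects, keydir_ram_per_node, bytes_per_key=32, replicas=3):
    """Nodes required so that all replicated keydir entries fit in RAM.

    Assumes an even key distribution and ignores fixed container overhead.
    """
    total_keydir_bytes = num_objects * replicas * bytes_per_key
    # Integer ceiling division, exact even for very large counts.
    return (total_keydir_bytes + keydir_ram_per_node - 1) // keydir_ram_per_node


if __name__ == "__main__":
    GIB = 2**30
    # 64 billion objects, with a hypothetical 48 GiB per node reserved for the keydir.
    print(nodes_needed(64_000_000_000, 48 * GIB))
```

At 64 billion objects this works out to roughly 6.1 TB of keydir across the whole cluster (64 billion x 3 replicas x 32 bytes), so the node count is driven almost entirely by how much RAM each node can dedicate to it.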
johne # yes, got that. And also that I should stick to bucket defaults, which I think works for me.
arg # it seems to me that if you wanted to deploy this right now, innostore is probably what you want if you have 64 billion objects
johne # Starting to reach those limits... What happens when a particular
node is "full"? Can additional nodes be added so that the re-balancing 'fixes' it, so to speak?
arg # yep, you can always add new nodes and they will take over a
fair share of the keyspace
johne # ok
arg # gotta run out for ~10 mins but i'll be back
johne # Thanks a lot for the help. I think I have enough to work with for the moment. I am trying to
ensure that what we are attempting to do is possible before attempting to structure the data.
arg # always feel free to email email@example.com or the riak-users list if you have more questions
johne # Thanks again
arg # any time!