johne # I have some questions about limits and was hoping someone could help
me out. If I understand correctly, riak does not impose a limit on the number
of buckets, but the chosen backend may impose some practical limits. Is this correct?
arg # pretty much, yes
johne # There appears to be a similar situation with the number of keys.
arg # innostore currently creates a file per bucket/partition combo,
but all other backends use one file per partition. unless you really want
innostore, we recommend you use bitcask. one other thing with buckets: buckets
don't consume any resources as long as they use the bucket defaults, either
the stock riak defaults or ones you set in your app.config. buckets that
change some of those defaults take up a small amount of space in the ring
data structure that's gossiped around
johne # ok, good to know
arg # number of keys, not so much
johne # does bitcask use only a single active data file for all buckets?
arg # bitcask does keep a small amount of metadata for each key in RAM
it uses a single data file for each partition
many bucket/key pairs can reside in a single partition
johne # It looks like bitcask limits the number of keys by the size of the
keydir, which must be memory resident? That would be key hash + key metadata?
arg # yeah it keeps in memory just key hash + file id + offset
we have some ideas on how to compress that further
johne # How big would that be? What is the size of each field there?
arg # i think its around 32 bytes per key: 20 bytes hash + 4 bytes fileid + 8 bytes offset
johne # ok, thanks that is very helpful
arg # could be 4 bytes offset, not sure
johne # Any additional overhead for keydir?
arg # some small fixed overhead for the hash data structure
but nothing to worry about if you're doing back-of-the-envelope calculations
all that stuff is in relatively optimized C code
johne # I am hoping to go beyond back-of-the-envelope....
arg # IOW the overhead of the container for that metadata wont push you over the edge if you do calculations based strictly on the per-key overhead
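The per-key figures above can be turned into a quick sizing calculation. A minimal sketch, assuming the 20-byte hash, 4-byte file id, and 8-byte offset mentioned in the conversation (the real bitcask internals may differ, and the fixed hash-table overhead is ignored as arg suggests):

```python
# Back-of-the-envelope bitcask keydir RAM estimate, using the assumed
# per-key fields from the discussion above.
HASH_BYTES = 20     # key hash
FILE_ID_BYTES = 4   # file id
OFFSET_BYTES = 8    # offset into the data file
PER_KEY_BYTES = HASH_BYTES + FILE_ID_BYTES + OFFSET_BYTES  # 32 bytes

def keydir_ram_bytes(num_keys: int) -> int:
    """RAM needed to hold keydir metadata for num_keys keys on one node."""
    return num_keys * PER_KEY_BYTES

# Example: 100 million keys on a single node
print(keydir_ram_bytes(100_000_000) / 2**30)  # ~2.98 GiB
```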
johne # I am looking at migrating an app that currently is using a RDBMS. Current limits because of the application are around 64 billion stored objects.
Currently investigating another solution because we are starting to get worried about that limit
roidrage # arg: may i propose putting this discussion in the next recap? very valuable.
arg # roidrage: absolutely
johne: if you spread that over several nodes you can probably fit the bitcask metadata in ram
or you could use innostore
but we're also actively working on a compressed in-memory keydir format that will be much smaller than 32bytes / key
arg # using burst tries :
benblack #
johne # When you say spread that over several nodes, are you meaning split the keys?
benblack # for those without acm logins
arg # riak will do that for you
benblack # you should clarify what "that" is
arg # the more nodes you have the less per-node bitcask overhead there is
benblack: replica placement
benblack # i know
johne # got it... I misunderstood.
benblack # see :P
johne # So, if I understand correctly, riak could handle a larger number of
keys than what the backend supports. Is that correct?
arg # yeah, by spreading the keys over a number of hosts
johne # yes, very good. Any known limits there?
arg # as far as number of hosts?
johne # No, max keys. Is it just a factor of the number of nodes?
arg # for bitcask yes
johne # great
arg # if you use innostore there's no per-key overhead but its a bit slower than bitcask
johne # Aside from the performance aspect, is the only other limit of
innostore the open file handles?
arg # it caches file handles in an LRU cache, and you can configure the max open handles
innostore can also take much longer to recover from a crash than bitcask
johne # If I take the rough numbers you have provided me here, could I reliably predict number of nodes needed for the desired number of keys?
arg # i think so, yes. also remember that each item is stored on 3 physical nodes (by default), so take that into account
johne # yes, got that. And also that I should stick to bucket defaults, which I think works for me.
arg # it seems to me that if you wanted to deploy this right now, innostore is probably what you want if you have 64 billion objects
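Putting the numbers from this exchange together gives a rough idea of why innostore looks attractive at this scale. A hypothetical sizing sketch, assuming the ~32 bytes/key keydir figure and the default replication factor of 3 mentioned above:

```python
import math

# Rough cluster sizing for the 64 billion objects discussed above.
# Assumptions from the conversation: ~32 bytes of keydir metadata per
# key, and each object stored on 3 nodes by default (n_val = 3).
PER_KEY_BYTES = 32
N_VAL = 3
TOTAL_OBJECTS = 64_000_000_000

# Total keydir RAM needed across the whole cluster, in bytes
total_keydir = TOTAL_OBJECTS * N_VAL * PER_KEY_BYTES
print(total_keydir / 2**40)  # ~5.6 TiB cluster-wide

def nodes_needed(ram_per_node_gib: float) -> int:
    """Minimum node count so each node's keydir share fits in RAM."""
    return math.ceil(total_keydir / (ram_per_node_gib * 2**30))

# e.g. if each node can dedicate 48 GiB of RAM to the keydir
print(nodes_needed(48))  # 120 nodes
```

This is why arg notes that innostore, with no per-key RAM overhead, may be the pragmatic choice today, while more nodes (or the planned compressed keydir) shrink bitcask's per-node footprint.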
johne # Starting to reach those limits... What happens when a particular
node is "full"? Can additional nodes be added, with the re-balancing 'fixing' it, so to speak?
arg # yep, you can always add new nodes and they will take over a
fair share of the keyspace
johne # ok
arg # gotta run out for ~10mins but ill be back
johne # Thanks a lot for the help. I think I have enough to work with for the moment. I am trying to
ensure what we are attempting to do would be possible before attempting to structure the data.
arg # always feel free to email or the riak-users list if you have more questions
johne # Thanks again
arg # any time!