14:28 <j2043> With Riak 0.12, is there an upper limit on the number of keys in a bucket?
14:29 <bingeldac> memory
14:30 <bingeldac> but that is total across all buckets
14:30 <benblack> the question suggests you may be doing it wrong ;)
14:30 <bingeldac> a bucket is more of a namespace and is "virtual"
14:30 <bingeldac> but yes, what benblack said could be the case too :)
14:30 <technoweenie> doing what wrong?
14:31 <bingeldac> it is only when you specify particular attributes outside the defaults (like N, R, W) that
the system handles that bucket differently
14:31 <benblack> can I stick an essentially infinite number of keys in a bucket? -> probably another problem.
14:31 <bingeldac> that being said there is no limit
14:31 <benblack> there is a significant performance penalty when you exceed available memory, as bingeldac said
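For context, a minimal sketch of what setting non-default bucket attributes looks like against the HTTP interface a 0.12-era Riak node exposes; the localhost:8098 address and the bucket name "users" are assumptions, not anything from the conversation:

```python
import json
import urllib.request

# Assumed: a local Riak node with its HTTP interface on the default
# localhost:8098. A bucket with only default properties is just a key
# namespace; setting props such as n_val is what makes Riak handle the
# bucket differently.
props = json.dumps({"props": {"n_val": 3}}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8098/riak/users",   # hypothetical bucket name
    data=props,
    headers={"Content-Type": "application/json"},
    method="PUT",
)
urllib.request.urlopen(req)

# Reading the bucket resource back returns its current properties.
with urllib.request.urlopen("http://localhost:8098/riak/users") as resp:
    print(json.load(resp)["props"])
```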
14:32 <j2043> Cool. I have about 100 million keys that would logically go in one bucket.
14:32 <benblack> you mean in your current (rdbms?) system they are logically in a single bucket?
14:32 <j2043> yeah.
14:32 <benblack> not generally the best way to use riak.
14:33 <benblack> and even with very small keys, 100M of them is a lot of memory.
14:34 <j2043> I can split them up into different buckets on something arbitrary, but they are all
of the same type.
14:34 <bingeldac> depends on how you access them really
14:34 <j2043> So if this was a census, they would all be Smiths
14:34 <bingeldac> say you wanted to do a lot of heavy date range queries
14:34 <benblack> what are the access patterns?
14:35 <bingeldac> then maybe you separate them into date buckets
14:35 <j2043> Pure key value at the moment
14:35 <j2043> When I want to do more complex queries I will break them up
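As an aside, a rough sketch of the "date buckets" idea bingeldac describes, assuming the same local node on localhost:8098; the bucket naming scheme, key, and payload are hypothetical:

```python
import json
import urllib.request
from datetime import date

def put_record(day: date, key: str, value: dict) -> None:
    # One bucket per day, e.g. "2010-09-15", so a date-range query only
    # has to touch the buckets in that range instead of one huge namespace.
    bucket = day.isoformat()
    url = f"http://localhost:8098/riak/{bucket}/{key}"
    req = urllib.request.Request(
        url,
        data=json.dumps(value).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

put_record(date(2010, 9, 15), "smith-42", {"surname": "Smith"})
```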
14:39 <j2043> benblack: when you say "there is a significant performance penalty when you exceed
available memory" do you mean my entire riak dataset should fit in ram?
14:40 <benblack> no, as we said: keys
14:40 <benblack> bitcask keeps an in-memory index of keys
14:41 <j2043> Ah, good
14:51 <j2043> Is there somewhere I can read up on the bitcask memory stuff? Right now I have a
table with ~170 million rows in it. If I were to make each of those rows an individual item in
riak that's a good chunk of memory.
14:51 <benblack> the blog posts on bitcask, the bitcask docs included with the code, the code itself
14:52 <j2043> Great, thanks
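A back-of-the-envelope sketch of the keydir memory question raised here; the per-entry overhead and average key length are assumptions and should be checked against the bitcask docs and blog posts benblack points to:

```python
# Bitcask keeps every key (plus a small fixed keydir entry) in RAM on the
# node that owns it, so total key count times per-key cost bounds memory.
NUM_KEYS = 170_000_000        # roughly the table size mentioned above
AVG_KEY_BYTES = 36            # assumed average key length
OVERHEAD_PER_ENTRY = 40       # assumed fixed keydir overhead per key

total_bytes = NUM_KEYS * (AVG_KEY_BYTES + OVERHEAD_PER_ENTRY)
print(f"~{total_bytes / 2**30:.1f} GiB of keydir memory for one copy of "
      f"every key, before replication, spread across the cluster's nodes")
```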