PharkMillups/gist:581141

## gistfile1.txt
14:28 <j2043> With Riak 0.12, is there an upper limit on the number of keys in a bucket?

14:29 <bingeldac> memory

14:30 <bingeldac> but that is total across all buckets

14:30 <benblack> the question suggests you may be doing it wrong ;)

14:30 <bingeldac> bucket is more of a name space and is "virtual"

14:30 <bingeldac> but yes, what benblack said could be the case too :)

14:30 <technoweenie> doing what wrong?

14:31 <bingeldac> it is only when you specify particular attributes outside the default (like N, R, W) that
the system handles that bucket differently

14:31 <benblack> can i stick an essentially infinite number of keys in a bucket? -> probably another problem.

14:31 <bingeldac> that being said there is no limit

14:31 <benblack> there is a significant performance penalty when you exceed available memory, as bingeldac said

14:32 <j2043> Cool. I have about 100 million keys that would logically go in one bucket.

14:32 <benblack> you mean in your current (rdbms?) system they are logically in a single bucket?

14:32 <j2043> yeah.

14:32 <benblack> not generally the best way to use riak.

14:33 <benblack> and even with very small keys, 100M of them is a lot of memory.

14:34 <j2043> I can split them up into different buckets on something arbitrary, but they are all
of the same type.

14:34 <bingeldac> depends on how you access them really

14:34 <j2043> So if this was a census, they would all be Smiths

14:34 <bingeldac> say you wanted to do a lot of heavy date range queries

14:34 <benblack> what are the access patterns?

14:35 <bingeldac> then maybe you separate them into date buckets

14:35 <j2043> Pure key value at the moment

14:35 <j2043> When I want to do more complex queries I will break them up
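
The date-bucket idea bingeldac describes might look like the following minimal sketch. The naming scheme, the `events` record type, and both helper functions are hypothetical and are not part of Riak or its client libraries. The sketch only shows how a record's date could select its bucket, and how a date range maps to the list of buckets to read.

```python
from datetime import date


def bucket_for(record_type: str, day: date) -> str:
    """Return a per-month bucket name (hypothetical scheme), e.g. 'events_201009'."""
    return f"{record_type}_{day:%Y%m}"


def buckets_for_range(record_type: str, start: date, end: date) -> list:
    """List every monthly bucket that a date range touches, in order."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(bucket_for(record_type, date(year, month, 1)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

A range query would then read only the buckets returned by `buckets_for_range` rather than one very large bucket. Whether monthly granularity is right depends on the access patterns benblack asks about.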

14:39 <j2043> benblack: when you say "there is a significant performance penalty when you exceed
available memory" do you mean my entire riak dataset should fit in ram?

14:40 <benblack> no, as we said: keys

14:40 <benblack> bitcask keeps an in-memory index of keys

14:41 <j2043> Ah, good

14:51 <j2043> Is there somewhere I can read up on the bitcask memory stuff? Right now I have a
table with about 170 million rows in it. If I were to make each of those rows an individual item in
riak, that's a good chunk of memory.

14:51 <benblack> the blog posts on bitcask, the bitcask docs included with the code, the code itself

14:52 <j2043> Great, thanks
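
For a rough feel of the key memory discussed above, here is a back-of-the-envelope sketch. The 40-byte per-entry overhead, the 20-byte average key size, and the replica count of 3 are assumptions chosen for illustration, not figures taken from the Bitcask documentation. Substitute real numbers from the Bitcask docs and your own data before relying on the result.

```python
def estimate_key_index_bytes(num_keys, avg_key_bytes,
                             per_entry_overhead_bytes=40, replicas=3):
    """Estimate cluster-wide memory for an in-memory key index.

    per_entry_overhead_bytes and replicas are ASSUMED values; each key is
    counted once per replica, and the total is spread across all nodes.
    """
    return num_keys * (avg_key_bytes + per_entry_overhead_bytes) * replicas


def to_gib(num_bytes):
    """Convert bytes to GiB for easier reading."""
    return num_bytes / (1024 ** 3)


# Hypothetical inputs based on the 170 million rows mentioned in the log.
total = estimate_key_index_bytes(170_000_000, avg_key_bytes=20)
print(f"approx. {to_gib(total):.1f} GiB across the cluster")
```

Dividing the total by the number of nodes gives a per-node figure to compare against available RAM. Only the keys count here, not the stored values.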