duiod # Are there any published benchmarks at all of riak, in general (ideally with small objects)?
benblack # there are, i believe; the basho folks would be the right ones to ask when they are around
duiod # Cool, I'll stick around.
benblack # http://twitter.com/dizzyco/status/13014285189
benblack # http://pl.atyp.us/wordpress/?p=2868
duiod # Yes, I read those already, but I'm curious how much overhead riak itself adds on top of that.
arg # hi. someone was looking for perf numbers?
duiod # arg: that'd be me.
arg # we're releasing our benchmarking tools publicly soon
duiod # re: the perf numbers. I'd do some tests myself, but there's no C/C++ client, so I'd have to write something that used the protocol buffers stuff. So, trying to avoid that by just asking for numbers here.
arg # duiod: c protobufs api coming soon. what size objects and what read/write ratio?
duiod # each object I'm writing is ~1.5KB. Reads are uniformly random. I've modeled my data with Cassandra's data model atm (i.e., supercolumns), but essentially it's something like 40% write, 50% re-write, 10% read.
and by re-write, I mean add something to an existing key.
arg # ok what kinda boxes?
i'd suggest asking on the riak-users list so you reach the whole dev team and they can probably do a better benchmark for you
duiod # 5-10 7200 2TB disks per box, 3 boxes to start with. dual quad 5520s, 24G RAM.
arg # well i can do about 1500 1.5k writes/sec against a single node on my mac pro
thru the (slow) python protobufs client
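A minimal sketch of the kind of timing loop behind a writes/sec figure like this. An in-memory dict stands in for a Riak bucket (the client object, key names, and payload here are illustrative assumptions, not Basho's benchmark tool); to benchmark a real node you would swap in the Python protobufs client mentioned above.

```python
import json
import time

# Stand-in for a Riak bucket: a plain dict. A real run would replace
# put() with a client call over protobufs to a running node.
store = {}

def put(key, value):
    # JSON-encode to approximate per-object serialization work;
    # a real client would serialize and ship the object over the wire.
    store[key] = json.dumps(value)

payload = {"data": "x" * 1500}  # ~1.5KB object, matching the discussion

n = 10_000
start = time.perf_counter()
for i in range(n):
    put(f"key-{i}", payload)
elapsed = time.perf_counter() - start

print(f"{n / elapsed:.0f} writes/sec (in-memory stand-in)")
```

The in-memory number will of course be far above 1500/sec; the point is the loop shape, not the result. fsync-per-write and the extra get-after-put discussed below are exactly the knobs that pull a real number down.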
duiod # according to some rough napkin math, it works out to around ~200-300GB of data/day, with all the indexes and such.
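The napkin math checks out: sustaining the quoted ~1500 writes/sec of ~1.5KB objects (both figures taken from the thread, not measured here) lands just under 200 GB/day of raw object data, so indexes and replication overhead plausibly push it into the 200-300 GB range.

```python
# Rough napkin math behind the "~200-300GB of data/day" figure,
# using the rates quoted earlier in the conversation.
writes_per_sec = 1500        # from arg's single-node number
object_kb = 1.5              # ~1.5KB per object
seconds_per_day = 86_400

gb_per_day = writes_per_sec * object_kb * seconds_per_day / 1_000_000
print(f"~{gb_per_day:.0f} GB/day of raw objects, before indexes and replication")
```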
arg # reads are less expensive in riak than cassandra
duiod # which storage engine? bitcask or inno?
arg # bitcask
arg # i'd suggest using bitcask for new stuff; there's a few rough edges still but it's going to be the default soon
duiod # 1500 seems quite slow, I assume that's limited by the client, not the server process?
arg # well that's with fsync() after each write
duiod # ah
arg # and thru a slow python protobufs library on a macbook. it's also doing an extra get() on each write by default: it gets the object back that it just put, so you get any updates that may have happened concurrently with your write
let me turn that off and see what happens; obviously these are not real benchmarks
duiod # Right, I'm just looking for some rough numbers to compare against cassandra right now, primarily because of vnodes with riak, which I think will cause me far less pain in the long run.
arg # what do you plan to do with the data? i mean query access: is it mostly by key? or do you need range-requests or secondary indices
duiod # Yes, all by key. Each key has all the keys of other data that might be associated with it, etc. And data is written out to multiple keys. A single read probably triggers 20-50 other random reads, but it's all point queries by key in other CFs.
arg # yeah you can simulate that kinda stuff in riak by using the same key in different buckets
to store the associations
or use links
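The layout arg describes can be sketched conceptually like this, with nested dicts standing in for Riak buckets (the real client speaks protobufs or HTTP; the bucket and key names here are illustrative assumptions). The main object and its association data share one key across buckets, and "links" reduce to lists of (bucket, key) pairs followed with ordinary point gets.

```python
# Dict-of-dicts stand-in for Riak buckets: same key, different buckets.
buckets = {
    "profiles": {},   # the main object
    "follows": {},    # association data stored under the SAME key
    "links": {},      # explicit (bucket, key) lists playing the role of links
}

user = "user:42"
buckets["profiles"][user] = {"name": "alice"}
buckets["profiles"]["user:7"] = {"name": "bob"}
buckets["follows"][user] = {"following": ["user:7", "user:19"]}

# A "link" is just a pointer to another bucket/key pair; following it
# is another point query by key, matching the 20-50 fan-out reads above.
buckets["links"][user] = [("profiles", "user:7")]

related = [buckets[b].get(k) for b, k in buckets["links"][user]]
print(related)  # [{'name': 'bob'}]
```

Note each association lives under its own key, so every hop is a separate fetch; nothing guarantees the linked objects are stored contiguously, which is the performance concern raised next.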
duiod # yea, I'm still getting familiar with the terms. Is there anything like a supercolumn (i.e., data stored contiguously + insert new cells without replacing the entire structure) within a key, with riak? or are links the way to go for that? (links sound like they'd be less performant, as links aren't stored contiguously)
arg # no "upsert" functionality; you have to write a new doc. but you can break a doc up into linked docs, or docs with the same key in different buckets, to handle stuff that needs to be updated separately
duiod # i see, that's a big downside for me :(
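Without upsert, the "re-write" case from duiod's workload (appending to an existing key) becomes a read-modify-write of the whole value. A sketch with a dict standing in for a Riak bucket; a real client would also carry vector clocks so concurrent writes surface as siblings to resolve, which this stand-in omits.

```python
import json

bucket = {}  # stand-in for a Riak bucket

def append_to_key(key, item):
    raw = bucket.get(key)                                         # 1. fetch
    doc = json.loads(raw) if raw is not None else {"items": []}
    doc["items"].append(item)                                     # 2. modify
    bucket[key] = json.dumps(doc)                                 # 3. write the whole doc back

append_to_key("events:1", "login")
append_to_key("events:1", "logout")
print(json.loads(bucket["events:1"]))  # {'items': ['login', 'logout']}
```

At 50% re-writes, each append pays a full get plus a full put of the object, which is the cost duiod is reacting to here versus a supercolumn-style partial insert.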