Mark Phillips PharkMillups
BBHoss # are mapreduce queries intended to be live or batch? When I get upwards of 10,000 things in a bucket, it takes too long to run the query, like 10 seconds
drev1 # BBHoss: bucket based queries are not recommended in production. the queries work best against a known list of bucket/key pairs
BBHoss # drev1: is that a new feature?
drev1 # which?
BBHoss # querying over bucket-key pairs
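(A minimal sketch of what drev1 describes: a MapReduce job whose inputs are an explicit list of bucket/key pairs instead of a whole bucket. The bucket name "things" and the keys are hypothetical, and the job is only built and serialized here; it would typically be POSTed to a node's /mapred HTTP endpoint.)

```python
import json

def build_mapred_job(bucket, keys):
    """Build a MapReduce job whose inputs are explicit [bucket, key] pairs."""
    return {
        # Listing pairs avoids scanning the whole bucket; passing just the
        # bucket name as "inputs" would make Riak walk every key in it.
        "inputs": [[bucket, key] for key in keys],
        "query": [
            # Built-in JavaScript map function that parses each value as JSON.
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}}
        ],
    }

# Hypothetical bucket and keys, for illustration only.
job = build_mapred_job("things", ["k1", "k2", "k3"])
payload = json.dumps(job)
print(payload)
```

(The application has to know the keys up front, for example by tracking them in an index object of its own.)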
florian # hi guys
seancribbs # howdy
florian # i'm an engineer at Twitter and have started to look into Riak a little bit
benblack # florian: do you work with rk?
florian # what's the largest cluster that you guys know of that's currently running?
misaka # Hiyas. Have a Q about riak... How durable is it in a single-node setup? If I use Innostore, can I assume it'll be as durable as, say, MySQL? Everything I read talks about distributed durability, which makes sense, but my app is going to start off with only one node and I really don't want to lose any data.
seancribbs # misaka: we err on the side of writing to disk. So unless you use a memory-based backend, your data is written reliably. what you don't get in a single-node setup is availability/fault-tolerance, since there's only one node
misaka # Hey Sean. Thanks for that. That's what I was hoping for.
freshtonic # anyone know where I can find an example of Riak post-commit hook?
seancribbs # freshtonic: as soon as we have one, it'll be on the wiki. what do you want your hook to do?
freshtonic # seancribbs: when I put something in bucket A, I want to create another record in bucket B within the commit hook.
seancribbs # freshtonic: any more specificity than that?
freshtonic # bucket B is a log of change events. Whereas bucket A contains the 'current' document, bucket B contains the history of changes.
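(A rough sketch of the behaviour freshtonic is asking for, in Python against an in-memory stand-in. This is not Riak's hook API: real Riak post-commit hooks are Erlang functions, and TinyStore, log_change, and the log key format are all hypothetical names used only to illustrate the idea.)

```python
import time

class TinyStore:
    """In-memory stand-in for a key/value store with post-commit hooks (hypothetical)."""

    def __init__(self):
        self.buckets = {}     # bucket name -> {key: value}
        self.postcommit = []  # callables run after every successful put

    def put(self, bucket, key, value):
        self.buckets.setdefault(bucket, {})[key] = value
        for hook in self.postcommit:
            hook(self, bucket, key, value)

def log_change(store, bucket, key, value):
    """After a write to bucket A, append a change event to bucket B."""
    if bucket != "A":
        return
    # Write straight into the dict so the hook does not trigger itself.
    log = store.buckets.setdefault("B", {})
    log_key = f"{key}-{len(log):06d}"
    log[log_key] = {"key": key, "value": value, "at": time.time()}

store = TinyStore()
store.postcommit.append(log_change)
store.put("A", "doc1", {"title": "v1"})
store.put("A", "doc1", {"title": "v2"})
```

(Bucket A ends up holding only the current document, while bucket B holds one entry per change.)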
ouvasam # Hello. Using Ripple, is there a way to delete an embedded document?
seancribbs # ouvasam: one or many?
ouvasam # many
seancribbs # the association proxies an array, so you should be able to do doc.embedded_docs.delete(embedded_doc)
ouvasam # thanks, but using an HTML form, how can I retrieve the embedded doc?
crucially # can I do conflict resolution in post commit hooks?
seancribbs # crucially: i don't think so
crucially # pre commit hooks? i guess I could from erlang
crucially # if I set returnbody and set the content type to multipart and there is a conflict, am I supposed to get all siblings?
justinsheehy # you can do some conflict resolution, but you can't easily guarantee
crucially # in bitcask if you do a HEAD request does it need to find the object on disk and read it in?
seancribbs # crucially: with any of the backends, yes. there's no separation of metadata from object
crucially # ok
seancribbs # so the only savings you get is over the wire to your application
crucially # so storing riakfs metadata in another bucket and store that bucket on SSD
lypanov # Is there any good reason to use Cassandra rather than riak? How does the raw speed compare? I'm having problems finding benchmarks :)
seancribbs # lypanov: it's hard to compare them side-by-side, at least in terms of performance. their access patterns are very different
lypanov # hey seancribbs. just listened to your awesome changelog podcast. thusly my being here :)
seancribbs # cool I didn't say much in that one, at least compared to andy
lypanov # how can one emulate a range query with riak?
another table with links? is that efficient with mapred?
wondering how to do a new uncached mapred on a historic data set efficiently
benblack # going to be a tough one
doc stores are good at a lot of things, but time series is not really in the sweet spot
lypanov # seems to be the ideal use case for cassandra afaict :(
benblack # column stores are better at it (and not as good at a lot of other things)
howboutjoe # I've got a bucket with allow_mult=true. When I get back a riak_object with multiple siblings, I've got a function that can merge that data together the way that I want it. Is getting Riak to understand that I've taken care of it as simple as riak_object:update_value and riak_client:put?
seancribbs # howboutjoe: yes, with the proper vector clock
howboutjoe # Does the riak_object I get back have one vector clock, or one for each sibling?
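(A minimal sketch of the merge-and-write-back flow discussed above, in Python rather than Erlang. It assumes the fetched object carries a single vector clock shared by all of its siblings; FetchedObject, resolve, and the placeholder vclock string are hypothetical and stand in for riak_object and the real client calls.)

```python
from dataclasses import dataclass

@dataclass
class FetchedObject:
    """Hypothetical stand-in for a fetched object: one vclock, many sibling values."""
    vclock: str
    siblings: list

def resolve(obj, merge):
    """Fold all siblings into one value and keep the vclock from the read."""
    merged = obj.siblings[0]
    for sibling in obj.siblings[1:]:
        merged = merge(merged, sibling)
    # Writing back with the same vclock tells the store that this single
    # value supersedes every sibling seen in the read.
    return FetchedObject(vclock=obj.vclock, siblings=[merged])

# Two conflicting set-valued siblings; the vclock is an opaque placeholder.
fetched = FetchedObject(vclock="opaque-vclock", siblings=[{"x", "y"}, {"y", "z"}])
resolved = resolve(fetched, lambda a, b: a | b)
print(resolved)
```

(Set union is used as the merge function only because it is easy to check; the application supplies whatever merge suits its data.)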