@PharkMillups
Created January 31, 2011 19:44
08:11 <jbrisbin> hey folks...got a question about keys and ordering within buckets
08:12 <jbrisbin> If I add entries to Riak and subsequently fetch the list of
keys later, both to do a count and to find out what's in it, the keys are in a
different order than what they were added in
08:12 <seancribbs> jbrisbin: yes
08:13 <seancribbs> there's no natural ordering
08:13 <jbrisbin> Is there a way to affect this at all? Or is it Just How It Is?
08:13 <seancribbs> it has to do with consistent hashing, so kinda "how it is"
08:14 <jbrisbin> So if I needed to preserve some kind of order of
keys, I'd need to maintain a separate metadata map that has them in the right order?
08:14 <jbrisbin> Or would it be better to attach a sequence ID metadata
element? X-Riak-Meta-SeqId
08:15 <jbrisbin> Then M/R the results and let Javascript's array sorting
handle putting them in order?
08:15 <seancribbs> yes, you would have to use M/R to sort them
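The sort step discussed here can be sketched in plain Python standing in for a JavaScript reduce function. This is only an illustration: the record shape and the `seq_id` field (mirroring the proposed X-Riak-Meta-SeqId header) are assumptions, and no Riak client is involved.

```python
# Hypothetical records: each entry pairs a Riak key with the value of an
# assumed X-Riak-Meta-SeqId metadata header (stored as a string).
def reduce_sort_by_seq(entries):
    """Return entries ordered by their integer sequence ID."""
    return sorted(entries, key=lambda e: int(e["seq_id"]))

entries = [
    {"key": "c9f1", "seq_id": "3"},
    {"key": "07ab", "seq_id": "1"},
    {"key": "5d20", "seq_id": "2"},
]

ordered = reduce_sort_by_seq(entries)
print([e["key"] for e in ordered])
```

Sorting is safe to apply repeatedly to partial results, which matters if a reduce phase is run more than once over intermediate output.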
08:17 <jbrisbin> I suppose that would involve loading all entries into the
Javascript context via reduce function? Thinking about possible memory usage
problems here with lots of entries...
08:17 <seancribbs> well, are you just trying to get the keys or the objects in-order?
08:17 <jbrisbin> Good question...
08:18 <jbrisbin> I suppose I need the keys in the order they were inserted,
but I need the body to send to the application
08:18 <jbrisbin> Or I need a way to get the oldest entry in a bucket, more specifically
08:18 <jbrisbin> Which might change how the problem is approached
08:20 <jbrisbin> Like popping the oldest entry off the stack...only
the stack is a Riak bucket
08:20 <seancribbs> well then you might be best putting some kind of
indicator in the key, then start with a reduce phase
08:21 <jbrisbin> Could key filters help there?
08:21 <seancribbs> potentially, if you know some way to limit the key list
08:22 <seancribbs> but if your key starts with an ISO8601 timestamp, it'll be easy to sort that
08:22 <seancribbs> then just take the first/minimum one
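A minimal sketch of why a timestamp prefix makes this easy, assuming every key starts with a fixed-width UTC ISO 8601 timestamp: plain string comparison then matches chronological order, so the oldest entry is simply the minimum key. The key layout and the `-msg` suffix are made up for illustration.

```python
def oldest_key(keys):
    """Return the lexicographically smallest key, or None if there are none."""
    return min(keys, default=None)

# Hypothetical keys: fixed-width UTC timestamp with microseconds, then a suffix.
keys = [
    "2011-01-31T08:22:05.000417Z-msg",
    "2011-01-31T08:21:59.120034Z-msg",
    "2011-01-31T08:22:05.000398Z-msg",
]

print(oldest_key(keys))
```

This only holds while all timestamps share the same width and time zone; mixing offsets or dropping zero padding would break the string ordering.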
08:24 <jbrisbin> Will I run into problems if I'm doing this
constantly? e.g. popping each item off the bucket for every request
08:25 <jbrisbin> Maybe I need some kind of hook that updates a key
list for me only when things change?
08:25 <jbrisbin> Or I recalculate what the oldest entry is and store just that id
08:26 <seancribbs> are the objects being removed after you "pop" them off?
08:26 <jbrisbin> When they are "ack"d, yes
08:27 <seancribbs> sounds like you should be using a queue for this.
08:27 <jbrisbin> well, that's what it is :)
08:27 <jbrisbin> the backing store for a queue
08:27 <seancribbs> no, i mean something meant to be a queue
08:27 <seancribbs> oic
08:27 <seancribbs> this is your rabbit stuff
08:27 <jbrisbin> yes
08:27 <seancribbs> ha
08:29 <jbrisbin> I'm trying to figure out the best way to preserve
ordering so that when the next message from the queue is requested,
I give back the oldest one I have (without relying on an internal list
of some kind, which might be lost in a restart)
08:30 <jbrisbin> But it sounds like using timestamps as part of my
key may be a good way to do natural ordering
08:30 <jbrisbin> And maybe the fastest way
08:30 <seancribbs> you'll need microseconds for sure
08:31 <seancribbs> but… you might get slightly out-of-order depending on
how it's put together
08:31 <jbrisbin> was planning on using erlang's now()
08:31 <seancribbs> yeah
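As a rough sketch of the key scheme being discussed, here is one way to build a microsecond-resolution, sortable key. It uses Python's clock rather than Erlang's now(), and the `make_key` helper and its random suffix are assumptions added for illustration. Keys built on different machines can still come out slightly out of order if the clocks disagree.

```python
import uuid
from datetime import datetime, timezone

def make_key(now=None):
    """Build a sortable key: UTC timestamp with microseconds plus a random suffix.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S.%fZ")  # %f is six-digit microseconds
    # The suffix only keeps keys unique if two writes share a microsecond;
    # it says nothing about which of them happened first.
    return f"{stamp}-{uuid.uuid4().hex[:8]}"

print(make_key())
```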
09:00 <marksteele> jbrisbin: you could use a key-filter and add a
sequence number to the key
09:02 <jbrisbin> marksteele: you mean Riak would do this on a save?
09:08 <marksteele> jbrisbin: not sure if it can do that on save,
but if each key has a time-based value as part of the key,
using a mapred job you can use key-filters and order on that (i think)
09:11 <jbrisbin> it would be great if there were a max and min key filter ;)
09:12 <jbrisbin> where are they implemented within src/? I might take a poke at it
09:14 <seancribbs> jbrisbin: they are in riak_kv/src/riak_kv_mapred_filter...
09:14 <seancribbs> not sure if it ends in "s"