@PharkMillups
Created September 15, 2010 17:19
12:28 <grantr> anyone working on a riak storage driver for rabbitmq?
12:28 <grantr> seems like a natural fit
12:29 <seancribbs> grantr: unless rabbit expects atomic operations
12:29 <grantr> oh hmm, it might. it does have transactional semantics
12:31 <grantr> seancribbs, the rabbitmq guys suggested kevin smith might have done some work in that vein
12:31 <seancribbs> yes, he has
12:31 <seancribbs> we have been improving map-reduce for some customers who want to crunch large datasets
12:31 <seancribbs> I remember someone mentioning > 100K keys, so I know it's possible
12:32 <seancribbs> oh wait
12:32 <* seancribbs> fail
12:32 <grantr> lol
12:32 <seancribbs> replying to you about an email on the mailing list
12:32 <grantr> different thread?
12:32 <seancribbs> no, kevin was looking at using Riak to distribute Rabbit, not store its messages
12:32 <grantr> ohhhh
12:32 <seancribbs> but that's LONG ago
12:32 <seancribbs> i.e. > 3 months
12:33 <grantr> rabbit is already clustered
12:33 <grantr> but i guess you could rewrite it with riak_core
12:33 <seancribbs> rabbit's clustering is…. interesting
12:33 <grantr> hmm
12:34 <grantr> i have no experience with it, but it seems like it SHOULD work
12:34 <seancribbs> for some values of SHOULD
12:34 <grantr> the only big hole that i can see is that queues are not replicated
12:34 <seancribbs> that's what this would do IIRC
12:34 <grantr> ah
12:36 <grantr> i know distributed queues are hard and all, but it seems like they would be a common problem. someone should have solved it by now (and open sourced the solution)
12:36 <seancribbs> you could do it well with chain replication
12:36 <seancribbs> like hibari's
12:37 <grantr> not familiar with hibari, looking it up now
12:37 <seancribbs> very new - in production though at a .jp telco
12:37 <grantr> chain replication, just from the name, sounds similar to the way hdfs does replication
12:39 <seancribbs> chain repl is strongly consistent
12:39 <seancribbs> because you only read from the end of the chain
12:39 <grantr> oh ok
12:39 <grantr> and if the end of the chain goes away, theres a new end
12:39 <grantr> interesting
12:40 <seancribbs> yes, and you are guaranteed to get only values that all replicas have
12:40 <grantr> but you have to write to the front of the chain
12:40 <seancribbs> yes
12:40 <grantr> sounds like a scalability issue
12:40 <seancribbs> so, there is some lag-time
12:40 <seancribbs> mmm, no, because you chash items to specific chains
12:41 <seancribbs> it's just a different model
12:41 <grantr> ah, so theres more than one chain
12:41 <seancribbs> yes
12:41 <seancribbs> go to CUFP in Baltimore, Scott Fritchie will be talking about it
12:41 <grantr> thats cool
12:41 <seancribbs> ;)
12:41 <grantr> bit of a trek from seattle :)
12:41 <seancribbs> ah
12:41 <benblack> grantr: you can have strong consistency or you can have lots of scalability...both at once is harder.
12:42 <seancribbs> benblack: actually the challenge with chain repl is recovering from failure
12:42 <benblack> that is one of the challenges, yes
12:42 <seancribbs> replaying writes, etc
12:43 <grantr> i guess with chains each of the nodes is special?
12:43 <seancribbs> there were at least 5 different scenarios Scott detailed to us
12:43 <seancribbs> and I don't think that's all of them
12:43 <benblack> it's a cool architecture, just has constraints, as with anything else
12:43 <seancribbs> true dat
12:44 <seancribbs> grantr: each "brick" (physical node) has a different portion of the chains
12:44 <seancribbs> sometimes the head, sometimes the middle, sometimes the tail
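(A toy sketch of the chain-replication scheme being described, in Python: writes enter at the head of a chain and propagate to the tail; reads are served only by the tail, so a reader never sees a value the replicas don't all have; keys are hashed to one of several chains, and each brick plays a different role per chain. The names and layout are illustrative, not Hibari's actual code.)

    import hashlib

    class Brick:
        """One storage node; it may be head, middle, or tail of any given chain."""
        def __init__(self, name):
            self.name = name
            self.store = {}

    class Chain:
        def __init__(self, bricks):
            self.bricks = bricks  # ordered: bricks[0] is the head, bricks[-1] is the tail

        def write(self, key, value):
            # A write is only acknowledged once it has reached the tail.
            for brick in self.bricks:
                brick.store[key] = value

        def read(self, key):
            # Reads go to the tail only, which is what gives strong consistency.
            return self.bricks[-1].store.get(key)

    def chain_for(key, chains):
        """Consistent-hash a key to one of the chains (simplified to modulo here)."""
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return chains[h % len(chains)]

    # Three bricks, each holding a different portion (head/middle/tail) of each chain.
    a, b, c = Brick("a"), Brick("b"), Brick("c")
    chains = [Chain([a, b, c]), Chain([b, c, a]), Chain([c, a, b])]

    chain_for("some-key", chains).write("some-key", "value")
    print(chain_for("some-key", chains).read("some-key"))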
12:44 <benblack> grantr: like they say in the incredibles, every node being special is another way of saying no node is special
12:44 <seancribbs> Every node is sacred, every node is good.
12:44 <seancribbs> oh wait that's monty python
12:44 <benblack> is there a monty python joke in here?
12:44 <benblack> heh
12:45 <grantr> has anyone actually used hibari besides the jp telco?
12:45 <seancribbs> it's very new
12:45 <benblack> not to my knowledge
12:45 <seancribbs> http://www.slideshare.net/geminimobile/hibari
12:45 <benblack> grantr: the author of hibari is now at basho
12:45 <grantr> oh cool
12:45 <seancribbs> (one of the authors)
12:45 <benblack> sorry, one of :)
12:45 <benblack> the only one i know personally!
12:46 <seancribbs> true
12:46 <seancribbs> likewise
12:46 <benblack> btw, did you take care of that neotoma chicken and rebar problem?
12:46 <grantr> so many distributed stores available today
12:46 <grantr> which to choose
12:47 <seancribbs> benblack: no, i'm slogging through some other issues
12:47 <seancribbs> send me a PR with the neutered rebar
12:47 <seancribbs> or heck, I'll give you commit rights if you want
12:47 <benblack> whichever you prefer
12:48 <seancribbs> you're "b" on gh, right?
12:49 <benblack> yes
12:49 <seancribbs> boom. done
12:50 <benblack> well ok then
13:05 <grantr> does anyone here have experience using clustered rabbitmq? was it workable?
13:06 <benblack> yes, and no.
13:08 <grantr> damn
13:08 <grantr> are there any message buses out there that scale?
13:10 <seancribbs> /dev/null
13:10 <seancribbs> ^^
13:10 <jdmaturen> ones that don't require absolute ordering, i.e. write to random node, read from random node, no cross communication => scale horizontally
13:14 <grantr> i'll have to think whether absolute ordering is a requirement
13:15 <jdmaturen> scale = idempotent and commutative operations
13:15 <jdmaturen> [as much as humanly possible]
13:16 <grantr> idempotent meaning you can do the same thing twice with the same result, commutative meaning you can do the same thing in a different order with the same result?
13:16 <seancribbs> jdmaturen: that's a gross overgeneralization
13:16 <jdmaturen> seancribbs: sorry, as regards message queues
13:17 <jdmaturen> #riak is certainly not the place to make the assertion i did
13:17 <* seancribbs> excuses himself because he's in a bad mood
13:17 <jdmaturen> grantr: yea
13:17 <jdmaturen> thus you can handle dupe messages + messages out of order
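(A generic sketch of what that looks like in a consumer: operations keyed by a message id so duplicates are no-ops (idempotent), and a merge that gives the same result in any order (commutative). Not tied to any particular queue.)

    seen_ids = set()
    state = {}  # item -> highest version observed so far

    def apply_message(msg):
        if msg["id"] in seen_ids:          # idempotent: duplicates are dropped
            return
        seen_ids.add(msg["id"])
        item, version = msg["item"], msg["version"]
        # commutative: taking the max yields the same state in any delivery order
        state[item] = max(state.get(item, 0), version)

    for m in [{"id": 1, "item": "x", "version": 2},
              {"id": 2, "item": "x", "version": 5},
              {"id": 1, "item": "x", "version": 2}]:   # duplicate, ignored
        apply_message(m)
    print(state)   # {'x': 5} regardless of order or duplicates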
13:18 <grantr> theres a system called beetle that xing uses, they push to multiple queues at once and use redis to dedup
13:18 <grantr> also, kestrel is much like you describe
13:19 <jdmaturen> yes, i have an affinity for it
13:19 <grantr> kestrel?
13:19 <jdmaturen> yea
13:19 <grantr> any experience with it?
13:19 <jdmaturen> yes
13:19 <jdmaturen> not much experience with AMQP type stuff though
13:19 <jdmaturen> e.g. rabbit
13:20 <jdmaturen> so I may not be able to compare/contrast well
13:20 <johnae> there's also nanite
13:20 <seancribbs> nanite uses rabbit
13:21 <seancribbs> (not necessarily clustered)
13:23 <grantr> so kestrel has each queue only on one machine?
13:23 <grantr> or are the messages in each queue hashed to different machines
13:24 <jdmaturen> in my setup each node in a "cluster" has every queue
13:25 <jdmaturen> no hashing, just pick a random node to write to -- perhaps weighted by network distance
13:26 <grantr> is there any mechanism to ensure messages dont wait around on a machine forever?
13:26 <jdmaturen> readers use /open and /close for reliable delivery, read from a random node for X amount of time / reads, then randomly connect to a new node
13:26 <jdmaturen> grantr: measuring age / # of items on each queue on each node
13:26 <grantr> ok so you get N messages from each node, so over time all messages will be consumed
13:26 <grantr> ok, so clients need a bit of smartness to them
13:26 <jdmaturen> yes, and I have many many more workers than queues / nodes
13:27 <grantr> that seems like it could work reasonably well
13:27 <jdmaturen> works for me®
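(A rough sketch of the client-side strategy jdmaturen describes: writers pick a random node per message, readers camp on a random node for a bounded number of reads and then rotate, so nothing sits on one box forever. The QueueClient-style calls here are a hypothetical stand-in, not Kestrel's actual memcache-style /open and /close commands.)

    import random

    NODES = ["kestrel1:22133", "kestrel2:22133", "kestrel3:22133"]  # illustrative
    READS_PER_NODE = 100

    def write(client_for, queue, payload):
        node = random.choice(NODES)             # no hashing, just a random node
        client_for(node).put(queue, payload)

    def consume(client_for, queue, handle):
        while True:
            node = random.choice(NODES)         # rotate to a new random node...
            client = client_for(node)
            for _ in range(READS_PER_NODE):     # ...after a bounded number of reads
                msg = client.get(queue, timeout=0.5)
                if msg is not None:
                    handle(msg)
                    client.ack(msg)             # ack only after handling, for reliable delivery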
13:27 <grantr> i wonder if you needed strong ordering, you could add that on top
13:27 <grantr> maybe use redis as a dedup store
13:28 <grantr> well more like an order cache
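(A minimal sketch of that redis-dedup idea, as Beetle does it: publish the same message to several queues for redundancy, and have consumers claim the message id in redis before processing so only the first copy is handled. Assumes a local redis and the redis-py client; key names are made up.)

    import redis

    r = redis.Redis()

    def handle_once(msg_id, payload, process):
        # SET with nx=True only succeeds for the first consumer to claim this id;
        # the TTL keeps the dedup keyspace from growing forever.
        if r.set("dedup:" + msg_id, 1, nx=True, ex=3600):
            process(payload)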
13:29 <grantr> does kestrel support exchanges or pub/sub?
13:29 <jdmaturen> no
13:29 <grantr> thats one nice thing about rabbitmq, you can push to an exchange and it may be delivered to multiple endpoints
13:30 <seancribbs> you mean a fanout exchange
13:31 <grantr> right
13:31 <grantr> sorry pubsub not really the right term
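(A minimal sketch of that fanout-exchange pattern using the pika client against a local RabbitMQ: every queue bound to the exchange gets its own copy of each message, which is the deliver-to-multiple-endpoints behavior grantr wants. Exchange and message names are illustrative.)

    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    ch.exchange_declare(exchange="events", exchange_type="fanout")

    # Each consumer binds its own exclusive, auto-named queue to the exchange.
    q = ch.queue_declare(queue="", exclusive=True).method.queue
    ch.queue_bind(exchange="events", queue=q)

    # Publishers send to the exchange; the routing key is ignored for fanout.
    ch.basic_publish(exchange="events", routing_key="", body=b"user.signed_up")

    ch.basic_consume(queue=q, auto_ack=True,
                     on_message_callback=lambda c, m, p, body: print(body))
    ch.start_consuming()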
13:34 <grantr> really what i want is distributed seda
13:34 <grantr> services publish events, other services consume those events, but they are loosely coupled
13:35 <grantr> and multiple services may be interested in consuming a particular event
13:36 <* jdmaturen> waves hand at 0mq
13:37 <grantr> 0mq is very cool but seems pretty low level. i should do more reading about it
13:37 <jdmaturen> agreed
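(A small pyzmq sketch of the loosely coupled publish/consume pattern grantr is describing: the publisher doesn't know who, if anyone, is listening, and any number of services can subscribe to the event prefixes they care about. In practice the two halves live in separate services; endpoint and topic names are made up.)

    import time
    import zmq

    ctx = zmq.Context()

    # Consuming service: subscribe only to the event prefixes it cares about.
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://localhost:5556")
    sub.setsockopt_string(zmq.SUBSCRIBE, "user.")

    # Publishing service: emits events without knowing who is listening.
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://*:5556")
    time.sleep(0.2)   # give the SUB socket time to connect (PUB/SUB slow-joiner)
    pub.send_string('user.created {"id": 42}')

    print(sub.recv_string())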
13:41 <seancribbs> grantr: sounds like you are talking about Bonjour or SNMP
13:41 <seancribbs> IP Multicast
13:43 <grantr> seancribbs, ip multicast doesnt work everywhere (ec2), but could be worth looking into also
13:44 <grantr> i was hoping for something higher level though
13:44 <jdmaturen> http://akkasource.org/ ?
13:45 <jdmaturen> [again with the hand-waving]
13:46 <grantr> akka also cool! another one i need to look into more
13:49 <benblack> there is a lot of akka. i don't know if that makes it cool.
13:50 <benblack> what you are describing is exactly what rabbit and 0mq do, though they do it in very different ways.
13:55 <grantr> benblack, if rabbitmq were stable at scale and highly available, it would be perfect. it doesnt sound like it is though
13:56 <benblack> it requires care and feeding, for sure.