@PharkMillups
Created January 22, 2011 00:09
17:27 <echosystm> hi guys
17:27 <echosystm> i just read that "A Riak cluster is generally run on a set of
well-connected physical hosts"
17:27 <echosystm> is it unsuitable to be run on poorly connected physical hosts?
17:29 <echosystm> ie. hosts in different datacentres
17:30 <echosystm> how does it avoid split brain problems?
17:30 <aphyr> echosystm: They sell a replication system for relaying state
between clusters.
17:31 <aphyr> That's intended for use between datacenters.
17:31 <aphyr> Though if you use allow_mult cleverly, it's definitely possible
to solve the partitioning problem.
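
A minimal sketch of the allow_mult knob aphyr is referring to, assuming a Riak node's HTTP interface on localhost:8098, the /riak/<bucket> URL scheme of this era, and the Python requests library; the bucket name is made up for illustration:

import json
import requests

RIAK = "http://localhost:8098"   # assumed local Riak node (HTTP interface)
BUCKET = "sessions"              # hypothetical bucket name

# Turn on allow_mult so that concurrent writes (e.g. from both sides of a
# partition) are kept as siblings instead of being resolved last-write-wins.
requests.put(
    "%s/riak/%s" % (RIAK, BUCKET),
    data=json.dumps({"props": {"allow_mult": True}}),
    headers={"Content-Type": "application/json"},
)
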
17:31 <echosystm> well, basically all i want is a step up from log shipping
between two sql databases
17:32 <echosystm> and i dont mind using a document database, because they all
seem to be better at distribution
17:32 <aphyr> You might also look into couchdb
17:32 <echosystm> think two databases, synchronous writes and automatic failover
17:32 <echosystm> i imagine for the failover, there would need to be a third
node or a witness of some kind?
17:32 <aphyr> Synchronous writes *across* the datacenter boundary?
17:32 <echosystm> yep
17:32 <aphyr> Prepare for slowness!
17:33 <echosystm> performance isnt an issue
17:33 <aphyr> I would definitely recommend couch
17:34 <echosystm> would that be more suitable for this use case than riak?
17:34 <aphyr> I don't really understand your use case fully
17:34 <aphyr> In Riak there is no privileged node
17:34 <aphyr> Hence no failover
17:34 <aphyr> Nodes just join and leave the cluster and rearrange data to
compensate.
17:34 <echosystm> well, i want it to be active-active, so there is no 'master'
17:35 <aphyr> OK. For Riak you typically run 4+ nodes in a cluster
17:35 <echosystm> when i say failover, what i really mean is ensuring that if
connectivity is lost to a database, that database knows to shut itself down
17:35 <aphyr> You mean that when a client disconnects the DB should shut down?
17:35 <aphyr> That doesn't sound like failover.
17:35 <echosystm> no
17:36 <echosystm> imagine you have two databases
17:36 <echosystm> the link goes down between them
17:36 <echosystm> who keeps running and who shuts down?
17:36 <aphyr> Both keep running.
17:36 <echosystm> thats not acceptable
17:36 <aphyr> I suppose you could kill every one of them.
17:36 <aphyr> But really dude, how do you expect to choose a privileged master
without destroying service?
17:37 <echosystm> i dont
17:37 <aphyr> Let me suggest an example of how Riak handles partitioning and you
can see if it applies to you.
17:37 <aphyr> There are four nodes in a cluster.
17:37 <echosystm> there would need to be a third node to help reach some consensus on which node should turn off
17:37 <aphyr> A partition occurs and splits the cluster into two 2-node segments.
17:38 <echosystm> ie. the node should turn itself off if it cant reach the other
two or if it is specifically told to by another node
17:38 <aphyr> Those nodes continue to serve requests, both reads and writes,
as normal.
17:38 <aphyr> All data is accessible in both partitions assuming your durability
parameters are tuned correctly.
17:38 <aphyr> When the partition ends the nodes rejoin each other.
17:38 <aphyr> They then resolve conflicts in one of two ways
17:38 <aphyr> 1. By last-write wins
17:38 <aphyr> 2. By allow-mult.
17:39 <aphyr> In the case of allow-mult, all written versions are stored,
and returned to the client on read.
17:39 <aphyr> The client is then responsible for negotiating the merge.
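
A rough sketch of what that client-side merge can look like over the same HTTP interface, assuming allow_mult is on, the stored values are JSON lists merged by a naive set union, and illustrative bucket/key names; the r and w query parameters are the tunable durability quorums mentioned above:

import json
import requests

RIAK = "http://localhost:8098"
URL = "%s/riak/%s/%s" % (RIAK, "carts", "user42")   # hypothetical bucket/key

# Read with an explicit read quorum. If siblings exist, Riak answers
# 300 Multiple Choices with a text/plain list of vtags.
resp = requests.get(URL, params={"r": 2})

if resp.status_code == 300:
    # Concurrent writes happened (e.g. on both sides of a partition):
    # fetch each sibling by its vtag and merge them application-side.
    vtags = [line for line in resp.text.splitlines()
             if line and line != "Siblings:"]
    siblings = [requests.get(URL, params={"vtag": v}).json() for v in vtags]
    merged = sorted(set(item for s in siblings for item in s))  # naive union merge
else:
    merged = resp.json()

# Write the merged value back with the causal context (X-Riak-Vclock header)
# so Riak knows this write supersedes the siblings it was derived from.
requests.put(
    URL,
    params={"w": 2, "returnbody": "true"},
    data=json.dumps(merged),
    headers={
        "Content-Type": "application/json",
        "X-Riak-Vclock": resp.headers.get("X-Riak-Vclock", ""),
    },
)
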
17:39 <echosystm> yeah, that is too complicated
17:39 <aphyr> Riak is designed for high availability.
17:39 <echosystm> if you get a split brain problem like that, i just want one shut down
17:39 <aphyr> You can, if you like, devise a system to do that.
17:40 <aphyr> But consider, first, that on short time scales this is
*always* occurring in a cluster.
17:40 <jdmaturen> there is no way to stop nodes from crashing and networks
from partitioning
17:41 <aphyr> It sounds like you might be more interested in a synchronous
database with directed replication to a hot standby.
17:41 <echosystm> probably
17:41 <aphyr> In which case couchdb or any of the big RDBMS's might be good candidates.
17:42 <aphyr> Riak is aimed more at high availability; it sounds like you
actually want your system to fail.
17:42 <aphyr> You might also look into using something like Heartbeat to
handle your failover.
17:42 <echosystm> well, i dont want the state of the application to get all
messed up because it has been partitioned
17:43 <aphyr> You have two realistic choices: more complicated algorithms to
handle concurrent modification/partitioning, or having the cluster fail.
17:43 <echosystm> unless i'm missing something, clients on partition A are
going to be doing all kinds of things based on what they see there, while clients
on partition B are doing that also
17:44 <echosystm> when you merge the partitions back together, its all going to get messed up
17:44 <aphyr> That is really an application problem.
17:44 <aphyr> I'm building a nontrivial system in Riak right now; concurrent
writes and partitions are a part of my test suite.
17:44 <aphyr> It's definitely possible to handle.
17:44 <jdmaturen> http://blog.basho.com/2010/01/29/why-vector-clocks-are-easy/ may be of use
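
The idea in that post, reduced to a sketch: a vector clock is a per-actor counter map, and two versions are siblings exactly when neither clock descends from the other. Illustrative Python, not Riak's internal representation:

# Minimal vector clock sketch: {actor_id: counter}.
def increment(clock, actor):
    """Return a new clock with `actor`'s counter bumped (done on each write)."""
    clock = dict(clock)
    clock[actor] = clock.get(actor, 0) + 1
    return clock

def descends(a, b):
    """True if clock `a` has seen everything clock `b` has."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def conflict(a, b):
    """Neither descends from the other: concurrent writes, i.e. siblings."""
    return not descends(a, b) and not descends(b, a)

def merge(a, b):
    """Pairwise max; used once the application has merged the values themselves."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in set(a) | set(b)}

# Two clients update the same object on opposite sides of a partition:
base = {"alice": 1}
left = increment(base, "bob")     # {"alice": 1, "bob": 1}
right = increment(base, "carol")  # {"alice": 1, "carol": 1}
assert conflict(left, right)      # neither side subsumes the other -> siblings
assert descends(merge(left, right), left) and descends(merge(left, right), right)
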
17:46 <aphyr> There are some situations for which vector clock merges as in
Riak are unwieldy; unique ID generation being one of them. I think most
people have found it worthwhile to use a hybrid approach where Riak handles
their mergeable persistent data, and some small locking service handles synchronization.
17:46 <echosystm> i think this is all far too overkill for my purposes
17:47 <aphyr> Probably. Tell you what: go look at the couchdb replication docs. If
that's not what you like, and mysql hot standbys aren't either, take a look at vector clocks.
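
For the couchdb route, setting replication up is one call per direction to _replicate; a sketch assuming two CouchDB nodes with made-up hostnames and a hypothetical database name, with continuous push replication both ways to get the active-active shape discussed above:

import json
import requests

LOCAL = "http://dc1.example.com:5984"    # hypothetical hosts, one per datacentre
REMOTE = "http://dc2.example.com:5984"

def replicate(source_base, target_base, db="appdata"):
    """Start continuous replication of `db` from one CouchDB node to another."""
    return requests.post(
        "%s/_replicate" % source_base,
        data=json.dumps({
            "source": db,
            "target": "%s/%s" % (target_base, db),
            "continuous": True,
        }),
        headers={"Content-Type": "application/json"},
    )

# Push changes both ways; conflicting revisions are kept by CouchDB and
# surfaced to the application (via _conflicts), much like Riak siblings.
replicate(LOCAL, REMOTE)
replicate(REMOTE, LOCAL)
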
17:47 <echosystm> ok
17:47 <echosystm> will do
17:47 <echosystm> thanks for your help