@loe
Created April 18, 2011 01:01
INSERT INTO t (id, counter) VALUES (1, 1);
-> Statement replicated.
UPDATE t SET counter = 2 WHERE id = 1;
-> Statement replicated.
UPDATE t SET counter = 3 WHERE id = 1;
==> CRASH before replication, but still fsync'd to disk.
==> Failover to slave, application tries the transaction again.
UPDATE t SET counter = 3 WHERE id = 1;
UPDATE t SET counter = 4 WHERE id = 1;
==> Master recovers.
Now you are screwed. If you make the old master a slave of the new one, you have no way of controlling the order in which the conflicting statements are applied: the old master already holds its own unreplicated write of 3 to the counter column for record 1, and the new master's log contains both the retried statement and the one after it. It is even worse if the statement isn't SET counter = 4 but something like SET counter = counter + 1, which pretty much guarantees corrupted data. Because the records are not versioned, you just don't know which value is the 'right' answer without manual intervention. To resolve this automatically at the database and application level you need something like timestamps (Cassandra) or vector clocks (Riak).
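The counter = counter + 1 case can be sketched with a toy model in Python. This is only an illustration under assumptions, not how any real database replicates: each server is a plain dict, statements are functions replayed in order, and every name here (apply, increment, lww_merge) and the timestamps are made up for the example.

```python
# Toy model: each "server" is a dict mapping row id -> counter value.
# All names and timestamps are hypothetical, for illustration only.

def apply(db, statements):
    """Replay statements in order against a copy of db and return the result."""
    db = dict(db)
    for stmt in statements:
        stmt(db)
    return db

def increment(db):
    # Stands in for: UPDATE t SET counter = counter + 1 WHERE id = 1;
    db[1] += 1

# State both servers agree on after the last replicated statement.
replicated = {1: 2}

# Old master applied one increment, fsync'd it, then crashed before replicating.
old_master = apply(replicated, [increment])

# The application retried on the promoted slave, then issued one more increment.
new_master_log = [increment, increment]
new_master = apply(replicated, new_master_log)

# Old master rejoins as a slave and replays the new master's log on top of
# its own unreplicated write, so the retried increment is applied twice.
rejoined = apply(old_master, new_master_log)

print(old_master[1], new_master[1], rejoined[1])  # prints: 3 4 5

def lww_merge(a, b):
    """Last-write-wins: keep whichever (timestamp, value) pair is newer."""
    return a if a[0] >= b[0] else b

# Versioned writes as (timestamp, value) pairs.
old_master_write = (100, 3)  # unreplicated write before the crash
new_master_write = (105, 4)  # latest write after failover
resolved = lww_merge(old_master_write, new_master_write)
print(resolved)  # prints: (105, 4)
```

The rejoined server ends up at 5 while the new master says 4, and nothing in the data tells you which is right. The last-write-wins merge only picks a winner between absolute values; it does not by itself repair relative updates like counter + 1.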