Created
January 10, 2012 06:46
Scalable Application Specific Databases with Berkeley DB Java Edition
Title: Scalable Application Specific Databases with Berkeley DB Java Edition

Short Description (400 chars): In 2011, Yammer replaced a creaking 10B-row messaging
database with BDB JE. This improved our availability, simplified scaling and lowered
delivery latency. I will discuss evaluating whether BDB JE is right for your situation,
problems you may run into along the way and useful patterns for leveraging BDB
JE as an embedded solution to building reliable application-specific databases at scale.

Full Description (a few paragraphs):
Most people think of Berkeley DB as a simple, on-disk, B-tree-based key/value store.
Few realize there's a Java version. Fewer still realize there are reliable replication
and leader election functions built in. These capabilities make it possible to implement
a simple, reliable and scalable embedded application-specific database. Oracle
itself recently released Oracle NoSQL Database, which is built on top of
Berkeley DB Java Edition (BDB JE).

In 2011, after ruling out other technologies and hardware upgrades, Yammer replaced a
creaking 10-billion-row PostgreSQL messaging database with a homegrown distributed
database cluster built atop BDB JE. In the process we improved system availability,
simplified scaling and lowered message delivery latency. Our operations team sleeps better
at night knowing we have N+1 redundancy.

We chose an embedded solution after evaluating several client/server stores. In the end,
our high write fanout on message delivery (imagine an SMTP server attempting to deliver 1
email message with 50,000 addresses in the Cc header) made delivery increasingly expensive
to perform over the network. An embedded solution enables this fanout to occur in-process,
with writes dispatched to a local filesystem.
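
As a rough illustration of the in-process fanout described above, here is a minimal Java
sketch. It is not Yammer's implementation and it does not use the BDB JE API: a
java.util.TreeMap stands in for the local ordered B-tree store, and the InboxFanout class,
its methods and the "recipientId/messageId" key layout are all assumptions made up for
this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/**
 * Hypothetical sketch of write fanout into a local ordered key/value store.
 * The TreeMap is only a stand-in for an embedded B-tree store; it is NOT BDB JE.
 */
final class InboxFanout {
    // Keys sort as "recipientId/messageId", so each recipient's inbox
    // occupies one contiguous key range (an assumed layout).
    private final TreeMap<String, String> store = new TreeMap<>();

    /** Writes one inbox entry per recipient, entirely in-process. Returns the write count. */
    int deliver(long messageId, String body, List<Long> recipientIds) {
        int written = 0;
        for (long recipientId : recipientIds) {
            store.put(key(recipientId, messageId), body); // local write, no network hop
            written++;
        }
        return written;
    }

    /** Reads one recipient's inbox with a range scan over that recipient's key prefix. */
    List<String> inbox(long recipientId) {
        String from = key(recipientId, 0L);
        String to = key(recipientId, Long.MAX_VALUE);
        return new ArrayList<>(store.subMap(from, true, to, true).values());
    }

    /** Zero-pads both ids so lexicographic order matches numeric order (non-negative ids only). */
    static String key(long recipientId, long messageId) {
        return String.format("%019d/%019d", recipientId, messageId);
    }
}
```

With a client/server store, each of the per-recipient writes in deliver() would instead be
a request over the network (or part of a large batch), which is the cost the paragraph
above refers to.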

I will discuss evaluating whether BDB JE is right for your situation, problems you may run
into along the way and useful patterns for leveraging BDB JE as an embedded solution to
building reliable application-specific databases at scale. While familiarity with B-trees,
replication and leader election will be helpful, none of what I discuss will dive into
territory that requires extensive knowledge of databases or distributed computing.
I look forward to (hopefully) giving it.
Very cool talk. I look forward to hearing about it.