ryankennedy/velocity_2012_bdbje_scaling.txt

## velocity_2012_bdbje_scaling.txt
Title: Scalable Application Specific Databases with Berkeley DB Java Edition

Short Description (400 chars): In 2011 Yammer replaced a creaking 10B row messaging
database with BDB JE. This improved our availability, simplified scaling and lowered
delivery latency. I will discuss evaluating whether BDB JE is right for your situation,
problems you may run into along the way and useful patterns for leveraging BDB
JE as an embedded solution for building reliable application specific databases at scale.

Full Description (a few paragraphs):

Most people think of Berkeley DB as a simple, on-disk, B-tree-based key/value store.
Few realize there's a Java version. Fewer still realize there are reliable replication
and leader election functions built in. These capabilities make it possible to implement
a simple, reliable and scalable embedded application specific database. Oracle
themselves recently released their Oracle NoSQL Database, which is built on top of
Berkeley DB Java Edition (BDB JE).

In 2011, after ruling out other technologies and hardware upgrades, Yammer replaced a
creaking 10-billion-row PostgreSQL messaging database with a homegrown distributed
database cluster built atop BDB JE. In the process we improved system availability,
simplified scaling and lowered message delivery latency. Our operations team sleeps better
at night knowing we have N+1 redundancy.

We chose an embedded solution after evaluating several client/server stores. In the end,
our high write fanout on message delivery (imagine an SMTP server attempting to deliver one
email message with 50,000 addresses in the Cc header) made those writes increasingly expensive to
perform over the network. An embedded solution enables this fanout to occur in-process with
writes dispatched to a local filesystem.

I will discuss evaluating whether BDB JE is right for your situation, problems you may run
into along the way and useful patterns for leveraging BDB JE as an embedded solution for
building reliable application specific databases at scale. While familiarity with B-trees,
replication and leader election will be helpful, none of what I discuss will dive into
territory that requires extensive knowledge of databases or distributed computing.