Skip to content

Instantly share code, notes, and snippets.

@zmagg
Created January 17, 2015 06:22
Show Gist options
  • Save zmagg/e7307ea2adf6cd510e95 to your computer and use it in GitHub Desktop.
Save zmagg/e7307ea2adf6cd510e95 to your computer and use it in GitHub Desktop.
scaleconf proposal

Etsy started with a monolithic Postgres database and moved to a horizontally sharded MySQL store with a lookup table and master-master replication. Scaling your data store is never easy as just adding more hardware though, even with sharding.

In this talk, we'll go over the pains and perils of sharding databases and how we solved some of the problems in Etsy's second take on sharding, what we call logical sharding. Logical sharding is where we put several databases on one mysqld, as opposed to one database per server. Upon hitting a resource limit on a box (CPU, Memory, Connections, etc) that would normally require individual row migration, we can instead do file level migrations of a database, replicate, then drop previous databases, which is super fast and a cleaner way to delete old unnecessary shard data. We'll also go over how logical sharding benefits other daily operational challenges like schema changes and backups, as well as simplies life for developers and makes life easier for downstream data warehousing.

Etsy is a fast growing site with over 40 million members and 26 million items listed. Learn about Etsy's painpoints in growing a 120TB+ sharded MySQL datastore and how the architecture and tooling evolved over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment