Etsy started with a monolithic Postgres database and moved to a horizontally sharded MySQL store with a lookup table and master-master replication. Scaling your data store is never easy as just adding more hardware though, even with sharding.
In this talk, we'll go over the pains and perils of sharding databases and how we solved some of the problems in Etsy's second take on sharding, what we call logical sharding. Logical sharding is where we put several databases on one mysqld, as opposed to one database per server. Upon hitting a resource limit on a box (CPU, Memory, Connections, etc) that would normally require individual row migration, we can instead do file level migrations of a database, replicate, then drop previous databases, which is super fast and a cleaner way to delete old unnecessary shard data. We'll also go over how logical sharding benefits other daily operational challenges like schema changes and backups, as well as simplies life for developers and makes life easier for downstream data warehousing.
Etsy is a fast growing site with over 40 million members and 26 million items listed. Learn about Etsy's painpoints in growing a 120TB+ sharded MySQL datastore and how the architecture and tooling evolved over time.