zmagg/ahi.md

## ahi.md

      
    Raw
  

              ahi.md
            
          
    omg julia,
Today I learned something about the buffer pool in MySQL that made it seem less like performance black magic! So, before today I knew that the buffer pool was an in-memory cache for MySQL that let you do super fast lookups (especially PK lookups, but maybe only PK lookups (? uncertain on that one)), like, our PK lookups with the BP turned on take microseconds, and they did so even before we moved to SSDs for all our databases [1]. I didn't know ANYTHING about how the buffer pool worked though, other than like, "caching: it makes things faster".
So, today we were doing a routine schema change truncating some old tables. We did a TRUNCATE operations live with all the databases still in configuration serving live traffic [2], because a TRUNCATE essentially does a DROP/CREATE at the table level, instead of row by row (since MySQL 5.1, we're on 5.5), so it's usually reasonably high performance. TURNS OUT, it also does some really not-so-performant cache invalidation on the buffer pool! This caused the site to be intermittently unavailable for 20 minutes!
This is also when I learned about the adapative hash index, which made the buffer pool seem less like black magic. The adaptive hash index in MySQL is the index for reads from the buffer pool! Cool! But! If you're going to truncate a table, you're not going to want to run around with stale entries in your buffer pool cache from the old data in the table--so, today, TRUNCATE operations do a acquire a global write mutex of the adaptive hash index and remove the pointers to the stale data in the buffer pool. Anything else that's trying to write to the adaptive hash index queues up, leading to intermittent site inavailability! Also, this was a performance bug that was fixed for DROP TABLE but not yet TRUNCATE (!!!)
I've yet to read the bug fix, gotta get back to my day job now, but this was fun to write up.
[1] in Consumer land, it's been all like, SSDS ARE WONDERFUL USE SSDS EVERYWHERE, and even in commercial land that's been the case for a long time, so, I was super surprised that until late 2015 at Etsy, all our databases were still on SPINNING RUST. Turns out, the cost of having a 2TB (the size we provisioned all our database shards at for Reasons, that I can explain but this is already an aside, so I'll skip the aside otthe aside) SSD was prohibitive until relatively recently. If you wanted density + a good RAID (and we wanted that for our primary datastore), spinning rust it still was.
[2] We do most schema changes not-live, by pulling half the replicants from serving traffic, (thereby, running database traffic only on one replica), running the schema change on the not-live dbs, putting those back into production, and doing the opposite replicant side.