@hashbrowncipher
Last active August 15, 2022 16:21
Why contracts within engineering organizations don't work.

One time at work, my team was upgrading an open source search-engine-cum-database that had an unfortunate predilection for breaking its external API. We had already deployed the new version of the database with its breaking changes, and now it was time to herd our customers off the old version and onto the new one. Our customers were naturally reluctant: for most of them it was just a bunch of work for very little reward. The migration would require careful testing, and just generally it didn't sound like a fun time. To top the situation off, some of these customers' services hadn't been touched in years, and the original authors had long since left.

I'm proud to say that my team was significantly more interested in accommodating our customers' needs than some other DBA teams I've worked with or around. During the migration we spent a fair bit of time chewing on ways to lessen the burden we placed on our customers. At one point the possibility of simply "handing off" the outdated search engines was discussed. Our team would no longer provide any of our typical DBA services: no security patches, no scaling, no performance analysis, no customer education, no production debugging, no (new) tooling, no oncall, no deployment integration, etc. We would simply package up whatever we had at that moment, write some documentation about how to use it, and throw it over the wall for our customers to operate indefinitely. Our customers would avoid a costly and inopportune migration effort, and all they had to do was promise to never talk to us about these databases again.

We did not execute this plan, thank goodness. I'm sure many readers will understand instinctively why it would have backfired, but let's spell it out nevertheless.

The problem is that, when you operate within the confines of a single engineering organization, there is no way to make a promise that actually binds its parties. This stands in stark contrast to the "real world", where breaking a contract can be expected to produce real legal or reputational consequences. The problem persists even if all parties are operating entirely in good faith. It gets worse when folks are operating in bad faith, but we can leave those ramifications as an exercise for the reader.

The first problem is that the people signing any contract will bind their respective teams only at a single moment in time. A manager of a team who says "yes, please hand over the database clusters; we promise to manage them" can resign, be fired, promoted, or reorged off the team on the very same day. The headcount allocation they expected could disappear, or maybe the IC who promised them "yeah, maintaining databases is easy" loses interest.

Even if the original line manager remains in place, a new director can easily slot in above them on the org chart and start asking annoying questions, like "Why am I paying two ICs to manage a database? Shouldn't that be the DBA team's job?" At that point it will become a director-to-director chat, and protestations of "But you guys promised!" will carry about as much weight as a handful of wet ramen. Alternatively, your DBA team could bring in an impressionable new director who is deathly afraid of angering their peers, and the situation will unfold in exactly the same way.

And even if none of those things happens, do we honestly expect the customer team not to page the DBA team if and when shit hits the fan? Is the customer team really going to tell their leadership: "Yes, we know that our service is causing an embarrassing outage that puts egg on the company's face, but we pinky promised not to bother the subject matter experts in this area"? And, on the other hand, are we really expecting the DBA team members, who care deeply about the success of the company, to sit on their hands while the customer team fumbles a data migration? Even a modest issue could easily result in the DBA team spending 10x more time cleaning up the smoldering ruins of a broken database than they saved by handing it off in the first place. (Data cleanup is neither fun nor easy.)

This whole situation produces what economists call a "moral hazard" (defined as "lack of incentive to guard against risk where one is protected from its consequences"): the customer teams know in the back of their minds that, no matter how solemnly they promise, their promises will never be fully enforced against them. As a result, they don't bother making the appropriate investments in their databases.

I often wonder how the results would be different if binding promises could be made. Would any sane manager actually opt to take ownership of their team's own databases, with all of the responsibility that entails? Would a director ever tell their line manager: "well, I'd love to promote you, but I can't because I won't be able to find a replacement willing to abide by this promise you made about your database"? Would a planned reorg ever catch a snag over a topic as inconsequential as who maintains which database? Definitely not: the concept is laughable.

To draw a legal analogy, a contract with a manager of a peer team is about as useful as a contract with an LLC that has no assets. You might become the proud owner of a signed piece of paper, but if push comes to shove there is nothing stopping your counterparty from just declaring bankruptcy and leaving you high and dry.

And so, at the end of the day, we piss off our users by forcing them to do database migrations. We insist the migrations are necessary, even when the database likely could have hummed along, unnoticed, for the remainder of its natural life. We insist even when the service already has a sunset date set. Significant effort is expended by both teams, and that effort doesn't always inure to the benefit of the company. And so I find myself wondering: is there a better way?
