
@rjz
Created October 15, 2019 01:58
New Relic FutureTalks 2019-10-14

Performance Antipatterns - Ben Evans (@kittylyst), New Relic

First, what's a pattern?

a model or design used as a guide in needlework and other crafts

-- OED

A model or design: patterns are more abstract than the substrate you're applying them to. Software's pretty abstract already, but design patterns are more abstract still.

And a software pattern?

a general, reusable solution to a commonly occurring problem within a given context.

-- Wikipedia

Or Ben's definition that, "it's a small-scale best practice."

Antipatterns must be the opposite

a common response to a recurring problem that is usually ineffective and risks being highly counterproductive

-- Wikipedia

a pattern that tells how to go from a problem to a bad solution

-- C2Wiki

Why do they exist?

  • Software is complex - far more so than even the most complicated mechanical systems - which leaves lots of places for complex behavior to hide

  • Lots of tradeoffs, so easy to optimize for the wrong tradeoff or misunderstand alternatives

  • Software changes over time. Developers are out to manage the risks associated with change, yet we forget that software changes from day to day and month to month--in fact, change is the constant. It's hard enough to map out the state of the system at a single point in time; what about many points in time?

  • Teams change over time, too. Software starts in someone's mind. Then they have to explain it to others (a lossy process), passing it through the physical world (via test cases, documentation, etc.) and up into the minds of other humans. "Good luck with that."

And Performance Antipatterns?

They're another beast again.

  • Software seems quantitative. It feels like you should be able to measure it and do "real science" with it.

  • Complexity breeds subjectivity. "The system is too slow, make it faster!" is an inherently subjective ask from the user. Yet the users' view is the only one that matters.

  • Both are true.

Compounding factor? Bad technology choices.

"Why do developers make bad software decisions?". Come down to one of five reasons: boredom, resume padding, peer pressure, lack of understanding, or misunderstanding.

Meet our (Anti-)Patterns

  1. UAT (User Acceptance Testing) is my desktop. "We'd love to do UAT, but it's too expensive. We don't have the hardware." So what do you do? You cobble an environment together from whatever's lying around. Oh yeah?

    • Track the cost of outages and incidents (in lost opportunity, productivity, etc.). They're almost always more expensive than the resources needed for UAT.
    • The argument that an unrepresentative UAT is better than nothing? Not really true. A JVM running on a desktop with a different number of cores will behave differently (see the environment-fingerprint sketch after this list).
  2. Distracted by shiny. A pure development antipattern: the new stuff that everyone wants to play with gets exercised first.

  3. Distracted by simple. The easy stuff gets tested first, but it also tends to be the stuff that's already well understood, when you should probably be looking at the less-familiar parts.

  4. Production-like data is hard. Dev and production are apples and oranges. Prod is bigger, gnarlier, and totally not represented by dev. Don't underestimate the shape-of-the-data problem.

    • E.g., a betting engine in the UK (~100K bets on a Saturday) seized up when the company expanded to Turkey, where many bets take the form of a 'Goliath': many (~800) separate bets originating from a single request. That's a ~30X hit on the database, because the model wasn't designed for that load.

    • May not be possible to make test data like prod data (think PII). You could try scrambling it, but there's still value (risk) in the shape of obfuscated data. This remains an open problem.

  5. Fiddle with switches. JVM-specific. The team starts changing flags (worth knowing: there are more flags for the JVM GC than flags at the UN, and they get strange fast). Developers get obsessed with the level of control and start trying to change things.

    • You can...
    • ...measure in production
    • ...measure in UAT
    • ...change one switch in UAT
    • ...measure it
    • ...have someone else double-check your reasoning
    • ...change it in prod
    • ...measure again in prod
    • ...and if it doesn't match the results from UAT, roll it back.

    One of the hard things in performance testing is figuring out what it's worth in comparison to all the other things a developer needs to do.

  6. Tuning by folklore. Perf tests are boring. They're not about brave knights slaying performance dragons. They're about measurement and statistics. Null hypotheses, T-Tests, all that.

    • "I found these great tips on SO" don't necessarily lead to best practices.
    • Performance tips are workarounds. Tuning addresses problems that already exist, which makes tips a solution in search of a problem.
    • If someone finds the problem and fixes it, a "tip" winds up somewhere between useless and harmful.
    • Tips tend to exist without context. Performance happens in a specific context. E.g. admin manuals contain general advice meant to keep a company from getting sued. Take it as you will.
    • Finally, once a tip's on the Internet, it's there forever. Ask the Python crew how much fun it is tracking down answers for Python 2 vs. Python 3.

    Performance tuning is not:

    • tips and tricks
    • secret sauce
    • ...or particularly interesting
  7. The Blame Donkey (or "The Scapegoat"). What gets the blame? It's JMS, Hibernate, etc.--whatever the seniors or management hate. But usually no one's done the investigation; they've just heard of the problem and jumped on the bandwagon.

    • E.g. quants in financial services, who can program Just Enough to insist on a specific piece of technology--which never ends well.
  8. Micro-analysis. The belief that you can focus on a tiny piece of the system and understand its impact on the overall system. This is even worse with a managed runtime (JVM, V8, Mono, etc.).

    An analogy: take a single molecule of water and try to explain from it how a bucket of water behaves: surface tension, specific heat, and all. The reality is that only end-user perception matters, and end-user perception is far, far removed from any tiny piece of the system.
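
One way to ground antipatterns 1 and 5 - a sketch of my own, not something from the talk (the JvmFingerprint class name is arbitrary) - is to print the details that make one JVM environment behave differently from another. Compare the output from a desktop "UAT" box against production, or diff it before and after a single-switch change.

```java
// Hypothetical sketch: fingerprint the JVM environment this code is running in.
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmFingerprint {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // Core count drives JVM ergonomics: GC thread counts, default pool sizes, etc.
        System.out.println("Available processors: " + rt.availableProcessors());

        // Heap limits picked by ergonomics depend on the machine's RAM.
        System.out.println("Max heap (MB): " + rt.maxMemory() / (1024 * 1024));

        // The exact flags this JVM was started with -- the "switches" being fiddled.
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM flags: " + flags);

        System.out.println("Runtime: " + System.getProperty("java.vm.name")
                + " " + System.getProperty("java.version"));
    }
}
```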

What to do?

Treat applications as experiments. You can measure them ("measure, don't guess"). You can analyze the data you collect. You can assess systematic error (accuracy) and random error (precision). We're good at seeing patterns where they don't exist, and the only way to overcome cognitive bias is through data.
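
As one concrete way to treat an application as an experiment (again a sketch of my own; the CompareRuns class name and the sample latencies are made up), compare a baseline run with a candidate run using summary statistics and a Welch's t statistic rather than a hunch:

```java
// Minimal sketch of "measure, don't guess": means, standard deviations, and a
// Welch's t statistic for two sets of latency samples. Sample values are illustrative.
public class CompareRuns {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Unbiased sample variance.
    static double variance(double[] xs, double m) {
        double sumSq = 0;
        for (double x : xs) sumSq += (x - m) * (x - m);
        return sumSq / (xs.length - 1);
    }

    public static void main(String[] args) {
        double[] baseline  = {102, 98, 105, 110, 99, 101, 97, 104}; // ms, made up
        double[] candidate = { 95, 97,  93,  99, 96,  94, 98,  92}; // ms, made up

        double m1 = mean(baseline),  v1 = variance(baseline, m1);
        double m2 = mean(candidate), v2 = variance(candidate, m2);

        // Welch's t: difference of means scaled by the combined standard error.
        double t = (m1 - m2) / Math.sqrt(v1 / baseline.length + v2 / candidate.length);

        System.out.printf("baseline:  mean=%.1f ms, sd=%.1f%n", m1, Math.sqrt(v1));
        System.out.printf("candidate: mean=%.1f ms, sd=%.1f%n", m2, Math.sqrt(v2));
        System.out.printf("Welch's t = %.2f (check it against a t-distribution, not a hunch)%n", t);
    }
}
```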

Consider:

  • Confirmation bias - we see what we're looking for
  • Reductionist bias - understanding the pieces doesn't mean you'll understand the reassembled system
  • Action bias - "doing something is better than doing nothing, and since this is something, we should do it" -- scary in an outage
  • Clustering illusion - in any random sample you'll see clumps; they don't necessarily indicate a signal
  • Texas Sharpshooter Fallacy - shoot at random, then draw your targets around the clusters the next morning. Don't form your hypothesis after the data has been collected!
  • Disregarding regression to the mean - lots of things get better by themselves.

Why measure? Because humans are bad at guessing and riddled with cognitive biases. We're easily overwhelmed by data and can't easily spot patterns by eye.
