@MXWest
Last active October 18, 2019 18:27
SRE Antipatterns

By no means a complete list, but rather the ones I think we should focus on in the short term.

Antipattern 2: Humans Staring at Screens

If you have to wait for a human to detect an error, you've already lost.

Any practice for which the detection of a problem condition
relies on a human noticing that a particular series of data
is abnormal. Substitute thresholds, correlation engines, velocity
metrics, etc.
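
As a rough illustration of the substitution, here is a minimal sketch of a check that a machine can run instead of a person watching a graph. It combines a static threshold with a simple velocity (rate-of-change) test. The names `Alert` and `detect`, and the limits used in the example, are hypothetical and not taken from any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Alert:
    index: int    # position of the offending sample in the series
    kind: str     # "threshold" or "velocity"
    value: float  # the sample value, or the jump from the previous sample


def detect(samples: Sequence[float], max_value: float, max_delta: float) -> List[Alert]:
    """Flag samples above max_value, or that moved more than max_delta
    (in either direction) since the previous sample."""
    alerts: List[Alert] = []
    for i, value in enumerate(samples):
        if value > max_value:
            alerts.append(Alert(i, "threshold", value))
        if i > 0:
            delta = value - samples[i - 1]
            if abs(delta) > max_delta:
                alerts.append(Alert(i, "velocity", delta))
    return alerts


# Hypothetical latency series in milliseconds.
alerts = detect([10, 12, 11, 40, 95], max_value=90, max_delta=20)
for alert in alerts:
    print(alert)
```

In this example the jump to 40 trips the velocity check before the value ever crosses the static threshold, which is the kind of early signal a person staring at a dashboard is likely to miss.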

Antipattern 3: Mob Incident Response

All-hands-on-deck incident handling without thought to
coordination of efforts, reserves, OSHT* troubleshooting,
sleep cycles, human cognitive limits, or the deleterious effect
of interrupts on engineering work.

* 1) Observe the situation, 2) State the problem, 3) Hypothesize
the cause/solution, 4) Test the solution.

Antipattern 9: Speed-Bump Engineering

Prevention of all errors is impossible, costly, and annoying to anyone trying to get things done.

Any process that increases the length of time between the creation
of a change and its production release without either adding value
to or providing definitive feedback on the production impacts of
the change.