@janv
Last active November 26, 2019 08:45
Project management, bug tracking and triage

This is a summary of https://apenwarr.ca/log/20171213, a very long and interesting article. I'm trying to present here only the ideas that are immediately relevant to our situation.

  • Much of the article deals with scaling production operations and efficiency. It examines the rate of bugs, proposes theories about how to manage projects and bugs, and runs some statistical experiments to illustrate those ideas.

  • Efficiency comes from smart project management.

  • Arbitrary, output-oriented goals to be met at the end of some deadline don't work, because they don't create urgency until the last minute, then fail to be effective, leading to a vicious cycle. While managers need goals and targets and estimates to plan ahead internally, estimation and goals should not be communicated to dev teams as such.

  • Do not get into a situation where engineers negotiate schedules with management. Motivation, project estimation and feature prioritization are all psychological games. Understanding the rules helps us turn these exercises into something productive.

  • Student syndrome:

    No matter how far you extend a deadline, a student will always start work at the latest possible moment before it.

Agile: The good parts

  • He picks apart Agile, identifying some of its parts as useful psychological games that let you work smarter.

    • Physical index cards: Make features feel like tangible things, for people who need that. Otherwise useless.

    • Story points: These are useful

    • Pair programming: Helps some people, but generally irrelevant

    • Daily standup meetings: Overhead

    • Strict Prioritization:

      This is a huge one that we'll get to next - and so is flexible prioritization. Since everyone always knows what your priorities are [...], then people are more likely to work on tasks in the "right" order, which gets features done sooner than you can change your mind. Which means when you do change your mind, it'll be less expensive. That's one of the main effects of Agile. Basically, if you can manage to get everyone to prioritize effectively, you don't need Agile at all. It just turns out to be really hard.

    • Tedious Progress Tracking: Not needed. Done right, the progress reports write themselves

    • Burndown charts: A fundamental unit of progress measurement

    • Series of Sprints: Sprints are goals and goals don't work

Strict prioritization

  • Each change in a project sets back its progress

  • Lots of early changes are better than a single change that comes late

  • Even if your decisions aren't optimal, sticking to them, unless you're completely wrong, usually works better than changing them.

  • This section contains the most important quote:

    If you only take one thing away from reading this talk, it should be that. Make decisions and stick to them.

    Everything else is just a method that helps with doing that.

  • One of the great product management diseases is that we change our minds when the facts don't. There are so many opinions, and so much debate, and everyone feels that they can question everyone else (note: which is usually healthy), that we sometimes don't stick to decisions even when there is no new information. And this is absolutely paralyzing.

  • If you want to know what Tesla does right and most of us do wrong, it's this: they ship something small, as fast as they can. Then they listen. Then they make a decision. Then they stick to it. And repeat.

    They don't make decisions any better than we do. That's key. It's not the quality of the decisions that matters. Well, I mean, all else being equal, higher quality decisions are better. But even if your decisions aren't optimal, sticking to them, unless you're completely wrong, usually works better than changing them.
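
The claim that many early changes cost less than one late change can be illustrated with a toy model. This is a minimal sketch and not the article's simulation: it assumes the team completes one unit of work per day and that every change of direction throws away a fixed fraction of the work done so far. The function name `finish_day` and the `rework` parameter are hypothetical.

```python
def finish_day(total_work, change_days, rework=0.5):
    """Return the day a project finishes under a toy rework model.

    Assumptions (not from the article): one unit of work is completed per
    day, and a change on a given day invalidates `rework` (a fraction) of
    everything completed so far.
    """
    changes = set(change_days)
    done = 0.0
    day = 0
    while done < total_work:
        day += 1
        done += 1
        if day in changes:
            done -= rework * done  # part of the finished work is thrown away
    return day


print(finish_day(100, []))           # no changes: 100
print(finish_day(100, [5, 10, 15]))  # three early changes: 111
print(finish_day(100, [90]))         # one late change: 145
```

Under these assumptions, three early changes delay the project by 11 days, while a single late change delays it by 45, because a late change has much more finished work to invalidate.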

Kanban: The good parts

  • Kanban comes from the Japanese car industry of the 1950s. They used index cards, but differently from how we use them in software

  • Kanban also uses stories and story points

  • Kanban, like Agile, gives the huge benefit of strict prioritization. The difference:

    • Agile makes you do things in a certain order

    • Kanban makes you do fewer things at a time. The idea here is that inventory is expensive, so you build things just in time.

      unreleased software is inventory. It's very expensive, it slowly rusts in the warehouse, and worst of all, it means you produced work in the wrong order.

      [...]

      Your buildup of low-priority inventory is slowing down the people working on the high-priority things, and that's unacceptable.

  • He gives a few examples for this

  • We're engineers, so we're pretty smart. If we want, we could just, you know, dispense with the psychological games, and decide we're going to strictly prioritize and strictly limit multitasking. It takes some willpower, but it can be done. I happen to be terrible at it.

    [...]

    It's actually really, really hard. It's one of the hardest things in all of engineering. Most people are very bad at it, and need constant reminders to stop multitasking and to work on the most important things. That's why these psychological games, like sprints (artificial deadlines) and index cards and kanban boards were invented. But if we want to become the best engineers we can be, we have to move beyond tricking ourselves and instead understand the underlying factors that make our processes work or not work.

Restricted multitasking

  • Another simulation looks at the effects of multitasking by comparing work done against value delivered, to demonstrate how multitasking delays delivering value to the customer.

  • Two important aspects of this simulation:

    • Every feature produces work (tasks/bugs), whether launched or not, to reflect changing market conditions or other influences that affect even unlaunched features.
    • Launched features aren't done: they also keep generating bugs and tasks, but at the same time they deliver value.
  • All the features you have released, together, over time, generate a constant workload, a flat burn-down chart:

    At that point, you either need to fundamentally change something - can you make your code somehow require less maintenance? - or get a bigger team. That goes back to our headcount scalability slides from the beginning. Is your code paying for itself yet? That is, is more value being accrued than the cost of maintenance? If so, you can afford to invest more SWEs. If not, you have to cancel the project or figure out how to do it cheaper. Scaling up a money loser is the wrong choice.
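
The effect of multitasking on delivered value can be sketched with a much simpler model than the article's. The sketch below assumes a team that completes one unit of work per day, spreads it round-robin over at most `wip_limit` features, and earns one unit of value per day for every shipped feature. It ignores the ongoing maintenance work described above. The names `delivery_days`, `value_days` and `wip_limit` are hypothetical.

```python
def delivery_days(sizes, wip_limit):
    """Return the day on which each feature ships.

    Toy model: the team finishes one unit of work per day and rotates
    between the first `wip_limit` unfinished features (in priority order).
    """
    remaining = list(sizes)
    last_worked = [0] * len(sizes)
    shipped = [None] * len(sizes)
    day = 0
    while any(r > 0 for r in remaining):
        day += 1
        # Features currently in progress, limited by the WIP limit.
        active = [i for i, r in enumerate(remaining) if r > 0][:wip_limit]
        # Round-robin: pick the active feature that was touched least recently.
        i = min(active, key=lambda j: last_worked[j])
        remaining[i] -= 1
        last_worked[i] = day
        if remaining[i] == 0:
            shipped[i] = day
    return shipped


def value_days(shipped, horizon):
    """Total value earned by `horizon`, at one unit per shipped feature per day."""
    return sum(max(0, horizon - day) for day in shipped)


one_at_a_time = delivery_days([10, 10, 10], wip_limit=1)
all_at_once = delivery_days([10, 10, 10], wip_limit=3)
print(one_at_a_time, value_days(one_at_a_time, horizon=30))  # [10, 20, 30] 30
print(all_at_once, value_days(all_at_once, horizon=30))      # [28, 29, 30] 3
```

Both teams finish all the work on day 30, but the team that works on one feature at a time has delivered ten times as much value by then, because its first features start paying off much earlier.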

Work rates are non-negotiable

  • You cannot negotiate how long something takes or how fast people will work

  • You can have a conversation about which features go into a release and which won't

  • You still want to have that conversation as early as possible

  • Knowing the work rate and the cost of features, managers can negotiate with sales, marketing or biz-dev about feature requirements and deadlines.

  • And then, crucially, they will not tell those dates to the engineering team. The engineering team doesn't need to know. That would be setting a goal, and goals are bad. The engineers just do the work in priority order, and don't multitask too much, and let statistics handle things.

Putting things into practice

Stories

  • What is a story?
    • A small bit of useful functionality delivered to a customer
    • The customer must actually be impacted (they might not notice, like when reducing downtime)
    • It doesn't have to be written like a story, but the point is: You can't tell a story without the main character (the customer)
    • (A bug is not a story)
  • Personas are fine for UX design; for engineers, something more succinct is helpful: "User will be able to search for emails by keyword, and the results will be returned in no more than 2000ms, and results will be ranked by relevance"
  • The main point about stories is that they involve the customer
  • Dev might have to do 10 things to deliver value to a customer, none of which directly affect the customer. Then you have 10 bugs or tasks making up one story/feature

Story points

  • These make estimation and burndown charts possible

  • People are good at relative estimates, not at absolute estimates

  • Absolute estimates are goals we're setting ourselves

  • Don't let engineers know the due date

  • Points bypass that:

    Nobody sets a "goal" that the project will take 5 points to complete. What does that even mean? It's a five point story, it will always take 5 points to complete, no matter how long that turns out to be. It is 5 points, by definition.

  • Get to point estimates by doing planning poker

    • Great disparity: discuss/revote
    • Alternative: always pick the higher number. Consistent biases don't matter; over time they just factor into the velocity. Beware of inconsistencies
  • Interesting: nobody has a vested interest in a particular story having a particular number of points.

    Instead of fighting to be right about the exact size, people can instead focus on why two people have such a widely varying (at least two fibonacci slots) difference of opinion. When that happens, usually it's because there's missing information about the scope of the story, and that kind of missing information is what really screws up your estimates if you don't resolve it. The ensuing discussion often uncovers some very important misunderstanding (or unstated assumptions) in the story itself, which you can fix before voting again.
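
The two ways of settling a planning poker round described above can be sketched as follows. This is only an illustration: the slot list, the function names and the `min_gap` parameter are assumptions, with `min_gap=2` standing for the "at least two fibonacci slots" of disagreement mentioned in the quote.

```python
# Assumed set of allowed votes; real teams may use a different scale.
FIBONACCI_SLOTS = [1, 2, 3, 5, 8, 13, 21]


def needs_revote(votes, min_gap=2):
    """True if the highest and lowest vote are at least `min_gap` slots apart.

    Raises ValueError if a vote is not one of the allowed slots.
    """
    slots = [FIBONACCI_SLOTS.index(v) for v in votes]
    return max(slots) - min(slots) >= min_gap


def settle_by_highest(votes):
    """The alternative rule: always pick the higher number."""
    return max(votes)


print(needs_revote([3, 5, 5]))       # False: neighbouring slots, no discussion needed
print(needs_revote([3, 8, 5]))       # True: 3 and 8 are two slots apart
print(settle_by_highest([3, 5, 5]))  # 5
```

A consistent upward bias from always picking the higher number is harmless here, since it shows up as a slightly higher measured velocity.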

Tracking the sequence of stories

  • Just use a spreadsheet. Stories should be big, so there shouldn't be so many of them that you need a special tool

  • You want estimation to be so quick and easy that you end up estimating a lot of tasks that never get scheduled - because when the PM realizes how much work they are, they realize there's something more effective to be working on.

  • The spreadsheet is just a bunch of rows that list the stories and their estimates, but most importantly, the sequence you're planning to do them in. I suggest working on no more than one at a time, if at all possible. Since each one, when implemented, is made up of a bunch of individual tasks (bugs), it is probably possible to share the work across several engineers. That's how you limit multitasking, like kanban says to do. If your team is really big or your stories are small, you'll have to work on several stories at a time. But try not to.
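
Such a spreadsheet can be modelled in a few lines. The sketch below assumes the rows are kept in planned sequence and that a velocity in points per week has been measured from past work. The story titles, the `Story` class and `projected_weeks` are made up for illustration. In line with the advice above, the resulting projection is for managers and is not meant to be handed to engineers as a deadline.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Story:
    title: str
    points: int  # relative estimate from planning poker


def projected_weeks(backlog, velocity):
    """Weeks from now until each story is done, working strictly in sequence.

    `velocity` is the measured number of points completed per week.
    """
    running_totals = accumulate(story.points for story in backlog)
    return [(story.title, total / velocity)
            for story, total in zip(backlog, running_totals)]


backlog = [
    Story("Search emails by keyword", 5),
    Story("Export to CSV", 3),
    Story("Offline mode", 8),
]
for title, weeks in projected_weeks(backlog, velocity=4):
    print(f"{title}: done after about {weeks} weeks")
```

Reordering the rows is the only scheduling decision in this model, which matches the idea that the PM sequences stories while the work rate stays fixed.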

Stories are not bugs

| Stories | Bugs |
| --- | --- |
| Slow | Fast |
| Infrequent | Numerous |
| Controversial (PM, execs) | Boring |
| Can be tracked on index cards | Need automated tracking |

  • Break planning into two layers of abstraction
  • In this model, stories are a bit bigger than usual
  • Bugs/Tasks are small and Stories are made up of several of them
  • Division of labour
    • Stories
      • PMs come up with stories
      • Engineers estimate stories
      • PMs sequence stories
      • Engineers work on them in order
    • Bugs
      • PMs don't care
      • Engineers can work on them in whatever order they want, with whatever level of multitasking they think is appropriate
  • Do not estimate bugs
  • This makes bug fixing a first-class activity. Most other agile methods treat bugs as overhead
  • Every bug is on average the same size
    • The article goes into a lot of detail on this one, including some real-world examples, not just the simulation
    • Bug creation and resolution rates are always essentially constant. But you need to make sure that the resolution rate is at least as high as the creation rate, otherwise you'll always play catch-up. There's no way out of this in the long term, bug bankruptcies are not a solution.
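
Whether the resolution rate keeps up with the creation rate can be checked from weekly counts. This is a minimal sketch with invented numbers; `backlog_trend` is a hypothetical helper, not something from the article.

```python
def backlog_trend(opened_per_week, closed_per_week):
    """Average weekly change in the number of open bugs.

    A positive result means bugs are created faster than they are resolved,
    so the backlog keeps growing.
    """
    weeks = len(opened_per_week)
    return (sum(opened_per_week) - sum(closed_per_week)) / weeks


opened = [12, 9, 11, 10]  # invented example data
closed = [8, 10, 9, 9]
print(backlog_trend(opened, closed))  # 1.5 more open bugs per week
```

If the trend stays positive over a longer period, the options are the ones named above: make the code need less maintenance, or grow the team.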

Triage

A lot of the advice in here doesn't really apply to us, as we don't seem to be drowning in bugs. But I still find it to contain useful ideas.

  • Dealing with the inevitability that you can't fix bugs as fast as they get found
  • In that situation, deal only with the bugs that really matter. How do you decide which ones matter?
  • Priorities
    • Highest prio is for pager incidents
    • Next is for urgent problems
    • Then there's a prio for "bugs we should probably fix"
    • Finally, one or two levels for "bugs we should fix but obviously never will"
  • Customer psychology:
    • "Won't fix"/"Obsolete" & closing a bug makes people angry
    • Leaving them open on low priority is fine

Fixing vs. Triaging

| Fixing bugs | Triaging bugs |
| --- | --- |
| Slow | 100x faster |
| Requires expertise | Easy to parallelize |
| You can't fix them all | You can triage them all |
| Fix it right the first time | Expect occasional re-triage |
| You need a system to handle the inevitable backlog | No big deal if you do a little every day |

  • Trying to assign every bug to a person makes the assignment field lose meaning. Don't initially do this
  • Have a system of sorting bugs into components. These are only relevant for triage, not for actually working off the lists
  • Don't create too many components, only one per Triage Team
  • Use labels instead, to track
    • Triage status
    • "Needs Discussion" status
    • Release/Sprint/Milestone sequence
    • Feature backlogs: One per major feature area
  • Don't feel pressured to assign every bug to someone
  • Real advice
    • Never look at the project-wide tracker. Learn how to query intelligently
    • Components are almost useless. They are only good for helping end users point bugs at the right triage team (This is not really applicable to us).
    • The triage team queries for bugs that haven't been triaged and assigns them to a milestone, story or some other hotlist, where they can be picked up by devs
  • Re-triage
    • Outlines a process for periodically re-triaging old bugs instead of closing them all
    • Again, this is only relevant to projects with thousands of bugs
  • Needs discussion
    • For when the triage team doesn't have enough information to triage and needs more info to reproduce the bug
    • Separate from "Needs Triage". This prevents bugs waiting for more information from popping up when looking at the "Needs Triage" list
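
The label-based queries described in this section can be sketched independently of any particular bug tracker. The label names `triaged` and `needs-discussion`, the `Bug` class and both functions are assumptions made for this example. A real tracker would express the same filters in its own query language.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    id: int
    title: str
    labels: set = field(default_factory=set)


def needs_triage(bugs):
    """Bugs that are neither triaged nor parked while waiting for more information."""
    return [b for b in bugs
            if "triaged" not in b.labels and "needs-discussion" not in b.labels]


def awaiting_info(bugs):
    """Bugs the triage team could not sort yet; kept out of the triage queue."""
    return [b for b in bugs if "needs-discussion" in b.labels]


bugs = [
    Bug(1, "Crash on login"),
    Bug(2, "Typo in footer", {"triaged", "milestone-1"}),
    Bug(3, "Cannot reproduce export failure", {"needs-discussion"}),
]
print([b.id for b in needs_triage(bugs)])   # [1]
print([b.id for b in awaiting_info(bugs)])  # [3]
```

Keeping the two lists separate is what lets the triage team reach an empty "Needs Triage" queue every day, even while some bugs are still waiting for details.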

TL;DR

  • Goals and deadlines for engineers lead to student syndrome: nothing is urgent until shortly before the deadline
  • Work rates are mostly constant
  • Estimates are good for managers to get an idea of when a project will be done, and to perform internal scheduling
  • Late changes are more expensive than early ones. It's better to stick with a sub-optimal decision than to introduce changes late.
  • Have clear priorities
  • Avoid multitasking