The citations game: Cone of Uncertainty

Rubric: Software Engineering : Factual Claims : Cone of Uncertainty : Empirical Validation

Context

The Wikipedia page on "Cone of Uncertainty" lists two "empirical validations" of Boehm's initial work. I didn't chase these down for Leprechauns. They are:

  • "Later work by Boehm and his colleagues at USC applied data from a set of software projects from the U.S. Air Force and other sources to validate the model;"
  • "The basic model was further validated based on work at NASA's Software Engineering Lab (NASA 1990 p. 3-2)."

There is no citation accompanying the first claim, so this would probably deserve one of those "citation needed" tags that Wikipedia is famous for. As to the other, see below.

Timeline

1984, NASA

The citation given in Wikipedia is "NASA (1990). Manager’s Handbook for Software Development, Revision 1. Document number SEL-84-101. Greenbelt, Maryland: Goddard Space Flight Center, NASA, 1990."

The handbook is actually from 1984, as the SEL-84 document number suggests; 1990 is the date of Revision 1. The full PDF is here.

The relevant pages seem to be pages 18-19 of the PDF. The text states "Parameter values are derived from experiences with the SEL software development environment summarized in Table 1-1 (an average of 85 percent FORTRAN and 15 percent assembler macros)."

The numbers don't bear much relation to the upper and lower limits claimed for the Cone in its early stages (respectively a factor of 4 and a factor of 1/4 of the original estimate). The uncertainty given in the table for the "requirements" stage is 0.75, and the formula below the table (multiply or divide the estimate by 1 + uncertainty) yields bounds of 1.75 and roughly 0.57 times the original estimate.
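To make the comparison concrete, here is a minimal Python sketch of that arithmetic. It uses only the 0.75 requirements-stage value and the multiply-or-divide formula quoted above; the function name and printed labels are mine, and the 4x / 0.25x limits are the canonical early-stage bounds usually shown for the Cone.

```python
# Compare the SEL handbook's bounds (multiply or divide the estimate by
# 1 + uncertainty, with uncertainty = 0.75 at the requirements stage)
# against the Cone of Uncertainty's canonical early-stage limits.

def sel_bounds(uncertainty):
    """Upper and lower multipliers on the original estimate, per the SEL formula."""
    return 1 + uncertainty, 1 / (1 + uncertainty)

upper, lower = sel_bounds(0.75)
print(f"SEL requirements stage: x{upper:.2f} to x{lower:.2f}")  # x1.75 to x0.57
print("Cone of Uncertainty:    x4.00 to x0.25")
```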

It's hard, then, to see this as "empirical validation" of the Cone as typically depicted.

The problem with such "studies" is that they assume uncertainty starts high and reduces steadily as we approach the end of a project. Empirical "validation", despite the term, isn't about confirmation, but about showing that a hypothesis has been subjected to tests that are capable of disproving it. You cannot disprove something that you start out assuming, if the data collection (or presentation) is biased by that assumption.

In particular, we should expect the data to show the patterns that we know from experience turn up in reality: projects that stay "90% done" for 90% of their duration, where people keep thinking "we're about a month out" for years on end. If we don't see some instances of this pattern, or are looking at something that cannot even represent such instances, we should be suspicious.

The most relevant way I know to see this pattern is the slip chart, which plots the forecast completion date against the date each forecast was made; collecting many of these in a given organization would be a much better empirical basis for making claims about whether, and how, uncertainty reduces over time.
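For illustration, here is a minimal sketch of one way to draw a slip chart with matplotlib. The dates below are entirely made up, chosen only to show the shape of a project whose forecast completion stays "about six weeks out" at every reporting date; nothing here comes from the SEL data.

```python
# Slip chart sketch: for each reporting date, plot the completion date
# that was forecast at that time. Hypothetical data, for illustration only.
from datetime import date, timedelta
import matplotlib.pyplot as plt

report_dates = [date(2019, 1, 1) + timedelta(days=30 * i) for i in range(12)]
# The forecast keeps sliding: completion is always "about 45 days away".
forecasts = [d + timedelta(days=45) for d in report_dates]

fig, ax = plt.subplots()
ax.plot(report_dates, forecasts, marker="o", label="forecast completion date")
ax.plot(report_dates, report_dates, linestyle="--", label="reporting date (diagonal)")
ax.set_xlabel("date the estimate was made")
ax.set_ylabel("forecast completion date")
ax.legend()
fig.autofmt_xdate()
plt.show()
```

A forecast line that hugs the diagonal instead of flattening out toward a fixed completion date is exactly the "perpetually a month away" pattern; a collection of such charts would show whether uncertainty really narrows the way the Cone assumes.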
