gigamonkey/steele-on-proofs.md

## steele-on-proofs.md

      
    Raw
  

              steele-on-proofs.md
            
          
    Guy Steele on formal proofs from Coders at Work

Seibel: Do you ever try to formally prove your code correct?
Steele: Well, it depends on the code. If I’m writing code with some kind of tricky mathematical invariant I will go for a proof. I wouldn’t dream of writing a sorting routine without constructing some kind of invariant and proving it.
Seibel: Peter van der Linden in his book Expert C Programming has a sort of dismissive chapter about proofs in which he shows a proof of something, but then, ha ha, this proof has a bug in it.
Steele: Yes, indeed. Proofs can also have bugs.
Seibel: Are they at least less likely to have a bug than the code which they are proving?
Steele: I think so, because you come at it in a different way and you use different tools. You use proofs for the same reason you use data types, or for the same reason that mountain climbers use ropes. If all is well, you don’t need them. But they increase the chance of catching it if something does go wrong.
Seibel: I suppose the really bad case would be if you had a bug in your program and then a compensating bug in your proof. Hopefully that would be rare.
Steele: That can happen. I’m not even sure it’s necessarily rare, because you tend to construct the proof to match the structure of the code. Or conversely, if you’ve got a proof in mind as you’re writing the code, that tends to guide your construction of the program. So you really can’t say the code and the proof are totally independent, in the probabilistic sense. But you can bring different tools and different modes of thought to bear.
In particular, the details of the programming tend to take a local point of view and the invariants tend to focus on the global point of view. It’s in doing the proof that you cause those two things to interact. You look at how the local steps of the program affect the global invariant you’re trying to maintain.
One of the most interesting exercises in my career was the time I was asked to review a paper for CACM by David Gries on proving a garbage-collector algorithm correct, a parallel garbage collector. Susan Owicki was a student of Gries’s and she developed some tools for proving parallel programs correct, and he decided to apply these techniques to a version of a parallel garbage collector that had been developed by Dijkstra. The whole piece of code fit on, I think, a half a page. And then the entire rest of the paper was the proof of correctness.
I cranked through the proof and tried to verify every step for myself. And what makes it tricky is, in effect, every statement in the program has the potential to violate any invariant because it’s a parallel program. So the Owicki technique involves cross-checking these at all points. And it took me about 25 hours to go through, and in the process I found a couple of steps I couldn’t push through. So I reported this and it turned out they did represent bugs in the algorithm.
Seibel: So they were bugs in the algorithm which the proof missed since the result of the proof was, Q.E.D. this thing works.
Steele: Yes, the proof presented was a faulty proof. Because something had been overlooked somewhere. It was some detail of formula manipulation—the formula was almost right but not quite. And I think it was a matter of correcting the order—the sequence of two statements or something.
Seibel: So it took you 25 hours to analyze the proof. Could you have found the bug in the code in just 25 hours if you just had the code?
Steele: I doubt I would have even realized there was a bug. The algorithm was sufficiently intricate that I would probably have stared at the code and said, “Yeah, this makes sense to me” and not have spotted this very obscure interaction. It was a multistep sequence that was necessary—a highly unlikely interaction.
Seibel: And so that kind of interaction basically gets abstracted by the process of making the proof so you don’t have to come up with this scenario of what if this happens and then this and then this to realize that there’s a problem.
Steele: Exactly. In effect the proof takes the global point of view and covers all the possibilities, summarizing it in a very complicated formula. And you have to do formula crunching to push it through. So the author resubmitted the paper and it came back for rereviewing and even though I had done the entire exercise, it took me another 25 hours to reverify the proof. This time it all seemed to be sound.
I reported that and the paper was published and nobody’s found a bug in it since. Is it actually bug-free? I don’t know. But I think having gone through the exercise of the proof gives me a lot more confidence that the algorithm is now sound. And I’m hoping that I wasn’t the only reviewer who actually did the complete cranking through of the proof.
Seibel: There’s a Dijkstra quote about how you can’t prove by testing that a program is bug-free, you can only prove that you failed to find any bugs with your tests. But it sort of sounds the same way with a proof—you can’t prove a program is bug-free with a proof—you can only prove that, as far as you understand your own proof, it hasn’t turned up any bugs.
Steele: That’s true. Which is why there is a subspecialty in the discipline having to do with mechanical proof verification. And the hope is that you then reduce the problem to proving that the proof verifier is correct. Which is—if you can write a small enough verifier—actually a much more tractable problem that verifying the proof of any rather large program.
Seibel: And then the manually proved mechanical verifier would do the 25 hours of work you did of grinding out a verification of a specific proof of some other piece of code?
Steele: Yes. Exactly.