@brl
Created July 29, 2020 13:27
Comments on build reproducibility

Tavis Ormandy argues that build reproducibility delivers no security benefit to the user, because verifying that a binary can be reproducibly built requires first building the binary yourself. At that point you already have a binary you can trust, and since verification assumes you have the source code, any user can produce a trustworthy binary simply by compiling that source code, whether or not the build is reproducible.

Ok, but what if we create a system where we nominate trusted entities to verify the build for us? Tavis points out that if you trust some third party more than the vendor, then you should download your software from them as well. Once again, reproducibility is irrelevant, since the trusted third party can simply compile the source code and send you the binary.

Yes, of course you can just build the software yourself, but nobody wants to do that. The core of the argument presumes that the property the user is most interested in is validating the association between the binary and the source code. Was this the exact source code used to produce this binary version of the software?

This property is more of a means to an end: build reproducibility is a tool a vendor can use to create accountability for the software they deliver. If the build infrastructure is compromised, or if the vendor inserts some kind of malicious backdoor, the tampered binary will be detected if even a single paranoid user attempts to verify the build themselves. As an additional benefit, the malicious code can be isolated and analyzed by comparing the tampered binary against a trusted one, using the same tools that were created to diagnose reproducibility failures.
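
To make that concrete, here is a minimal sketch of what "verifying the build yourself" might look like: rebuild from source and compare the digest of the result against the binary the vendor shipped. The build command and artifact paths are hypothetical placeholders; a real reproducible build also pins the toolchain, build flags, and embedded timestamps.

```python
import hashlib
import subprocess
import sys

def sha256_file(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical build command and paths, for illustration only.
subprocess.run(["make", "clean", "all"], check=True)

local = sha256_file("build/app.bin")    # the binary we just built from source
vendor = sha256_file("vendor/app.bin")  # the binary the vendor shipped

if local == vendor:
    print("Reproduced: the vendor binary matches our local build.")
else:
    print("Mismatch: diff the two artifacts (e.g. with diffoscope) to isolate what changed.")
    sys.exit(1)
```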

Cryptographic signatures are another useful accountability tool with a long history in software distribution. Software is often distributed from servers the vendor does not control, and users should not be expected to trust even the vendor's own website, so it's helpful for a user to be able to verify that the binary package they receive is the same one originally produced and published by the vendor, rather than a backdoored copy placed on the download server by an attacker.
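
As a rough illustration of that check, the sketch below verifies a detached Ed25519 signature over a downloaded package using the Python cryptography library. The file names are assumptions, and the vendor's public key is assumed to be obtained out of band (for example, pinned in the installer) rather than fetched from the same download server.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_package(package_path: str, signature_path: str, pubkey_bytes: bytes) -> bool:
    """Check a detached Ed25519 signature over the raw package bytes.

    pubkey_bytes is the vendor's 32-byte public key, obtained out of band
    (e.g. shipped with the installer), not from the download server itself.
    """
    with open(package_path, "rb") as f:
        package = f.read()
    with open(signature_path, "rb") as f:
        signature = f.read()

    public_key = Ed25519PublicKey.from_public_bytes(pubkey_bytes)
    try:
        public_key.verify(signature, package)
        return True
    except InvalidSignature:
        return False
```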

Signatures are not sufficient to guarantee that a user will receive software they can trust, since a vendor may either accidentally or deliberately distribute a malicious package with a valid signature. This could happen because the build server has been compromised, because signing keys have been stolen, because the vendor has been compelled to assist law enforcement in an investigation, or because the vendor is in fact malicious.

Viewing binary reproducibility as an incentive for the vendor to behave and not place secret backdoors in software still leaves the problem of slipping a backdoored binary to a small target group of users, as well as the related problem that some platforms and methods of software distribution make it difficult or inconvenient for a user to examine a software package they intend to install or update. If the risk of detection by an end user is low enough, a vendor may not be adequately discouraged from distributing malicious software by reproducibility alone.

What is also needed is a way to be sure that every user receives the exact same binary version of each software release. Even in the absence of reproducibility, this is a useful assurance to have.

This can be accomplished by a system that requires the vendor to make a public commitment to each release (by publishing a checksum or signature) and gives the end user some way to verify that an appropriate commitment exists before installing the software. Ideally, this check would be built into the system that installs software on their computer and would happen automatically.
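
A minimal sketch of such an installer-side check might look like the following. The commitment endpoint, URL, and JSON layout are purely hypothetical; the point is only that the installer computes the package digest itself and refuses to proceed unless a matching public commitment exists.

```python
import hashlib
import json
import urllib.request

# Hypothetical endpoint where the vendor publishes one digest per release.
COMMITMENT_URL = "https://example.com/releases/commitments.json"

def package_digest(path: str) -> str:
    """SHA-256 of the package as downloaded."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def commitment_exists(digest: str) -> bool:
    """Fetch the published commitments and check our digest is among them."""
    with urllib.request.urlopen(COMMITMENT_URL) as resp:
        commitments = json.load(resp)  # e.g. {"1.2.3": "<hex digest>", ...}
    return digest in commitments.values()

def install_if_committed(path: str) -> None:
    if not commitment_exists(package_digest(path)):
        raise RuntimeError("No public commitment found for this package; refusing to install.")
    # ... proceed with the normal installation steps ...
```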

Building on the concept of Certificate Transparency, a system of Binary Transparency has been proposed. Both certificate and binary transparency schemes use a cryptographic construction (a Merkle tree) to create a public audit log of created artifacts that can only be appended to by the publisher, never retroactively altered. This log contains the hash of each software package release (or of each certificate created by a certificate authority).
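
For illustration, here is a simplified Merkle inclusion check of the kind such a log makes possible: given a package hash (the leaf), an audit path of sibling hashes, and the published log root, a client can confirm the package is recorded in the log. This is a sketch, not the exact RFC 6962 encoding used by Certificate Transparency logs.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf_hash: bytes, proof: list, root: bytes) -> bool:
    """Recompute the log root from a leaf and its audit path.

    `proof` is a list of ("left" | "right", sibling_hash) pairs giving,
    for each level of the tree, which side the sibling sits on. If the
    recomputed root matches the published root, the leaf is in the log.
    """
    node = leaf_hash
    for side, sibling in proof:
        if side == "left":
            node = sha256(sibling + node)
        else:
            node = sha256(node + sibling)
    return node == root
```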

I do agree with Tavis that merely making software builds reproducible does not imply much, if anything, about the trustworthiness of the software packages a user ultimately receives and installs. Software supply chain security is a very difficult problem to solve, and I hope it's obvious that with proprietary, closed-source software it cannot be solved at all.

And once a user has been properly assured that a particular software package has been built from a particular collection of source files, the very next question they should ask is: Why should I trust this source code?
