pbailis/reproducibility.md

## reproducibility.md

      
            
          
    edit: see http://cs.brown.edu/~sk/Memos/Examining-Reproducibility/
Not deserving of a full post, but nonetheless worth writing about: @ongardie, @aalevy, and a few others on Twitter were surprised by the number of papers that were flagged as "not reproducible" according to the recent study at http://reproducibility.cs.arizona.edu. Digging deeper, it appeared that 1.) "code builds" is the standard for reproducibility in this study and that 2.) many broken builds were the result of missing dependencies on the researchers' systems.
I tried reproducing a few of the authors' "unreproducible" results. It's hard to vet 600+ research code repositories, but, with a little effort (< ~10 minutes each?), I was able to get all of the following to actually build (on Ubuntu 13.10). This doesn't inspire confidence in the reproducibility of the study results.
Peter
pbailis@cs.berkeley.edu


"Quantifying the Mismatch between Emerging Scale-Out Applications and Modern Processors" in TOCS


Reported build behavior in report: could not find JDK tools.jar to build YCSB


Probable cause of build error: possible JDK misconfiguration


Reproducible? no, was able to build
sudo apt-get install maven openjdk-7-jdk; git clone https://github.com/brianfrankcooper/YCSB.git; cd YCSB; mvn package


"CryptDB: protecting confidentiality with encrypted query processing" in SOSP 2011

Reported build behavior in report: couldn't build MySQL because aclocal was missing
Probable cause of build error: needed aclocal, should have run provided build script
Reproducible? no, was able to build
git clone -b public git://g.csail.mit.edu/cryptdb; cd cryptdb; sudo scripts/install.rb .


"Towards a unified architecture for in-RDBMS analytics" in SIGMOD 2012

Reported build behavior in report: missing postgres.h header
Probable cause of build error: hadn't installed postgresql headers (optional: didn't use provided VM image either)
Reproducible? no, was able to build


wget ftp://ftp.postgresql.org/pub/source/v9.2.7/postgresql-9.2.7.tar.gz
tar -xzf postgresql*
cd postgresql-9.2.7/; ./configure; make; sudo make install
# hack for paths
sudo chown -R ubuntu /usr/local/pgsql
cd ..; wget http://hazy.cs.wisc.edu/hazy/victor/downloads/bismarck.tar.gz; tar -xvf bismarck.tar.gz; cd bismarck;
# change path per http://hazy.cs.wisc.edu/hazy/victor/doc/install_bismarck.php
sed -i 's,/home/bismarckvm/Desktop/dependencies/postgresql,/usr/local/pgsql,' bismarck.path;
source ./bismarck.path; make


"Fast crash recovery in RAMCloud" from SOSP 2011

Reported build behavior in report: missing message.h from ProtoBuf library
Probable cause of build error: needed protobuf library; the required libraries are listed on the RAMCloud website
Reproducible? no, was able to build


git clone git://fiz.stanford.edu/git/ramcloud.git
# dropped libboost version, added libzookeeper-mt-dev
sudo aptitude install build-essential git-core doxygen libboost-all-dev libpcre3-dev protobuf-compiler libprotobuf-dev libcrypto++-dev libevent-dev libzookeeper-mt-dev
# hack for zk path on ubuntu
sudo ln -s /usr/lib/x86_64-linux-gnu/libzookeeper_mt.a  /usr/local/lib/libzookeeper_mt.a
cd ramcloud; make
# if compiler warnings cause compilation to halt, remove any references to -Werror from the Makefile
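
The last step, dropping the warnings-as-errors flag, can be scripted rather than done by hand. This is a minimal sketch, not something taken from the RAMCloud instructions: `strip_werror` is a hypothetical helper name, and it assumes the flag appears literally as `-Werror` (or `-Werror=...`) in the file you pass it. Check which build file the project actually uses before running it.

```shell
# Hypothetical helper (not part of RAMCloud): remove -Werror flags from a
# build file in place, keeping the original as <file>.bak.
strip_werror() {
    makefile="${1:-Makefile}"   # assumption: defaults to ./Makefile
    # Deletes "-Werror" plus any "=warning-name" suffix; other flags are untouched.
    sed -i.bak 's/-Werror[^[:space:]]*//g' "$makefile"
}
```

Usage would be `strip_werror Makefile; make`. If the build still fails, the `.bak` copy can be restored to get back to the original flags.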


"Probabilistically Bounded Staleness for Practical Partial Quorums" from VLDB 2012

Reported build behavior in report: "the implementation works"
Probable cause of build error: no idea why this was marked as not building since it's a web demo (though we do have publicly linked patches and not one but two GitHub repos for our actual code)
Reproducible? no, unclear why this was marked as "Build Fails". Moreover, the authors didn't attempt to build the other code or email me (or the other authors). FWIW, our Java repo builds:


sudo apt-get install ant openjdk-7-jdk
git clone https://github.com/pbailis/cassandra-pbs.git
cd cassandra-pbs; ant


"Reusing debugging knowledge via trace-based bug search" from OOPSLA 2012


Reported build behavior in report: "Not sure about the syntax, hence I was getting errors everytime I tried running a query. Otherwise, the tool looks good. Benefit of Doubt can be given to the developer that it is working."


Reproducible? not really--by the description in the report, this should have been marked as "Builds". I'm not sure why it's marked as "Build Fails".