@warpfork
Last active December 21, 2015 17:59
comparison of techniques to obtain reproducible and offline builds (ascii table)
```
                                                                    transparent   multiple   permanent bloat
approach            core       offline?       fast  reproducible    when library  histories  to source repo
------------------- ---------- -------------- ----- --------------- ------------- ---------- ---------------
script (clone       semantic   no/fail        no    yes (`checkout  yes           no         no
at build time)                                      -f $HASH`)
script (clone       semantic   yes            yes   yes             yes           no         no
once, checkout
at build time)
go get              wget       yes (caching,  yes   no/fail         yes           no         no
                               if not smart)
git submodules      semantic   yes            yes   yes             yes           no         no
other vcs-agnostic  semantic   yes            yes   yes             yes           no         no
hash tracking
godep/vendoring     blob-copy  yes            yes   yes             no/fail       yes        yes/fail
(main source repo)
docker+deps         blob-copy  yes            yes   yes             yes           yes        no
(separate vendored
repo)
```
@warpfork
Author

The "good" answer is the affirmative for some columns and the negative for others, which is quite confusing at a first read. So wherever a column has a clear good answer and a clear fail answer, I've marked the fail as "[yes or no]/fail". (You may debate the importance of any column as you wish, but I think it is fair to note the unit vector.)

The "offline" column is, I hope, self-explanatory.

I included "fast" as a column because it's come up in discussion before, but really, as far as I can tell, "fast" is a synonym of "offline".

The "reproducible" column means the approach gives every person who pulls the docker repo a reliable way to get exact copies of the exact versions of deps that the docker devs intend. (It doesn't mean "reliable when github is down", because that's already covered by the "offline" column.)
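
As an illustration of the "script (clone once, checkout at build time)" row, here is a minimal sketch, assuming a POSIX shell and git 1.8.5 or later (for `git -C`). The function name `fetch_dep` and its three arguments are hypothetical and are not taken from any actual docker build script:

```shell
# Hypothetical helper: fetch_dep URL DIR HASH
# Clones URL into DIR only if DIR is not already a git checkout, then
# forces the working tree to exactly HASH (the `checkout -f $HASH` step
# from the table). Once DIR exists, no network access is attempted.
fetch_dep() {
    url=$1
    dir=$2
    hash=$3
    if [ ! -d "$dir/.git" ]; then
        # First run only: this is the one step that needs the network.
        git clone --quiet "$url" "$dir" || return 1
    fi
    # Every run: pin the working tree to the recorded commit.
    git -C "$dir" checkout --quiet -f "$hash"
}
```

Because the hash is recorded in the build script, everyone gets the same tree (reproducible), and because the clone is skipped when it already exists, later builds work offline. The sketch deliberately does not fetch, so if the recorded hash moves to a commit the local clone has never seen, the checkout fails until someone runs `git fetch` in that directory.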

These columns are the ones of immediate relevance to docker.

The following columns are more about the holistic health of the open source ecosystem around docker:

The "multiple histories" column refers to whether or not third party sources end up with multiple histories. If pty_linux.go has one history in github.com/kr/pty, and then it has another history with different dates, graph, and hashes in docker, then that's multiple histories. If I someday google for some source code in kr/pty/pty_linux.go and I end up on the github source pages for docker, then that's multiple histories.

The "transparent when library" column refers to whether or not a developer who brings the docker source into another project will have to be aware of how docker drags in its deps. If I have a project that submodules docker and submodules kr/pty, and I run `find | grep pty_linux.go | wc -l` and get anything other than '1', the approach gets a fail in this column.
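
The same check can be wrapped as a small function. This is only a sketch, assuming a POSIX shell; the name `count_copies` is hypothetical, and it matches on exact file name rather than the substring match that `grep` does:

```shell
# Hypothetical helper: count_copies ROOT NAME
# Prints how many regular files called NAME exist anywhere under ROOT.
count_copies() {
    # tr strips the leading padding that some wc implementations emit.
    find "$1" -type f -name "$2" | wc -l | tr -d ' '
}
```

Running `count_copies . pty_linux.go` from the top of the enclosing project gives the number to compare against 1.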

The "permanent bloat to source repo" column describes what kind of diffs end up in your source repo if dependencies are substantially changed -- if the diff is large, and permanently increases the size of cloning the source repo, this column contains a yes; if the diff is small because it's a semantic reference to other sources, this column contains a no.

@ngrilly

ngrilly commented May 14, 2015

I don't understand why git submodules are classified in "offline: yes".
