Comparing the single-file efficiency of version control systems

Not everything you want to keep in a VCS is a "project". Sometimes you have a single file that does not belong with any other files, but you still want it under version control: a TODO file, an Office document, or something similar. How well do various version control systems handle this case?

Test environment:

$ uname -smr 
Darwin 19.3.0 x86_64

$ sccs -V
sccs schily-SCCS version 5.09 2020/01/31 (x86_64-apple-macosx19.3.0)

$ git --version
git version 2.24.1 (Apple Git-126)

$ hg --version
Mercurial Distributed SCM (version 5.3)

$ darcs -V
2.14.2 (release)

We'll start with an empty directory:

$ find .
.

Create our comparison directories:

$ mkdir git
$ mkdir hg
$ mkdir darcs
$ mkdir sccs

Initialize all repositories (where applicable):

$ cd git   ; git init     ; cd ..
$ cd hg    ; hg init      ; cd ..
$ cd darcs ; darcs init   ; cd ..

Enable version ID expansion (keyword substitution) where supported:

$ cd git ; echo "* ident" > .git/info/attributes ; cd ..
$ cd hg  ; cat << EOF > .hg/hgrc                 ; cd ..
[extensions]
keyword=
[keyword]
** =
EOF

Create a file in each subdirectory that contains nothing but the version ID keyword (or nothing at all if the VCS does not support keywords):

$ cd git   ; printf '%s\n\n' '$Id$' > foo.txt ; cd ..
$ cd hg    ; printf '%s\n\n' '$Id$' > foo.txt ; cd ..
$ cd darcs ; touch foo.txt                    ; cd ..
$ cd sccs  ; printf '%s\n\n' '%W%' > foo.txt  ; cd ..

Version-control that file:

$ cd git
$ git add foo.txt
$ git commit -m "init"
$ cd ..

$ cd hg
$ hg add foo.txt
$ hg commit -m "init"
$ cd ..

$ cd darcs
$ darcs add foo.txt
$ darcs record -a -A foo@bar.com -m "Init"
$ cd ..

$ cd sccs
$ sccs create foo.txt
$ cd ..

Check if everything went smoothly:

$ cd git ; rm foo.txt ; git checkout foo.txt ; cat foo.txt ; cd ..
Updated 1 path from the index
$Id: 98cc5ee8d9f9e2cf0979e19fabeecfab3e65b682 $

$ cd hg ; hg cat foo.txt ; cd ..
$Id: foo.txt,v 97bb6f80127a 2020/04/03 12:08:01 git $

$ cd darcs ; cat foo.txt ; cd ..
# empty

$ cd sccs ; cat foo.txt ; cd ..
@(#)foo.txt	1.1
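
The visual check above can also be scripted. The following is a minimal sketch, not part of the original measurements: `has_expanded_keyword` is a hypothetical helper name, and the patterns assume the expanded forms shown in the output above (`$Id: ... $` for Git and Mercurial, the `@(#)` what-string for SCCS).

```shell
# has_expanded_keyword FILE (hypothetical helper): succeed if FILE contains an
# expanded "$Id: ... $" keyword or an SCCS "@(#)" what-string.
# An unexpanded "$Id$" does not match, because the pattern requires ": ".
has_expanded_keyword() {
    grep -q -e '\$Id: [^$]* \$' -e '@(#)' "$1"
}

# Report on whichever working copies exist in the current directory.
for f in git/foo.txt hg/foo.txt sccs/foo.txt; do
    [ -f "$f" ] || continue
    if has_expanded_keyword "$f"; then
        echo "$f: keyword expanded"
    else
        echo "$f: keyword not expanded"
    fi
done
```

Darcs is left out of the loop because its file is empty by construction.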

Count the resulting files:

$ cd git ; find . -type f | wc -l ; cd ..     # 26
$ cd hg ; find . -type f | wc -l ; cd ..      # 26
$ cd darcs ; find . -type f | wc -l ; cd ..   # 14
$ cd sccs ; find . -type f | wc -l ; cd ..    # 3

Look at the sizes:

$ cd git ; ls -Rla | grep -E '\s+\d\s+' | awk '{ print $5 }' | awk '{sum+=$1} END {print sum}' ; cd ..    # 28858
$ cd hg ; ls -Rla | grep -E '\s+\d\s+' | awk '{ print $5 }' | awk '{sum+=$1} END {print sum}' ; cd ..     # 5909
$ cd darcs ; ls -Rla | grep -E '\s+\d\s+' | awk '{ print $5 }' | awk '{sum+=$1} END {print sum}' ; cd ..  # 7438
$ cd sccs ; ls -Rla | grep -E '\s+\d\s+' | awk '{ print $5 }' | awk '{sum+=$1} END {print sum}' ; cd ..   # 849
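
The `ls`-based pipeline above depends on `\s` and `\d`, which not every `grep -E` supports, and its pattern appears to match directory entries as well as files. As a hedged alternative, here is a sketch that counts only regular files and sums their contents. `footprint` is a name invented for this example, and its totals may differ from the figures above because directory entries are not included.

```shell
# footprint DIR (hypothetical helper): print "<file count> <total bytes>"
# for all regular files below DIR. File names containing newlines would
# throw off the count; that case is ignored here.
footprint() {
    # tr strips the padding that some wc implementations add.
    files=$(find "$1" -type f | wc -l | tr -d '[:space:]')
    bytes=$(find "$1" -type f -exec cat {} + | wc -c | tr -d '[:space:]')
    echo "$files $bytes"
}

# Run it over the comparison directories that exist.
for d in git hg darcs sccs; do
    [ -d "$d" ] || continue
    echo "$d: $(footprint "$d")"
done
```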

So putting a single file with a version header under version control in each of the VCSs (not counting the cd and mkdir commands) results in:

  • Git:
    • 5 commands
    • 26 files
    • 28.9 kilobytes
  • Mercurial:
    • 5 commands
    • 26 files
    • 5.9 kilobytes
  • Darcs:
    • 4 commands
    • 14 files
    • 7.4 kilobytes
  • SCCS:
    • 2 commands
    • 3 files
    • 0.85 kilobytes
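
To put the sizes in proportion, the byte totals measured above can be expressed as multiples of the SCCS total. This is only an arithmetic sketch over the numbers already reported. The function name `relative_sizes` is made up for the example.

```shell
# relative_sizes (hypothetical helper): print each measured byte total
# as a multiple of the SCCS total. The numbers are the sums reported above.
relative_sizes() {
    printf '%s\n' 'Git 28858' 'Mercurial 5909' 'Darcs 7438' 'SCCS 849' |
        awk '{ name[NR] = $1; size[NR] = $2 }
             $1 == "SCCS" { base = $2 }
             END { for (i = 1; i <= NR; i++)
                       printf "%-10s %5.1fx\n", name[i], size[i] / base }'
}

relative_sizes
```

With these figures, Git comes out at roughly 34 times the SCCS footprint, Darcs at roughly 9 times, and Mercurial at roughly 7 times.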

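In conclusion, Git looks like the least economical choice for managing a single file: it ties with Mercurial for the most files and uses by far the most disk space, about 34 times as much as SCCS.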
