Archiving git repos

Size benchmarks of git clone --mirror archival methods

Tested with QEMU at db596ae19040574e41d086e78469014191d7d7fc.

$ git clone --mirror https://gitlab.com/qemu-project/qemu.git
$ ditto qemu.git qemu-nocomp.git
$ gtar -c qemu.git > qemu.git.tar
$ pushd qemu-nocomp.git
$ # We're going to repack the git repo into a single pack file with no zlib compression
$ # so later compression can compress the uncompressed pack file better.
$ git config --add --int pack.compression 0
$ git config --add --int core.compression 0
$ # Does this result in a 100% perfect mirror? Not entirely sure...
$ git repack -k -F -f -a -d --window-memory=0 -b --pack-kept-objects
$ popd
$ gtar -c qemu-nocomp.git > qemu-nocomp.git.tar
$ # 7zz -version
$ # 7-Zip (z) 23.01 (arm64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
$ # 64-bit arm_v:8 locale=en_US.UTF-8 Threads:10 OPEN_MAX:256, ASM
$ 7zz a -bt -slp -ssw -bb3 -mmt1 -mx9 -ms=on -md=1024m -m0=lzma2 qemu-nocomp.git.tar.7znomt qemu-nocomp.git.tar
$ 7zz a -bt -slp -ssw -bb3 -mmt10 -mx9 -ms=on -md=1024m -m0=lzma2 qemu-nocomp.git.tar.7z qemu-nocomp.git.tar
$ 7zz a -bt -slp -ssw -bb3 -mmt1 -mx9 -ms=on -md=1024m -m0=lzma2 qemu.git.tar.7znomt qemu.git.tar
$ 7zz a -bt -slp -ssw -bb3 -mmt10 -mx9 -ms=on -md=1024m -m0=lzma2 qemu.git.tar.7z qemu.git.tar
$ # p7zip -version
$ # 7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
$ # p7zip Version 17.04 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,10 CPUs LE)
$ 7z a -bt -slp -ssw -bb3 -mmt1 -mx9 -ms=on -md=1024m -m0=lzma2 qemu-nocomp.git.tar.p7znomt qemu-nocomp.git.tar
$ 7z a -bt -slp -ssw -bb3 -mmt10 -mx9 -ms=on -md=1024m -m0=lzma2 qemu-nocomp.git.tar.p7z qemu-nocomp.git.tar
$ 7z a -bt -slp -ssw -bb3 -mmt1 -mx9 -ms=on -md=1024m -m0=lzma2 qemu.git.tar.p7znomt qemu.git.tar
$ 7z a -bt -slp -ssw -bb3 -mmt10 -mx9 -ms=on -md=1024m -m0=lzma2 qemu.git.tar.p7z qemu.git.tar
$ # xz --version
$ # xz (XZ Utils) 5.6.0
$ # liblzma 5.6.0
$ xz -z -k -9 -e -vv qemu-nocomp.git.tar
$ xz -z -k -9 -e -vv qemu.git.tar
$ # zstd --version
$ # *** Zstandard CLI (64-bit) v1.5.5, by Yann Collet ***
$ zstd -vvv --progress --no-row-match-finder -M$((1024*32)) -T0 qemu-nocomp.git.tar -o qemu-nocomp.git.tar.zstdstd
$ zstd -vvv --progress --ultra -22 --no-row-match-finder -M$((1024*32)) -T0 qemu-nocomp.git.tar -o qemu-nocomp.git.tar.zstdmt
$ zstd -vvv --progress --ultra -22 --no-row-match-finder -M$((1024*32)) qemu-nocomp.git.tar -o qemu-nocomp.git.tar.zstd
$ zstd -vvv --progress --no-row-match-finder -M$((1024*32)) -T0 qemu.git.tar -o qemu.git.tar.zstdstd
$ zstd -vvv --progress --ultra -22 --no-row-match-finder -M$((1024*32)) -T0 qemu.git.tar -o qemu.git.tar.zstdmt
$ zstd -vvv --progress --ultra -22 --no-row-match-finder -M$((1024*32)) qemu.git.tar -o qemu.git.tar.zstd
$ ls -lh *.tar*
663M qemu-nocomp.git.tar
350M qemu.git.tar
340M qemu.git.tar.zstdstd
332M qemu.git.tar.xz
330M qemu.git.tar.zstd
330M qemu.git.tar.zstdmt
329M qemu.git.tar.7z
329M qemu.git.tar.7znomt
329M qemu.git.tar.p7z
329M qemu.git.tar.p7znomt
268M qemu-nocomp.git.tar.zstdstd
224M qemu-nocomp.git.tar.xz
223M qemu-nocomp.git.tar.zstd
223M qemu-nocomp.git.tar.zstdmt
209M qemu-nocomp.git.tar.7z
209M qemu-nocomp.git.tar.7znomt
$ ls -lS *.tar*
695367680 qemu-nocomp.git.tar
367001600 qemu.git.tar
356934615 qemu.git.tar.zstdstd
348128456 qemu.git.tar.xz
346519601 qemu.git.tar.zstd
346519601 qemu.git.tar.zstdmt
344577602 qemu.git.tar.7z
344577602 qemu.git.tar.7znomt
344575339 qemu.git.tar.p7z
344575339 qemu.git.tar.p7znomt
281105905 qemu-nocomp.git.tar.zstdstd
234952688 qemu-nocomp.git.tar.xz
233654293 qemu-nocomp.git.tar.zstd
233654293 qemu-nocomp.git.tar.zstdmt
218819332 qemu-nocomp.git.tar.7z
218814936 qemu-nocomp.git.tar.7znomt
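
The byte counts above are easier to compare as a percentage of the plain tar of the mirror. A small sketch, assuming a POSIX shell with awk available; `pct_of` is a hypothetical helper name and not part of the original benchmark run:

```shell
# pct_of SIZE BASE: print SIZE as a percentage of BASE, to one decimal place.
# LC_ALL=C keeps the decimal separator a "." regardless of locale.
pct_of() {
    LC_ALL=C awk -v size="$1" -v base="$2" \
        'BEGIN { printf "%.1f\n", 100 * size / base }'
}

base=367001600               # qemu.git.tar (plain tar of the mirror)
pct_of 344577602 "$base"     # qemu.git.tar.7z
pct_of 233654293 "$base"     # qemu-nocomp.git.tar.zstd
pct_of 218814936 "$base"     # qemu-nocomp.git.tar.7znomt
```

By this measure the smallest archive, qemu-nocomp.git.tar.7znomt, is roughly 60% of qemu.git.tar, while 7z on the stock tar only gets down to about 94%.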

TODO

Write a script to handle submodules, automatic pack decompression, and pack recompression after decompression.
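
A rough sketch of what such a script might look like, reusing the repack settings from the benchmark above. Everything here is an assumption rather than a tested tool: the function names are hypothetical, `git`, `tar`, and `7zz` are assumed to be on the PATH, archives are assumed to sit in the current directory, and submodules are not handled at all. The ref comparison is only a partial answer to the "perfect mirror" question; it checks that both repos point at the same objects, nothing more.

```shell
# Derive "qemu.git" from "https://gitlab.com/qemu-project/qemu.git".
mirror_dir_name() {
    name=${1%/}                      # drop a trailing slash, if any
    name=${name##*/}                 # keep the last path component
    printf '%s.git\n' "${name%.git}"
}

# Repack a bare mirror into a single pack with zlib compression disabled.
repack_uncompressed() {
    git -C "$1" config pack.compression 0 &&
    git -C "$1" config core.compression 0 &&
    git -C "$1" repack -k -F -f -a -d --window-memory=0 -b --pack-kept-objects
}

# Undo the settings above and recompress objects at git's default level.
repack_compressed() {
    git -C "$1" config --unset-all pack.compression || true
    git -C "$1" config --unset-all core.compression || true
    git -C "$1" repack -a -d -F
}

# Succeed only if both repos have identical ref names and object ids.
same_refs() {
    a=$(git -C "$1" for-each-ref --format='%(objectname) %(refname)') &&
    b=$(git -C "$2" for-each-ref --format='%(objectname) %(refname)') &&
    [ "$a" = "$b" ]
}

# Clone, decompress packs, sanity-check, then tar and compress.
archive_mirror() {
    dir=$(mirror_dir_name "$1")
    git clone --mirror "$1" "$dir" &&
    repack_uncompressed "$dir" &&
    git -C "$dir" fsck --full &&
    tar -cf "$dir.tar" "$dir" &&
    7zz a -mmt1 -mx9 -ms=on -md=1024m -m0=lzma2 "$dir.tar.7z" "$dir.tar"
}

# Extract an archive made by archive_mirror and recompress its pack.
restore_mirror() {
    tarball=$(basename "${1%.7z}")   # e.g. qemu.git.tar
    dir=${tarball%.tar}              # e.g. qemu.git
    7zz x "$1" &&
    tar -xf "$tarball" &&
    repack_compressed "$dir"
}
```

Usage would be something like `archive_mirror https://gitlab.com/qemu-project/qemu.git` followed later by `restore_mirror qemu.git.tar.7z`, with `same_refs` run between an untouched mirror and the repacked one before trusting the archive.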
