git-2.45.2
git-2.45.2.tar.gz
# Tarfile benchmark
A simple benchmark comparing the performance of GNU tar with Python's `tarfile` module when
archiving the Git source tree without compression.
Example results for Python 3.12.4:
### gnutar
real 0m0.063s
user 0m0.027s
sys 0m0.035s
### tarfile
real 0m2.135s
user 0m0.780s
sys 0m0.448s
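To put the example results in proportion, the timings above can be converted to seconds and divided. This is only a small sketch: `parse_time` and `slowdown` are hypothetical helper names, and the numbers are copied from the example results, so the ratios describe that single run and nothing more.

```python
import re


def parse_time(text):
    """Convert a bash `time` value such as '0m2.135s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(slow, fast):
    """Return how many times larger `slow` is than `fast`."""
    return slow / fast


# Values copied from the example results above.
gnutar = {"real": "0m0.063s", "user": "0m0.027s", "sys": "0m0.035s"}
tarfile_run = {"real": "0m2.135s", "user": "0m0.780s", "sys": "0m0.448s"}

wall = slowdown(parse_time(tarfile_run["real"]), parse_time(gnutar["real"]))
cpu = slowdown(
    parse_time(tarfile_run["user"]) + parse_time(tarfile_run["sys"]),
    parse_time(gnutar["user"]) + parse_time(gnutar["sys"]),
)

print(f"wall-clock: {wall:.1f}x")  # wall-clock: 33.9x
print(f"cpu: {cpu:.1f}x")          # cpu: 19.8x
```

For this run, tarfile takes roughly 34 times as long in wall-clock time and roughly 20 times as much CPU time (user plus sys).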
#!/bin/bash
set -eu -o pipefail
if ! [ -e git-2.45.2.tar.gz ]; then
    wget 'https://www.kernel.org/pub/software/scm/git/git-2.45.2.tar.gz'
fi
tar xf git-2.45.2.tar.gz
tmp_file=/tmp/tar_bench.tar
trap 'rm -f "$tmp_file"' exit
./bench_gnutar.py git-2.45.2 "$tmp_file"
./bench_tarfile.py git-2.45.2 "$tmp_file"
printf '### gnutar'
time ./bench_gnutar.py git-2.45.2 "$tmp_file"
echo
printf '### tarfile'
time ./bench_tarfile.py git-2.45.2 "$tmp_file"
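The `time` measurements in the script above cover each whole process, including Python interpreter startup. As a complementary sketch, the archiving step alone could be timed in-process with `time.perf_counter`. The helpers `make_tree`, `archive_tree`, and `timed_archive` are hypothetical names introduced for illustration, and the synthetic file tree is an assumption standing in for the Git sources.

```python
import os
import tarfile
import tempfile
import time


def make_tree(root, n_files=50, size=1024):
    """Create n_files small files spread over five subdirectories of root."""
    for i in range(n_files):
        sub = os.path.join(root, f"dir{i % 5}")
        os.makedirs(sub, exist_ok=True)
        with open(os.path.join(sub, f"file{i}.txt"), "wb") as f:
            f.write(b"x" * size)


def archive_tree(src_dir, out_file):
    """Write every file found by os.walk under src_dir to an uncompressed tar.

    Returns the number of members added. out_file should live outside src_dir,
    otherwise the archive would be added to itself.
    """
    count = 0
    with tarfile.open(out_file, "w") as tf:
        for path, dirs, files in os.walk(src_dir):
            for fn in files:
                full = os.path.join(path, fn)
                tf.add(full, arcname=os.path.relpath(full, src_dir))
                count += 1
    return count


def timed_archive(src_dir, out_file):
    """Return (member count, elapsed seconds) for one archiving pass."""
    start = time.perf_counter()
    count = archive_tree(src_dir, out_file)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        make_tree(src)
        count, elapsed = timed_archive(src, os.path.join(dst, "bench.tar"))
        print(f"archived {count} files in {elapsed:.4f}s")
```

Pointing `timed_archive` at the extracted `git-2.45.2` directory instead of the synthetic tree would give a figure that excludes interpreter startup.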
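#!/usr/bin/python3
# bench_gnutar.py: archive src_dir with GNU tar, without compression.
import subprocess
import sys

src_dir = sys.argv[1]
out_file = sys.argv[2]

# Pass the arguments as a list so that paths containing spaces or quotes
# do not need shell-style quoting.
cmd = ['tar', 'c', '-C', src_dir, '.']
# Close the output file deterministically once tar has finished.
with open(out_file, 'wb') as out:
    subprocess.check_call(cmd, stdout=out)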
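#!/usr/bin/python3
# bench_tarfile.py: archive the files under src_dir with Python's tarfile module.
import os
import sys
import tarfile

profile = False
if profile:
    import cProfile

src_dir = sys.argv[1]
out_file = sys.argv[2]

def test():
    # Closing the TarFile writes the end-of-archive blocks; closing the
    # underlying file flushes everything to out_file.
    with open(out_file, 'wb') as out, tarfile.TarFile(fileobj=out, mode='w') as tf:
        os.chdir(src_dir)
        # Only the entries os.walk reports as files are added; directories
        # themselves are not written to the archive.
        for path, dirs, files in os.walk('.'):
            for fn in files:
                tf.add(f'{path}/{fn}')

if profile:
    cProfile.run('test()', sort='cumulative')
else:
    test()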
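Since the tarfile script adds only the files found by `os.walk`, while `tar c -C dir .` also records directory entries, it may be worth checking that both archives hold the same set of regular files before comparing timings. The following is a minimal sketch of such a check; `regular_members` and `same_files` are hypothetical helper names, and names are normalized so that `./a/b` and `a/b` compare equal.

```python
import posixpath
import tarfile


def regular_members(archive_path):
    """Return the normalized names of the regular-file members of a tar archive."""
    with tarfile.open(archive_path, "r") as tf:
        # Directories, symlinks, and other special members are skipped.
        return {posixpath.normpath(m.name) for m in tf if m.isfile()}


def same_files(archive_a, archive_b):
    """True if both archives contain the same set of regular-file names."""
    return regular_members(archive_a) == regular_members(archive_b)
```

For example, keeping the two benchmark outputs under separate names (the scripts above overwrite one temporary file) and calling `same_files` on them would show whether the two tools archived the same files. File contents and metadata are not compared.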