@codekiln
Last active March 14, 2024 23:40
Get the checksum of a tarfile by path
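import hashlib
import tarfile


def tarhash(tarpath, hash='sha1'):
    """
    Given a path to a tar file, return a checksum of the concatenated
    contents of its regular files. Member names and metadata are ignored.

    tarpath - path string of the tar archive to open (compression is auto-detected)
    hash - name of any hash algorithm supported by hashlib.new()
    """
    total_hash = hashlib.new(hash)
    chunk_size = 100 * 1024  # read 100 KiB at a time to bound memory use
    with open(tarpath, 'rb') as input_file:
        # "r|*" reads the archive as a forward-only stream with transparent compression
        with tarfile.open(mode="r|*", fileobj=input_file) as tar:
            for member in tar:
                if not member.isfile():
                    continue  # skip directories, links, and devices
                f = tar.extractfile(member)
                data = f.read(chunk_size)
                while data:
                    total_hash.update(data)
                    data = f.read(chunk_size)
    return total_hash.hexdigest()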
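
Because only the bytes of regular files are fed to the hash, the digest should not depend on the compression used or on member names. The sketch below tries to illustrate that property with in-memory archives. Note that `content_digest` and `build_archive` are hypothetical helpers introduced for this illustration only, not part of the gist: `content_digest` mirrors the logic of `tarhash` but accepts an open binary file object instead of a path and reads each member in one call rather than in chunks.

```python
import hashlib
import io
import tarfile


def content_digest(fileobj, algorithm="sha1"):
    """Hypothetical stand-in for tarhash that takes an open binary file object."""
    digest = hashlib.new(algorithm)
    # Same stream mode as the gist: forward-only, compression auto-detected.
    with tarfile.open(mode="r|*", fileobj=fileobj) as tar:
        for member in tar:
            if member.isfile():
                digest.update(tar.extractfile(member).read())
    return digest.hexdigest()


def build_archive(files, mode):
    """Build an in-memory tar archive from (name, bytes) pairs; mode is 'w' or 'w:gz'."""
    buffer = io.BytesIO()
    with tarfile.open(mode=mode, fileobj=buffer) as tar:
        for name, payload in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)  # rewind so the archive can be read back
    return buffer


files = [("a.txt", b"hello "), ("b.txt", b"world")]
plain = content_digest(build_archive(files, "w"))
gzipped = content_digest(build_archive(files, "w:gz"))
renamed = content_digest(build_archive([("x.txt", b"hello "), ("y.txt", b"world")], "w"))

# Hash of the concatenated file contents, computed without any archive.
expected = hashlib.sha1(b"hello world").hexdigest()
print(plain == gzipped == renamed == expected)
```

A consequence worth keeping in mind: two archives that differ only in file names, permissions, or timestamps produce the same digest under this scheme, while reordering members changes it.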