Shows you the largest objects in your repo's pack file.
#!/bin/bash
# git-find-large-files
# Shows you the largest objects in your repo's pack file.
# Written for macOS (OS X).
#
# @see https://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
# @see https://stackoverflow.com/questions/10622179/how-to-find-identify-large-commits-in-git-history/10622293#10622293
# @author Antony Stubbs
# set the internal field separator to line break, so that we can iterate easily over the verify-pack output
IFS=$'\n';
# num_results=10
num_results=25
# Kilobytes:
# size_str="kB"
# size_bytes=1024
# Megabytes:
size_str="MB"
# (1024^2)
size_bytes=1048576
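# list all packed objects including their size, sort by size, take the top $num_results
# (run from the repository root; only objects already stored in a pack file are listed)
# keep only lines that start with an object hash, dropping verify-pack's summary lines
objects=$(git verify-pack -v .git/objects/pack/pack-*.idx | grep -E '^[0-9a-f]+ ' | sort -k3nr | head -n "$num_results")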
echo "Top $num_results largest files in repo history."
echo "All sizes are in ${size_str}, rounded down. The pack column is the compressed size of the object inside the pack file."
output="size,pack,SHA,location,exists"
allObjects=$(git rev-list --all --objects)
for y in $objects
do
    # verify-pack pads the type column with spaces, so split on runs of
    # whitespace (awk) instead of single spaces: sha, type, size, size-in-pack, offset
    # extract the size in bytes and convert it to $size_str
    size=$(($(echo "$y" | awk '{print $3}')/size_bytes))
    # extract the compressed size in bytes and convert it to $size_str
    compressedSize=$(($(echo "$y" | awk '{print $4}')/size_bytes))
    # extract the SHA
    sha=$(echo "$y" | awk '{print $1}')
    # find the object's location in the repository tree (paths containing spaces are kept whole)
    location=$(echo "${allObjects}" | grep "^$sha" | cut -d ' ' -f 2-)
    # does the file currently exist?
    exists=$(test -f "$location" && echo 'Y' || echo 'N')
    output="${output}\n${size},${compressedSize},${sha},${location},${exists}"
done
echo -e "$output" | column -t -s ','
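The script above only sees objects that are already in a pack file and must be run from the repository root. As a rough alternative sketch, not part of the original script, the same question can be asked of every reachable object by piping `git rev-list --objects --all` into `git cat-file --batch-check`. The function name `largest_blobs` and its default of 25 results are assumptions made for illustration; sizes are uncompressed bytes, not pack sizes.

```shell
# Hypothetical helper (illustrative only): print "<sha> <size-in-bytes> <path>"
# for the largest blobs reachable from any ref, largest first.
largest_blobs() {
    local count="${1:-25}"   # number of results; 25 mirrors num_results above

    # rev-list emits "<sha> <path>"; cat-file echoes the path back via %(rest)
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        # keep blobs only and drop the leading type column
        awk '$1 == "blob" { sub(/^[^ ]+ /, ""); print }' |
        # sort numerically on the size column, descending
        sort -k2,2nr |
        head -n "$count"
}
```

Calling `largest_blobs 10` inside a repository would list the ten largest blobs. Unlike the pack-based script, this reports no compressed size and does not check whether the file still exists in the working tree.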