@agarny
Created May 8, 2013 15:30
Retrieve and return the n largest files in a git repository
#!/bin/sh
# Retrieve and return the n largest files in a git repository
# Note: adapted from http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
# Usage
usage()
{
    echo "Usage: $(basename "$0") [-n count]"
}
# Check that we are in the root of a git repository
if [ ! -d .git ]; then
    echo "Error: `basename $0` must be called from the root of a git repository"
    usage
    exit 1
fi
# Retrieve the arguments, if any
count=10
while [ "$1" != "" ]; do
    case $1 in
        -n )
            shift
            if [ $# -eq 0 ]; then
                usage
                exit 1
            fi
            count=$1
            ;;
        -h )
            usage
            exit
            ;;
        * )
            usage
            exit 1
            ;;
    esac
    shift
done
# Set the internal field separator to a line break, so that we can iterate
# easily over the verify-pack output (a literal newline is used because
# $'\n' is not available in every /bin/sh)
IFS='
'
# Retrieve the n largest files
objects=`git verify-pack -v .git/objects/pack/pack-*.idx | grep -v chain | sort -k3nr | head -n $count`
# List the n largest files, including their size, pack, SHA and location
output="Size (KB)|Pack (KB)|SHA|Location"
for object in $objects
do
    # Extract the size in KB (awk splits on runs of spaces, so the padding
    # after the object type does not shift the fields)
    size=$((`echo "$object" | awk '{print $3}'`/1024))
    # Extract the compressed size in KB
    pack=$((`echo "$object" | awk '{print $4}'`/1024))
    # Extract the SHA
    sha=`echo "$object" | awk '{print $1}'`
    # Find the object location in the git repository tree (keep everything
    # after the SHA, so that paths containing spaces are not truncated)
    location=`git rev-list --all --objects | grep "$sha" | cut -d " " -f 2-`
    # lineBreak=`echo -e "\n"`
    output="${output}\n${size}|${pack}|${sha}|${location}"
done
# Expand the "\n" escapes explicitly, since not every echo interprets them
printf '%b\n' "$output" | column -t -s '|'
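
The script above depends on packed objects being present under `.git/objects/pack`. As a rough alternative sketch, not part of the original gist, the same kind of listing can be built from `git rev-list --objects --all` piped into `git cat-file --batch-check`, which also covers loose objects. The function name `largest_blobs` and its `size (KB)|SHA|path` output layout are hypothetical choices made for this illustration, and it assumes a Git version recent enough to support a custom `--batch-check` format with `%(rest)`.

```shell
# Hypothetical helper (not from the original script): print the N largest
# blobs reachable from any ref as "size in KB|SHA|path", largest first.
largest_blobs() {
    count="${1:-10}"
    # rev-list emits "<sha> <path>"; cat-file resolves type and size for each
    # SHA and passes the path through unchanged as %(rest)
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        sort -k3,3nr |
        head -n "$count" |
        awk '{
            size = $3
            sha = $2
            # Strip the three leading fields so that $0 holds the full path,
            # including any spaces it contains
            sub(/^[^ ]+ [^ ]+ [^ ]+ ?/, "")
            printf "%d|%s|%s\n", int(size / 1024), sha, $0
        }'
}
```

Called from inside a repository, `largest_blobs 5 | column -t -s '|'` would give a table similar to the one printed by the script, minus the packed-size column, which `cat-file` is not asked for here.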