@edc
Created August 28, 2015 06:20
Find big files committed recently in Git
#!/bin/bash
# Taken from http://gregmac.net/git/2013/06/13/locating-large-objects.html
# Path to git; set GIT_CMD in the environment to override the default
GIT_CMD=${GIT_CMD:-/usr/local/bin/git}
RANGE=${1:?"usage: $0 RANGE SIZE_THRESHOLD"} # like '1.month.ago'
SIZE_THRESHOLD=${2:?"usage: $0 RANGE SIZE_THRESHOLD"} # size in bytes
# Iterate over the commits in range (rev-list prints one hash per line)
for commit in $("${GIT_CMD}" rev-list --all --since="${RANGE}"); do
    # Iterate over that commit's raw diff lines; metadata and path(s) are tab-separated
    "${GIT_CMD}" diff-tree -r -c -M -C --no-commit-id "${commit}" |
    while IFS=$'\t' read -r meta path newpath; do
        # The resulting blob is the last hash before the status field
        blob=$(echo "${meta}" | awk '{print $(NF-1)}')
        # For renames and copies, report the destination path
        filename=${newpath:-${path}}
        # Skip if this is a file deletion
        if [ "${blob}" = "0000000000000000000000000000000000000000" ]; then
            continue
        fi
        # Get the blob size
        blob_size=$("${GIT_CMD}" cat-file -s "${blob}")
        # Compare it to SIZE_THRESHOLD
        if [ "${blob_size}" -gt "${SIZE_THRESHOLD}" ]; then
            echo "${commit} ${filename} ${blob_size}"
        fi
    done
done
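
The script depends on the raw output of `git diff-tree`, where each line carries file modes, blob hashes, and a status field, followed by a tab and the path (two paths for renames and copies). As a rough illustration of that format, here is a minimal sketch that pulls the resulting blob hash and the path out of a single raw line. The helper name `parse_raw_line` and the sample hashes are made up for this example; nothing here touches a real repository. Splitting on the tab rather than on spaces keeps paths that contain spaces intact.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# print "<result blob> <path>" for one raw diff-tree line.
parse_raw_line() {
    local meta path newpath blob
    # Metadata and path(s) are separated by tabs; renames and copies carry two paths.
    IFS=$'\t' read -r meta path newpath <<< "$1"
    # The resulting blob is the last hash before the status field.
    blob=$(awk '{print $(NF-1)}' <<< "${meta}")
    # Prefer the destination path when there is one.
    echo "${blob} ${newpath:-${path}}"
}

# Placeholder hashes for illustration only; they are not real objects.
old=1111111111111111111111111111111111111111
new=2222222222222222222222222222222222222222

# A modified file whose path contains spaces
parse_raw_line ":100644 100644 ${old} ${new} M"$'\t'"assets/video with spaces.mp4"
# A rename: source path first, destination path second
parse_raw_line ":100644 100644 ${old} ${new} R100"$'\t'"a.bin"$'\t'"b.bin"
```

Each call prints the second (resulting) hash followed by the path where that blob lives after the commit, which is the pair the size check needs.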