@nhoag
Created November 20, 2015 03:18
Return a list of the largest objects in a Git repo in a human-readable format
#!/bin/bash
# Collect "hash size" pairs for the 20 largest objects in all packfiles, sorted by hash.
HASHES_SIZES=$(git verify-pack -v .git/objects/pack/pack-*.idx \
| sort -k 3 -rn \
| head -20 \
| awk '{print $1,$3}' \
| sort)
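# Join the hashes into one alternation pattern for grep -E.
# A bare | is literal in basic regex; \| is alternation in GNU sed and would leave the trailing | in place.
HASHES=$(echo "$HASHES_SIZES" \
| awk '{printf "%s|", $1}' \
| sed 's/|$//')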
# Look up the path for each hash; anchor at line start so a hash cannot match inside a path.
HASHES_FILES=$(git rev-list --objects --all \
| \grep -E "^($HASHES)" \
| sort)
# Pair sizes with paths line by line, order by size, and print human-readable sizes.
paste <(echo "$HASHES_SIZES") <(echo "$HASHES_FILES") \
| sort -k 2 -rn \
| awk '{
size=$2;
$1="";
$2="";
$3="";
split( "KB MB GB" , v );
s=0;
while( size>=1024 && s<3 ){
size/=1024; s++
}
sub(/^ +/, "");
print int(size) v[s]"\t"$0
}' \
| column -ts $'\t'
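
The unit-scaling step inside the awk block can be tried on its own. Below is a minimal sketch, assuming a helper named `human_size` (a hypothetical name, not part of the script above) that takes a byte count as its only argument. It uses `>=` in the loop so that exactly 1024 bytes prints as 1KB, and it caps the unit index at GB.

```shell
#!/bin/bash
# human_size is a hypothetical helper for illustration only.
# It divides a byte count by 1024 until it fits, then appends the unit.
human_size() {
  awk -v size="$1" 'BEGIN {
    # v[1]="KB", v[2]="MB", v[3]="GB"; v[0] is unset, so plain bytes get no suffix.
    split("KB MB GB", v);
    s = 0;
    while (size >= 1024 && s < 3) {
      size /= 1024; s++
    }
    print int(size) v[s]
  }'
}

human_size 512
human_size 2048
human_size 5242880
```

Because `int()` truncates, a 1.9MB object is reported as 1MB; swapping in `printf "%.1f%s\n", size, v[s]` would keep one decimal place if that precision matters.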
@jp-topps

handy little script. thanks for sharing!
