@dbarjs
Created May 18, 2023 23:28
🚀 A blazingly fast shell one-liner 🚀

This shell script displays all blob objects in the repository, sorted from smallest to largest.

On my sample repo, it ran about 100 times faster than the other scripts in the Stack Overflow thread referenced below. On my trusty Athlon II X4 system, it handles the Linux kernel repository, with its 5.6 million objects, in just over a minute.

The Base Script

git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  sed -n 's/^blob //p' |
  sort --numeric-sort --key=2 |
  cut -c 1-12,41- |
  $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest

When you run the code above, you get human-readable output like this:

...
0d99bb931299  530KiB path/to/some-image.jpg
2ba44098e28f   12MiB path/to/hires-image.png
bd1741ddce0d   63MiB path/to/some-video-1080p.mp4
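
If you only care about the large files, one possible extension is to drop small blobs before sorting. The sketch below is not part of the base script: `filter_min_size` is a hypothetical helper name, and it assumes its input is in the `hash size path` form that the `sed` stage emits.

```shell
# Hypothetical helper (not part of the base script): reads "hash size path"
# lines on stdin and keeps only those whose size in bytes (field 2) is at
# least the threshold given as the first argument.
filter_min_size() {
  # awk compares field 2 numerically and prints matching lines unchanged,
  # so paths containing spaces pass through intact.
  awk -v min="$1" '$2 >= min'
}

# Sample data standing in for the output of the sed stage.
printf '%s\n' \
  'aaaaaaaaaaaa 100 small.txt' \
  'bbbbbbbbbbbb 2000000 big file.bin' |
  filter_min_size 1048576
```

To use it in the real pipeline, you could place `filter_min_size 1048576 |` between the `sed` and `sort` stages to list only blobs of 1 MiB or more.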

References: https://stackoverflow.com/questions/10622179/how-to-find-identify-large-commits-in-git-history
