This script is run with git filter-branch like this:
$ git filter-branch --tree-filter '/home/roberto/guardian/replace-with-sha.sh /home/roberto/guardian/top-50-biggest-blobs.txt $GIT_COMMIT' -- --all
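The blob list passed as the first argument has to exist before the filter runs. The following is a minimal sketch of one way to produce such a list, not necessarily how the original file was made. The format replace-with-sha.sh expects is not shown here, so the "<sha> <size> <path>" layout and the function name list_biggest_blobs are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original gist): print the N largest
# blobs reachable from any ref, one per line, largest first.
# Assumed output layout: "<sha> <size-in-bytes> <path>".
list_biggest_blobs() {
    count="${1:-50}"
    # rev-list prints "<sha> <path>" for every reachable object;
    # cat-file adds type and size and echoes the path back via %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -k2,2nr |
        head -n "$count"
}
```

Run from inside the repository, for example: list_biggest_blobs 50 > top-50-biggest-blobs.txt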
Running the filter inside a ramdisk on Ubuntu gives a big speed increase, because --tree-filter checks out every commit to disk:
$ mkdir repo-in-ram
$ sudo mount -t tmpfs -o size=2048M tmpfs repo-in-ram
$ cd repo-in-ram
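Once the ramdisk is mounted, the repository still has to be copied into it before filtering. A minimal sketch of that step follows, assuming the ramdisk directory already exists and is mounted. The helper name clone_into_ramdisk and its arguments are hypothetical and not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original gist):
# clone SRC into a directory under RAMDISK and print the clone's path.
# RAMDISK is assumed to be an already-mounted tmpfs such as repo-in-ram;
# nothing here mounts or unmounts anything.
clone_into_ramdisk() {
    src="$1"
    ramdisk="$2"
    dest="$ramdisk/$(basename "$src" .git)"
    git clone --quiet "$src" "$dest" && printf '%s\n' "$dest"
}
```

Run git filter-branch inside the printed directory. Anything left on the tmpfs is lost when it is unmounted or the machine reboots, so push or copy the rewritten repository elsewhere first.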
Since writing this gist I've created The BFG Repo-Cleaner, a faster, simpler alternative to git-filter-branch for cleansing bad data out of Git repository history.
The BFG is 10-720x faster than git-filter-branch, turning an overnight job into one that takes less than ten minutes.