@rodluger
Last active June 20, 2018 17:40
Removing large files from git history

Cleaning the vplanet repo

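Most of what we need to know is in the BFG Repo-Cleaner documentation. First we download BFG and create the alias below (it uses a relative path, so it only works from the directory containing the jar):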

alias bfg='java -jar bfg-1.13.0.jar'

Note that we need the latest version of the Java Runtime Environment installed.

Let's create a mirror (bare clone) of the bitbucket repo:

git clone --mirror https://bitbucket.org/bitbucket_vpl/vplanet.git

This might take a while! Our repo is pretty big. When that's done, cd into vplanet.git. I found that I had to run

git gc

before doing anything else to force git to repack the repo. Now we can run the commands described in the BFG usage examples. For instance, to remove all files larger than 100 MB from the history, cd out of the repository and run

bfg --strip-blobs-bigger-than 100M vplanet.git
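
To see which blobs are actually above the threshold, plain git is enough. The following is a minimal sketch, not part of BFG: `list_large_blobs` is a made-up helper name, and the byte threshold is passed explicitly because how BFG interprets `100M` (1000-based or 1024-based) is not checked here.

```shell
# List every blob in a repo's history that is larger than a byte threshold.
# Usage: list_large_blobs <repo-path> <min-bytes>
# Output: one line per blob, "<size-in-bytes> <path>", largest first.
# (Hypothetical helper; works on bare/mirror clones too.)
list_large_blobs() {
    repo="$1"
    min_bytes="$2"
    # rev-list prints "<sha> <path>" for every object reachable from any ref;
    # cat-file looks up type and size and echoes the path back via %(rest).
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" '$1 == "blob" && $2 + 0 > min + 0 { sub(/^blob /, ""); print }' |
        sort -n -r
}
```

For example, `list_large_blobs vplanet.git $((100 * 1024 * 1024))` lists the blobs over 100 MiB. Note that a blob that was renamed or copied shows up once, under the first path git reports for it.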

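All this did was rewrite the commits so that they no longer reference the offending files -- the files themselves are still physically in the repo, so nothing was deleted yet. You can check the logs BFG wrote to ensure that nothing bad happened. Once you're happy, run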

cd vplanet.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
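
To confirm that the gc really reclaimed space before pushing, the packed size of the repo can be compared before and after. This is a small sketch; `packed_size_kib` is a hypothetical helper name, and it simply reads the `size-pack` field that `git count-objects -v` reports in KiB.

```shell
# Print the on-disk size of a repo's packfiles, in KiB.
# Usage: packed_size_kib <repo-path>
# (Hypothetical helper; reads the "size-pack" line of git count-objects.)
packed_size_kib() {
    git -C "$1" count-objects -v | awk '$1 == "size-pack:" { print $2 }'
}
```

Running `packed_size_kib vplanet.git` once before BFG and once after the gc should show a clear drop if large blobs were actually removed.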

and then, the super dangerous and final step (from a mirror clone this overwrites every branch and tag on the remote),

git push

rodluger commented Jun 20, 2018

Important: I just realized that by default BFG protects files present in the current commit on master, but not on any of the other branches. So we really should try to get as much as possible onto master before we do this. We can tell BFG to protect files on other branches, but there's going to be so much junk there...
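
To judge how much junk protecting another branch would keep, one can list the large files sitting at that branch's tip. This is a rough sketch with a made-up helper name, `large_files_at_tip`; the `develop` branch in the closing comment is a placeholder, not a branch known to exist in the vplanet repo.

```shell
# List files at the tip of one branch that are larger than a byte threshold.
# Usage: large_files_at_tip <repo-path> <branch> <min-bytes>
# Output: one line per file, "<size-in-bytes> <path>".
# (Hypothetical helper; these are the blobs BFG would keep if the branch is protected.)
large_files_at_tip() {
    # ls-tree -l prints "<mode> <type> <sha> <size>" then a TAB and the path.
    git -C "$1" ls-tree -r -l "$2" |
        awk -F'\t' -v min="$3" '{
            split($1, meta, " ")
            if (meta[4] + 0 > min + 0) print meta[4], $2
        }'
}

# Protecting extra branch tips is then done with BFG's --protect-blobs-from,
# which takes a comma-separated list of refs ("develop" is a placeholder):
# bfg --strip-blobs-bigger-than 100M --protect-blobs-from master,develop vplanet.git
```

For instance, `large_files_at_tip vplanet.git master $((100 * 1024 * 1024))` shows what would survive on master under the default protection.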
