@nylander
Created August 9, 2019 10:46
MrBayes Issue #93
# gh-93: Repository surgery: Remove large files accidentally added in the past.
- Last modified: Fri Aug 09, 2019 12:34
- Sign: JN
## Description
Find large, unnecessary files in the history of the MrBayes GitHub repository and delete them.
This saves approx. 132 MB of space (the repository shrinks from 212 MB to 80 MB).
**Note:** I tested this on my own fork (<https://github.com/nylander/MrBayes.git>).
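
The note does not say how the 212 MB and 80 MB figures were measured. One way to get a comparable number is the `size-pack` field of `git count-objects -v`, which reports the size of all packs in KiB. The following is a minimal sketch under that assumption; `repo_pack_kib` is a hypothetical helper name, and the demo uses a throwaway repository so that it runs without network access.

```shell
#!/usr/bin/env bash
# Sketch: report the packed size of a repository in KiB.
# repo_pack_kib is a hypothetical helper, not part of git.
repo_pack_kib() {
    # "size-pack" in `git count-objects -v` is the disk space used by packs, in KiB
    git -C "$1" count-objects -v | awk '/^size-pack:/ {print $2}'
}

# Demo on a throwaway repository (illustration only)
demo=$(mktemp -d)
git init -q "$demo"
# 200000 bytes of random data compress poorly, so the pack stays large
head -c 200000 /dev/urandom > "$demo/big.bin"
git -C "$demo" add big.bin
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "add big file"
# Pack loose objects so that size-pack reflects them
git -C "$demo" gc --quiet
size_kib=$(repo_pack_kib "$demo")
echo "packed size: ${size_kib} KiB"
```

Running `repo_pack_kib MrBayes.git` on the mirror before and after the cleanup should give figures in the same range as those quoted above, although the exact numbers depend on how the repository is packed.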
## Links and background reading
- [https://rtyley.github.io/bfg-repo-cleaner/](https://rtyley.github.io/bfg-repo-cleaner/)
- [http://www.ducea.com/2012/02/07/howto-completely-remove-a-file-from-git-history/](http://www.ducea.com/2012/02/07/howto-completely-remove-a-file-from-git-history/)
## Procedure
    mkdir clean-mb
    cd clean-mb

#### Get the Java program *BFG Repo-Cleaner*

    # Maven Central is served over HTTPS only
    wget https://repo1.maven.org/maven2/com/madgag/bfg/1.13.0/bfg-1.13.0.jar
    bfg_jar="$PWD/bfg-1.13.0.jar"

#### Find large files (installers and old compressed code directories)
    git clone https://github.com/nylander/MrBayes.git mb-check
    cd mb-check
    # Object IDs of the 200 largest objects in the pack (third column is the size)
    list_of_rev_ids_for_large_files=$(git verify-pack -v .git/objects/pack/*.idx | sort -k 3 -n | tail -200 | cut -d ' ' -f1)
    # Map the IDs to file names in one pass: grep -F treats each line of the
    # variable as a separate fixed-string pattern, so no loop is needed
    git rev-list --objects --all | grep -F "$list_of_rev_ids_for_large_files" | grep -i "mrbayes-" | cut -d ' ' -f2 | sort -u > ../files_to_remove
    cat ../files_to_remove
    ## MrBayes-3.2.0_installer_MACx64.pkg
    ## mrbayes-3.2.0_installer_WINx64.msi
    ## mrbayes-3.2.0_installer_WINx86.msi
    ## mrbayes-3.2.0.tar.gz
    ## MrBayes-3.2.1_installer_MAC.pkg
    ## mrbayes-3.2.1_installer_WINx64.msi
    ## mrbayes-3.2.1_installer_WINx86.msi
    ## mrbayes-3.2.1.tar.gz
    ## MrBayes-3.2.2_MACx64.pkg
    ## mrbayes-3.2.2.tar.gz
    ## MrBayes-3.2.2_WIN32_x64.zip
    ## MrBayes-3.2.3_MACx64.pkg
    ## mrbayes-3.2.3.tar.gz
    ## MrBayes-3.2.3_WIN32_x64.zip
    ## MrBayes-3.2.4_MACx64.pkg
    ## mrbayes-3.2.4.tar.gz
    ## MrBayes-3.2.4_WIN32_x64.zip
    ## MrBayes-3.2.5_MACx64.pkg
    ## mrbayes-3.2.5_src.zip
    ## mrbayes-3.2.5.tar.gz
    ## MrBayes-3.2.5_WIN32_x64.zip
    ## MrBayes-3.2.6_MACx64.zip
    ## mrbayes-3.2.6.tar.gz
    ## MrBayes-3.2.6_WIN32_x64.zip

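
The `git verify-pack` listing identifies large objects by ID but does not show sizes next to file names. As a complementary check, the sketch below lists the largest blobs in the whole history as `<size in bytes> <path>`, using `git cat-file --batch-check`. It is an alternative technique, not the one used in this note; `largest_blobs` is a hypothetical helper name, and the demo repository exists only so that the snippet can be run on its own.

```shell
#!/usr/bin/env bash
# Sketch: list the largest blobs in history as "<size in bytes> <path>".
# largest_blobs is a hypothetical helper, not part of git.
largest_blobs() {
    repo=$1
    count=${2:-10}
    # rev-list prints "<id> <path>"; cat-file echoes the path back via %(rest)
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -k1,1 -n -r |
        head -n "$count"
}

# Demo on a throwaway repository (illustration only)
demo=$(mktemp -d)
git init -q "$demo"
head -c 200000 /dev/urandom > "$demo/big.bin"
echo "hello" > "$demo/small.txt"
git -C "$demo" add big.bin small.txt
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "add files"
# Deleting the file in a later commit does not remove it from history
git -C "$demo" rm -q big.bin
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "remove big file"
top=$(largest_blobs "$demo" 1)
echo "$top"
```

In the demo, `big.bin` is still reported as the largest blob even though it was deleted in the second commit, which is the situation this procedure is meant to repair.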
#### Create a mirror, and remove files
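    cd ..
    rm -rf mb-check
    git clone --mirror https://github.com/nylander/MrBayes.git
    # One BFG run per file name (BFG matches on file name, not on path)
    while IFS= read -r fname ; do
        java -jar "${bfg_jar}" --delete-files "$fname" MrBayes.git
    done < files_to_remove
    cd MrBayes.git
    # Expire the reflogs and repack so that the deleted blobs are physically removed
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
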
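
Before pushing, it may be worth confirming that none of the names in `files_to_remove` is still reachable from any ref. The sketch below is one possible check, not part of the original procedure; `remaining_files` is a hypothetical helper name. It prints every listed name that still occurs in history and prints nothing when all of them are gone. The demo runs against a small repository that has not been cleaned, so it shows what a leftover file looks like.

```shell
#!/usr/bin/env bash
# Sketch: print each name from a list that is still reachable in history.
# remaining_files is a hypothetical helper; empty output means all are gone.
# Note: the list must not contain empty lines, since an empty pattern matches everything.
remaining_files() {
    repo=$1
    list=$2
    # -s drops commit lines, which have no path after the object ID
    git -C "$repo" rev-list --objects --all |
        cut -s -d ' ' -f2- |
        grep -F -f "$list" |
        sort -u
}

# Demo on a throwaway, uncleaned repository (illustration only)
demo=$(mktemp -d)
names=$(mktemp)
git init -q "$demo"
echo "source" > "$demo/keep.txt"
echo "payload" > "$demo/mrbayes-demo.tar.gz"
git -C "$demo" add keep.txt mrbayes-demo.tar.gz
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "add files"
git -C "$demo" rm -q mrbayes-demo.tar.gz
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "remove tarball"
printf '%s\n' "mrbayes-demo.tar.gz" "never-existed.msi" > "$names"
left=$(remaining_files "$demo" "$names")
echo "still in history: ${left:-none}"
```

For the real repository the call would be `remaining_files . ../files_to_remove` from inside `MrBayes.git`. Keep in mind that BFG by default leaves the contents of the latest commit untouched, so a listed file that is still present at `HEAD` would continue to show up here.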
#### Push back
    git remote set-url origin https://github.com/nylander/MrBayes.git
    git push

> "At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data. It's best to delete all old clones, as they'll have dirty history that you don't want to risk pushing back into your newly cleaned repo."