@Boggin
Last active September 28, 2022 14:43
Using git-filter-repo to rewrite history and remove secrets

Removing Secrets In Your Git Repository

If you have a code-base under source control that is internal to your company (for instance, it's hosted on premises), you may have checked in some secrets in your configuration files. We all know we should have had those encrypted or kept outside of the checked-in source in some way, but perhaps it seemed better to accrue some technical debt and worry about it later. Okay, now it's later.

git-filter-repo

The following procedure will allow the rewriting of the repository’s history with all the secrets overwritten.

Use scoop (we're on Windows) to install git-filter-repo.

Create a fresh bare clone as a source:
git clone --bare my-repo.git

Create a new working copy as a target:
git clone my-repo my-repo-new

Change directory into the working copy.

Find relevant commits where you've replaced those secrets:
git log --oneline --since=2020-03-01 --author=ThatGuy | Select-String my-tracking-id

Create patch files for the relevant commits:
git log -1 -p -m <commit-hash> > ..\<commit-hash>.patch
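
Because those commits replaced the secrets, the old values sit on the removed (`-`) lines of each patch. As a rough aid for collecting them, here is a minimal Python sketch; `removed_lines` is a hypothetical helper name, not part of git or git-filter-repo, and it assumes the patch text has already been read into a string.

```python
def removed_lines(patch_text):
    """Return the content of the lines a patch removes, without the leading '-'."""
    found = []
    for line in patch_text.splitlines():
        # '---' introduces the old file name in a diff header, not removed content.
        # Limitation: removed content that itself starts with '--' is skipped too.
        if line.startswith("-") and not line.startswith("---"):
            found.append(line[1:].strip())
    return found
```

You would read the patch file yourself and pass its text in. If the file was written with `>` in Windows PowerShell it may well be UTF-16 rather than UTF-8, so pick the encoding accordingly when opening it. The output still needs a human eye: not every removed line is a secret.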

Open the patch files and cut-and-paste all the secrets into an expressions.txt file, one per line. Each line is matched literally unless it is prefixed with regex:, so write regexes where appropriate, e.g. regex:\bPassword[0-9]*\b

Now we are ready to rewrite history:
git filter-repo --replace-text .\expressions.txt --source <dir_clone_bare> --target <dir_clone_wc>

Check the commits again to ensure all the secrets have been replaced (in this case with ***REMOVED***).

It's likely that you are working on this while the team is still committing, before asking them to take a break while you swap over the repos. That means you will want to update the bare repo before producing the final version. That's why we selected a bare repo and used a source and target repo rather than doing it in place on a freshly cloned working copy. Watch how you do this update, as git fetch --all isn't what you want. A bare clone has no fetch refspec configured, so a plain fetch doesn't update its branch heads. You need to pass an explicit refspec, e.g. git fetch origin "*:*", to make sure you update every branch.

The working copy with the rewritten history can now be pushed up to a new origin and shared.

Missing files?

So, then the inevitable happens. There's a missing file. If you are on a case-insensitive file system like Windows you may have had two versions of a file checked in whose names differ only in case. git-filter-repo will see two files that are the same and select one, not necessarily the one you wanted. This is easy enough to fix with a git revert of the offending commit followed by a git commit with the corrected file(s).
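
To spot such files up front, you can look for paths that collide once case is ignored. This is a minimal Python sketch under our own assumptions: `case_collisions` is a hypothetical helper, and its input is a list of paths such as the lines printed by `git ls-tree -r --name-only HEAD` in the original repo. It only covers whichever commit you listed, not the whole history.

```python
from collections import defaultdict

def case_collisions(paths):
    """Group paths that are identical once letter case is ignored."""
    groups = defaultdict(set)
    for path in paths:
        # casefold() gives a case-insensitive key; the set drops exact duplicates.
        groups[path.casefold()].add(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

Any group it returns is a candidate for the "missing file" problem and is worth checking by hand in the rewritten repo.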

The following are some useful tools for finding out what has gone wrong.

  • You can check how many files are in the before and the after repos:
    git ls-files | wc -l

  • Pickaxe will find commits that add or remove a given string anywhere in your repo's history:
    git log -Sthingtolookfor

  • You can alter the width for the filenames in a log message to better see the whole path:
    git log --stat=200

  • Find out which branches a commit appears in:
    git branch --all --contains <commit-hash>

Missing commits?

Now we found, and we don't know why, that some recent commits hadn't made it across. You can compare the history by creating a file with the recent logs from the before and the after repos and then diffing those with, for example, meld.
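
The commit hashes all change in the rewrite, so the two logs have to be compared on something else, such as the subject lines (for example the output of `git log --format=%s` from each repo). As an alternative to eyeballing a diff, here is a small Python sketch; `missing_subjects` is a hypothetical helper and it assumes the rewrite left the commit subjects untouched.

```python
from collections import Counter

def missing_subjects(before_subjects, after_subjects):
    """Return subjects that occur more often in the old log than in the new one."""
    remaining = Counter(after_subjects)
    missing = []
    for subject in before_subjects:
        if remaining[subject] > 0:
            # Matched against one occurrence in the rewritten log.
            remaining[subject] -= 1
        else:
            missing.append(subject)
    return missing
```

Counting occurrences rather than using a plain set matters here, because subjects like "Merge branch 'develop'" tend to repeat. The result only tells you which subjects to go looking for; you still need the old repo's log to find the matching hashes.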

You then have a couple of options for pulling commits across. You can either create a patch file for the relevant commit(s) and apply that or you can fetch the whole of the old repo's commits and cherry-pick from there.

Using a patch file:
git --git-dir=../<old_repo>/.git format-patch -k -1 --stdout <commit-hash> | git am -3 -k

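Using a remote, fetching then cherry-picking (git fetch --all does not accept a remote name, and the ^ makes the range include the first missing commit):
git remote add my-repo ..\dir-my-repo\.git
git fetch my-repo
git cherry-pick <first-missing-sha>^..<last-missing-sha>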

The patch file method is ideal if you don't want to have all those extra commits in your .git folder. You can just as easily create a patch file for a range of commits, too. If you do have the remote you can always remove it later and purge your local repo.

The cherry-pick method is a little better if you have a slightly more complicated set of commits. For instance, in our project there were a number of merge commits that were empty, as the develop branch was already merged into the feature branch before the feature branch was merged into the develop branch. When you are going through the cherry-pick process it will regularly barf and ask you if you want to commit the empty commit anyway or drop it (git reset, or git cherry-pick --skip in newer versions of Git, is probably what you want). You will also have to look at some merges by hand when git can't handle them automatically, and perhaps check against the original repo history to know what to do. Note that during a cherry-pick the side being merged in ("theirs" in a conflict) is the commit coming from the remote.
