@ammarshah
Created October 16, 2023 15:21
Remove commit(s) from git repository using rebase while keeping the original commit dates

Example Usage of git-rebase with git-filter-branch and git-filter-repo

You can rewrite commits using either git-filter-branch or git-filter-repo, but git-filter-repo is recommended; the git-filter-branch documentation itself points users to it.

1. Check Logs

There are two types of dates in a git commit: AuthorDate and CommitDate.

When you run git log, the date shown for each commit is the AuthorDate, while the commit list of a GitHub repository shows the CommitDate.

It's important to know the difference between the two, and which operations can make them diverge without you noticing.

From GitHub docs:

In Git, the author date is when someone first creates a commit with git commit. The commit date is identical to the author date unless someone changes the commit date by using git commit --amend, a force push, a rebase, or other Git commands.

On your profile page, the author date is used to calculate when a commit was made. Whereas, in a repository, the commit date is used to calculate when a commit was made in the repository.

To see both dates in the log, use the --pretty=fuller flag:

git log --pretty=fuller
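If you prefer a compact, side-by-side view, a custom format string also works. The sketch below builds a throwaway repository (the temporary path, the identity, and the dates are made up for illustration), creates one commit whose two dates differ on purpose, and prints both dates with the %ai and %ci placeholders.

```shell
# Throwaway repository so the example does not touch any real history.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"

# Create a commit whose AuthorDate and CommitDate differ on purpose.
echo hello > file.txt
git add file.txt
GIT_AUTHOR_DATE="2023-01-01 10:00:00 +0000" \
GIT_COMMITTER_DATE="2023-06-01 12:00:00 +0000" \
  git commit -q -m "Commit message 1"

# %ai = AuthorDate, %ci = CommitDate, both in an ISO 8601-like format.
dates=$(git log -1 --format='author=%ai committer=%ci')
echo "$dates"
```

In a real repository, only the final git log line is needed.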

2. Rewrite Commit History

If you want to remove the very first (oldest) commit, use the --root flag:

git rebase -i --root

If you want to remove a commit from the middle of the history:

git rebase -i <SHA_of_commit_made_before_the_one_you_want_to_remove>

If you want the reference commit itself to appear in the rebase todo list, append ^ to its SHA:

git rebase -i <SHA_of_commit>^

If you want to remove the last (latest) commit, use git reset instead:

git reset --hard HEAD^ # removes the latest commit and discards any uncommitted changes

In the rebase commands above, the -i flag opens the rebase todo list in your default text editor. It will look something like this:

pick abc123 Commit message 1
pick def456 Commit message 2
pick xyz789 Commit message 3

Delete the line for the commit you want to remove (or change its pick to drop). For example:

pick abc123 Commit message 1
pick xyz789 Commit message 3

Then save and close the editor.
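The same removal can also be done without opening an editor by using git rebase --onto. The following is only a sketch in a throwaway repository; the file names, identity, and dates are invented for the demo. It drops the middle one of three commits and then shows that the replayed commit keeps its AuthorDate but gets a new CommitDate.

```shell
# Throwaway repository with three commits that touch separate files,
# so dropping one of them cannot cause a conflict.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"

for n in 1 2 3; do
  echo "content $n" > "file$n.txt"
  git add "file$n.txt"
  GIT_AUTHOR_DATE="2023-01-0$n 10:00:00 +0000" \
  GIT_COMMITTER_DATE="2023-01-0$n 10:00:00 +0000" \
    git commit -q -m "Commit message $n"
done

# Replay everything after the unwanted commit onto its parent.
# This drops "Commit message 2" and rewrites "Commit message 3".
unwanted=$(git rev-parse HEAD~1)
git rebase -q --onto "$unwanted^" "$unwanted"

commit_count=$(git rev-list --count HEAD)
top_subject=$(git log -1 --format=%s)
author_date=$(git log -1 --format=%ai)
committer_date=$(git log -1 --format=%ci)
git log --format='%s | author=%ai | committer=%ci'
```

The last line prints the two remaining commits; the CommitDate of the replayed commit is the time the rebase ran, which is what the next step corrects.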

3. Fix Commit Dates (only applies if you used rebase)

When you use git rebase, the commit history is rewritten, and one side effect is that the CommitDate of every rewritten commit changes.

Let's say you removed the third commit from the top of the history.

The CommitDate for the two commits made after the one you removed is rewritten with the current date but the AuthorDate is still intact.
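Depending on your Git version, you may be able to avoid the problem during the rebase itself: git rebase accepts a --committer-date-is-author-date flag that reuses each commit's AuthorDate as its CommitDate. The sketch below assumes a reasonably recent Git and uses a throwaway repository with invented file names and dates; it is an alternative to, not a replacement for, the filter commands in this section.

```shell
# Throwaway repository with three commits on separate files.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"

for n in 1 2 3; do
  echo "content $n" > "file$n.txt"
  git add "file$n.txt"
  GIT_AUTHOR_DATE="2023-01-0$n 10:00:00 +0000" \
  GIT_COMMITTER_DATE="2023-01-0$n 10:00:00 +0000" \
    git commit -q -m "Commit message $n"
done

# Drop the middle commit, keeping CommitDate equal to AuthorDate
# for every commit that gets replayed.
unwanted=$(git rev-parse HEAD~1)
git rebase -q --committer-date-is-author-date --onto "$unwanted^" "$unwanted"

# %at and %ct are the two dates as Unix timestamps.
author_ts=$(git log -1 --format=%at)
committer_ts=$(git log -1 --format=%ct)
echo "author=$author_ts committer=$committer_ts"
```

The flag can be combined with -i as well, so the interactive workflow from step 2 stays the same.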

With the following commands, you can reset the CommitDate to the original date, i.e. the AuthorDate.

If you want to rewrite the CommitDates of all commits:

# Using filter-repo
git filter-repo --commit-callback '
  commit.committer_date = commit.author_date
'

# Using filter-branch
git filter-branch --env-filter '
  GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
'

If you want to rewrite the CommitDate of a specific commit:

# Using filter-repo
git filter-repo --commit-callback '
  if commit.original_id == b"<SHA_of_commit>":
    commit.committer_date = commit.author_date
'

# Using filter-branch
git filter-branch --env-filter '
  if test "$GIT_COMMIT" = "<SHA_of_commit>"
  then
    GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
  fi
'
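One thing to watch for with the single-commit variants: both commit.original_id (filter-repo) and $GIT_COMMIT (filter-branch) hold the full commit hash, so an abbreviated SHA copied from git log --oneline will never match. A minimal sketch for expanding a short SHA first, run here in a throwaway repository with made-up content:

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "Commit message 1"

# Abbreviated SHA, as shown by git log --oneline.
short=$(git rev-parse --short HEAD)

# Expand it to the full hash and verify that it names a commit.
full=$(git rev-parse --verify "$short^{commit}")
echo "$short -> $full"
```

Paste the value of full in place of <SHA_of_commit> in the commands above.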

If you want to rewrite the CommitDates of a range of commits:

# Using filter-repo
git filter-repo \
  --refs <SHA_of_commit_made_before_the_one_you_want_to_rewrite>..master \
  --commit-callback '
    commit.committer_date = commit.author_date
  '

# or three commits from the top
git filter-repo \
  --refs master~3..master \
  --commit-callback '
    commit.committer_date = commit.author_date
  '

# Using filter-branch
git filter-branch --env-filter '
  GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
' <SHA_of_commit_made_before_the_one_you_want_to_rewrite>..master

# or three commits from the top
git filter-branch --env-filter '
  GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
' master~3..master
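To check which commits still need fixing (or to confirm that a rewrite worked), you can list every commit whose two dates differ. This is a sketch using plain git log and awk in a throwaway repository; the commits and dates are invented so that exactly one of them is mismatched.

```shell
# Throwaway repository: the first commit has matching dates,
# the second has a later CommitDate, as it would after a rebase.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"

echo one > one.txt
git add one.txt
GIT_AUTHOR_DATE="2023-01-01 10:00:00 +0000" \
GIT_COMMITTER_DATE="2023-01-01 10:00:00 +0000" \
  git commit -q -m "Commit message 1"

echo two > two.txt
git add two.txt
GIT_AUTHOR_DATE="2023-01-02 10:00:00 +0000" \
GIT_COMMITTER_DATE="2023-06-01 12:00:00 +0000" \
  git commit -q -m "Commit message 2"

# %at and %ct are AuthorDate and CommitDate as Unix timestamps.
# Keep only lines where they differ, then strip the two timestamps.
mismatched=$(git log --format='%at %ct %s' | awk '$1 != $2' | cut -d' ' -f3-)
echo "$mismatched"
```

An empty result means every CommitDate already equals its AuthorDate.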

Using --force Flag (not recommended)

Warning

Before you use the --force flag, I recommend that you read this.

filter-repo

If you get a warning like this:

Aborting: Refusing to destructively overwrite repo history since
this does not look like a fresh clone.
  (expected freshly packed repo)
Please operate on a fresh clone instead.  If you want to proceed
anyway, use --force.

Add the --force flag at the end:

git filter-repo --commit-callback '
  commit.committer_date = commit.author_date
' --force

filter-branch

If you get a warning like this:

Cannot create a new backup.
A previous backup already exists in refs/original/
Force overwriting the backup with -f

Add the --force flag before the filters:

git filter-branch --force \
  --env-filter '
    GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
  '
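Before overwriting that backup, it may be worth looking at what it contains: filter-branch stores the pre-rewrite branch tips under refs/original/. The sketch below runs the date fix in a throwaway repository (content and dates are made up) and then inspects the backup ref. It assumes git filter-branch is available in your Git installation; FILTER_BRANCH_SQUELCH_WARNING=1 only suppresses its startup warning and delay.

```shell
# Throwaway repository with one commit whose dates differ.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
GIT_AUTHOR_DATE="2023-01-01 10:00:00 +0000" \
GIT_COMMITTER_DATE="2023-06-01 12:00:00 +0000" \
  git commit -q -m "Commit message 1"
before=$(git rev-parse HEAD)

# First rewrite: this creates the backup under refs/original/.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --env-filter '
  GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
' >/dev/null 2>&1

# The backup ref still points at the commit from before the rewrite.
backup_ref=$(git for-each-ref --format='%(refname)' refs/original/)
backup_tip=$(git rev-parse "$backup_ref")
after=$(git rev-parse HEAD)
author_ts=$(git log -1 --format=%at)
committer_ts=$(git log -1 --format=%ct)
echo "backup: $backup_ref -> $backup_tip"

# To undo the rewrite:    git reset --hard "$backup_ref"
# To discard the backup:  git update-ref -d "$backup_ref"
```

Deleting the backup ref yourself once you are happy with the result also makes a later filter-branch run work without --force.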

4. Push to Remote

Finally, you may need to update your remote with the rewritten history. Force-pushing over the existing remote is not recommended; instead, push your local repository with the rewritten history to a new remote repository.

From git-filter-repo docs:

  1. Push your new repository to its new home (note that refs/remotes/origin/* will have been moved to refs/heads/* as the first part of filter-repo, so you can just deal with normal branches instead of remote tracking branches). While you can force push this to the same URL you cloned from, there are good reasons to consider pushing to a different location instead:

    • People who cloned from the original repo will have old history. When they fetch the new history you force pushed up, unless they do a git reset --hard @{u} on their branches or rebase their local work, git will think they have hundreds or thousands of commits with very similar commit messages as what exist upstream (but which include files you wanted excised from history), and allow the user to merge the two histories, resulting in what looks like two copies of each commit. If they then push this history back up, then everyone now has history with two copies of each commit and the bad files have returned. You’re more likely to succeed in forcing people to get rid of the old history if they have to clone a new URL.

    • Rewriting history will rewrite tags; those who have already downloaded tags will not get the updated tags by default (see the "On Re-tagging" section of git-tag(1)). Every user trying to use an existing clone will have to forcibly delete all tags and re-fetch them; it may be easier for them to just re-clone, which they are more likely to do with a new clone URL.

    • Rewriting history may delete some refs (e.g. branches that only had files that you wanted excised from history); unless you run git push with the --mirror or --prune options, those refs will continue to exist on the server. If folks then merge these branches into others, then people have started mixing old and new history. If users had already cloned these branches, removing them from the server isn’t enough; you need all users to delete any local branches based on these refs and run fetch with the --prune option as well. Simply re-cloning from a new URL is easier.

    • The server may not allow you to force push over some refs. For example, code review systems may have special ref namespaces (e.g. refs/changes/, refs/pull/, refs/merge-requests/) that they have locked down.

But if you really need to push your rewritten history to the existing remote repository, you need to force push your changes:

git push -f
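For the recommended route, pushing to a new home, the commands could look like the sketch below. A local bare repository stands in for the new remote URL, and the remote name new-origin, the tag, and the paths are all placeholders chosen for the demo.

```shell
# A local bare repository stands in for the new remote location.
work=$(mktemp -d)
new_home="$work/new-home.git"
git init -q --bare "$new_home"

# Throwaway local repository with one commit and one tag.
git init -q "$work/repo"
cd "$work/repo" || exit 1
git config user.name "Example User"         # hypothetical identity
git config user.email "user@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "Commit message 1"
git tag v1.0

# Register the new location and push all branches, then all tags.
git remote add new-origin "$new_home"
git push -q new-origin --all
git push -q new-origin --tags

# Confirm that the new home has the same branch tip and the tag.
branch=$(git symbolic-ref --short HEAD)
local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$new_home" rev-parse "refs/heads/$branch")
remote_tag=$(git --git-dir="$new_home" rev-parse "refs/tags/v1.0")
echo "local=$local_tip remote=$remote_tip tag=$remote_tag"
```

With a real hosting service, replace the bare repository path with the URL of the newly created, empty repository.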

Bonus Tips

--dry-run Flag

It's always good practice to verify the intended changes before you actually apply them, and filter-repo provides a way to do exactly that.

You can run any command without making any changes by adding the --dry-run flag.

git filter-repo --dry-run --commit-callback '
  if commit.original_id == b"<SHA_of_commit>":
    commit.committer_date = commit.author_date
'

This prints the following output, naming the original and the filtered fast-export files, which you can compare (for example with diff) to verify your intended changes:

NOTE: Not running fast-import or cleaning up; --dry-run passed.
      Requested filtering can be seen by comparing:
        .git/filter-repo/fast-export.original
        .git/filter-repo/fast-export.filtered

Inspect Commit Objects

Sometimes you may need to see a full commit object with all of its attributes.

With filter-repo you can do that very easily because all of its callback filters take Python code as their body.

# Prints full commit objects
git filter-repo --dry-run --commit-callback '
    from pprint import pprint
    pprint(commit.__dict__)
    print("\n--------------------\n")
'

Keep origin Remote

filter-repo deletes the origin remote from your repository, and there's a good reason for that.

From git-filter-repo docs:

git-filter-repo deletes the "origin" remote to help avoid people accidentally repushing to the same repository, so you’ll need to remind git what origin’s url was.

You can also read the discussion in one of its GitHub issues.

But sometimes it's annoying when you are making a very small change and you know what you are doing. In that case, here's a quick workaround.

Define aliases like this:

alias before-running-filter-repo='git remote rename origin not-origin'
alias after-running-filter-repo='git remote rename not-origin origin'
alias fix-committer-date="git filter-repo --commit-callback '
  if commit.original_id == b\"<SHA_of_commit>\":
    commit.committer_date = commit.author_date
  '
"

Then you can use them like this:

before-running-filter-repo
fix-committer-date
after-running-filter-repo

Or you could combine all of these into one alias:

alias fix-committer-date="
  git remote rename origin not-origin

  git filter-repo --commit-callback '
    if commit.original_id == b\"<SHA_of_commit>\":
      commit.committer_date = commit.author_date
  '

  git remote rename not-origin origin
"
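A shell function is a slightly more flexible variant of the same workaround, because it can wrap any command and passes its exit status through. This is only a sketch: keep_origin is a name made up here, and the demonstration uses a throwaway repository with a placeholder remote URL and a harmless command standing in for filter-repo.

```shell
# Run any command while the origin remote is temporarily renamed,
# then restore the original name and return the command's exit status.
keep_origin() {
  git remote rename origin not-origin || return 1
  "$@"
  status=$?
  git remote rename not-origin origin || return 1
  return "$status"
}

# Demonstration in a throwaway repository with a made-up remote URL.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git remote add origin "https://example.com/hypothetical/repo.git"

# "git remote" stands in for filter-repo and lists the remotes mid-run.
during=$(keep_origin git remote)
after=$(git remote get-url origin)
echo "during: $during"
echo "after:  $after"
```

In real use you would call it as keep_origin git filter-repo --commit-callback '...' with the callback body from the earlier examples.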