You can rewrite commits using either git-filter-branch or git-filter-repo, but the latter is recommended. See why.
There are two types of dates in a git commit: AuthorDate and CommitDate.
When you run git log, the date shown with every commit is the AuthorDate, while a GitHub repository page shows the CommitDate.
It's important to know the difference between the two, and which operations can make one differ from the other without you noticing.
From GitHub docs:
In Git, the author date is when someone first creates a commit with git commit. The commit date is identical to the author date unless someone changes the commit date by using git commit --amend, a force push, a rebase, or other Git commands.
On your profile page, the author date is used to calculate when a commit was made. Whereas, in a repository, the commit date is used to calculate when a commit was made in the repository.
To see logs with both dates included, use the --pretty=fuller flag:
git log --pretty=fuller
If you want to remove the very first (oldest) commit, use the --root flag:
git rebase -i --root
If you want to remove a commit from the middle of the history:
git rebase -i <SHA_of_commit_made_before_the_one_you_want_to_remove>
If you want the reference commit itself to appear in the interactive todo list as well, append ^ to the commit SHA:
git rebase -i <SHA_of_commit>^
If you want to remove the last (latest) commit, use git reset instead:
git reset HEAD^1 --hard # removes one commit from the top; --hard also discards uncommitted changes
In the rebase commands above, the -i flag opens an interactive todo file in your default text editor. It will look something like this:
pick abc123 Commit message 1
pick def456 Commit message 2
pick xyz789 Commit message 3
Remove the line corresponding to the commit you want to remove. For example:
pick abc123 Commit message 1
pick xyz789 Commit message 3
Then save and close the editor.
When you use git rebase, the commit history is rewritten, and one side effect is that the CommitDate of every rewritten commit changes.
Let's say you removed the third commit from the top of the history.
The CommitDate of the two commits made after the one you removed is rewritten with the current date, but their AuthorDate is left intact.
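To see which commits a rebase has touched, one option is to compare the two dates for every commit. The following is a minimal Python sketch, not something git provides: `read_git_log` and `find_date_mismatches` are hypothetical helper names, and the sketch assumes the log is produced with the `%H`, `%aI` and `%cI` placeholders (hash, author date, committer date) separated by tabs. It compares the date strings literally, so the same instant written in two different time zones would also be reported.

```python
import subprocess

# Hash, author date and committer date (strict ISO 8601), tab-separated.
LOG_FORMAT = "%H%x09%aI%x09%cI"


def read_git_log(repo_path="."):
    """Return the raw log text for the repository at repo_path (needs git installed)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--format={LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_date_mismatches(log_text):
    """Return (hash, author_date, committer_date) for commits whose dates differ."""
    mismatches = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        commit_hash, author_date, committer_date = line.split("\t")
        if author_date != committer_date:
            mismatches.append((commit_hash, author_date, committer_date))
    return mismatches
```

Inside a repository, `find_date_mismatches(read_git_log())` would list the commits whose CommitDate no longer matches the AuthorDate.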
With the following commands, you can reset the CommitDate to the original date, i.e. the AuthorDate.
If you want to rewrite the CommitDates of all commits:
# Using filter-repo
git filter-repo --commit-callback '
commit.committer_date = commit.author_date
'
# Using filter-branch
git filter-branch --env-filter '
GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
'
If you want to rewrite the CommitDate of a specific commit:
# Using filter-repo
git filter-repo --commit-callback '
if commit.original_id == b"<SHA_of_commit>":
    commit.committer_date = commit.author_date
'
# Using filter-branch
git filter-branch --env-filter '
if test "$GIT_COMMIT" = "<SHA_of_commit>"
then
GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
fi
'
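One detail of the filter-repo callback above: the comparison with `==` is an exact match on a bytes value, and as far as I understand `commit.original_id` holds the full hex hash, so an abbreviated SHA would not match. The sketch below illustrates a prefix match instead. It runs without filter-repo by using a stand-in object; `fix_committer_date` is a hypothetical helper, and the hash and dates are made-up illustration values.

```python
from types import SimpleNamespace


def fix_committer_date(commit, sha_prefix):
    """Copy author_date to committer_date when original_id starts with sha_prefix."""
    if commit.original_id.startswith(sha_prefix):
        commit.committer_date = commit.author_date
        return True
    return False


# Stand-in for the commit object that filter-repo passes to the callback
# (assumption: original_id and the dates are bytes, as in the callbacks above).
commit = SimpleNamespace(
    original_id=b"abc123" + b"0" * 34,
    author_date=b"1600000000 +0000",
    committer_date=b"1700000000 +0000",
)

fix_committer_date(commit, b"abc123")
print(commit.committer_date)
```

In a real callback body the same idea would be the `if commit.original_id.startswith(b"<SHA_prefix>"):` test followed by the indented assignment.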
If you want to rewrite the CommitDates of a range of commits:
# Using filter-repo
git filter-repo \
--refs <SHA_of_commit_made_before_the_one_you_want_to_rewrite>..master \
--commit-callback '
commit.committer_date = commit.author_date
'
# or three commits from the top
git filter-repo \
--refs master~3..master \
--commit-callback '
commit.committer_date = commit.author_date
'
# Using filter-branch
git filter-branch --env-filter '
GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
' <SHA_of_commit_made_before_the_one_you_want_to_rewrite>..master
# or three commits from the top
git filter-branch --env-filter '
GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
' master~3..master
Warning
Before you use the --force flag, I recommend that you read this.
If you get a warning like this:
Aborting: Refusing to destructively overwrite repo history since
this does not look like a fresh clone.
(expected freshly packed repo)
Please operate on a fresh clone instead. If you want to proceed
anyway, use --force.
Add the --force flag at the end:
git filter-repo --commit-callback '
commit.committer_date = commit.author_date
' --force
If you get a warning like this:
Cannot create a new backup.
A previous backup already exists in refs/original/
Force overwriting the backup with -f
Add the --force flag before the filters:
git filter-branch --force \
--env-filter '
GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
'
Finally, you may need to update your remote branch with the rewritten history. Force pushing to the existing remote is not recommended; prefer pushing your local repository with the rewritten history to a new remote repository.
From git-filter-repo docs:
Push your new repository to its new home (note that refs/remotes/origin/* will have been moved to refs/heads/* as the first part of filter-repo, so you can just deal with normal branches instead of remote tracking branches). While you can force push this to the same URL you cloned from, there are good reasons to consider pushing to a different location instead:
People who cloned from the original repo will have old history. When they fetch the new history you force pushed up, unless they do a git reset --hard @{u} on their branches or rebase their local work, git will think they have hundreds or thousands of commits with very similar commit messages as what exist upstream (but which include files you wanted excised from history), and allow the user to merge the two histories, resulting in what looks like two copies of each commit. If they then push this history back up, then everyone now has history with two copies of each commit and the bad files have returned. You’re more likely to succeed in forcing people to get rid of the old history if they have to clone a new URL.
Rewriting history will rewrite tags; those who have already downloaded tags will not get the updated tags by default (see the "On Re-tagging" section of git-tag(1)). Every user trying to use an existing clone will have to forcibly delete all tags and re-fetch them; it may be easier for them to just re-clone, which they are more likely to do with a new clone URL.
Rewriting history may delete some refs (e.g. branches that only had files that you wanted excised from history); unless you run git push with the --mirror or --prune options, those refs will continue to exist on the server. If folks then merge these branches into others, then people have started mixing old and new history. If users had already cloned these branches, removing them from the server isn’t enough; you need all users to delete any local branches based on these refs and run fetch with the --prune option as well. Simply re-cloning from a new URL is easier.
The server may not allow you to force push over some refs. For example, code review systems may have special ref namespaces (e.g. refs/changes/, refs/pull/, refs/merge-requests/) that they have locked down.
But if you really need to push your rewritten history to the existing remote repository then you need to force push your changes.
git push -f
It's always good practice to verify the intended changes before you actually apply them, and filter-repo gives you a way to do exactly that.
You can run any command without making changes at all by using the --dry-run flag.
git filter-repo --dry-run --commit-callback '
if commit.original_id == b"<SHA_of_commit>":
    commit.committer_date = commit.author_date
'
This prints the following output, naming the original and filtered files that you can compare to verify your intended changes:
NOTE: Not running fast-import or cleaning up; --dry-run passed.
Requested filtering can be seen by comparing:
.git/filter-repo/fast-export.original
.git/filter-repo/fast-export.filtered
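Any diff tool can compare these two files. As one possible approach, here is a minimal Python sketch built on the standard `difflib` module; `diff_streams` and `diff_dry_run` are hypothetical helper names, and the only paths assumed are the two reported in the output above.

```python
import difflib
from pathlib import Path


def diff_streams(original_text, filtered_text):
    """Return unified-diff lines between two fast-export streams."""
    return list(difflib.unified_diff(
        original_text.splitlines(),
        filtered_text.splitlines(),
        fromfile="fast-export.original",
        tofile="fast-export.filtered",
        lineterm="",
    ))


def diff_dry_run(repo_path="."):
    """Diff the two files that --dry-run leaves under .git/filter-repo/."""
    base = Path(repo_path) / ".git" / "filter-repo"
    # File contents in the stream may not be valid UTF-8, so replace bad bytes.
    original = (base / "fast-export.original").read_text(encoding="utf-8", errors="replace")
    filtered = (base / "fast-export.filtered").read_text(encoding="utf-8", errors="replace")
    return diff_streams(original, filtered)
```

After a dry run, `print("\n".join(diff_dry_run()))` should show only the lines your callback changed, for example a `committer` line with a different timestamp.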
Sometimes you may need to see a full commit object with all of its attributes.
With filter-repo you can do that very easily, because all of its callback filters take Python code as their body.
# Prints full commit objects
git filter-repo --dry-run --commit-callback '
from pprint import pprint
pprint(commit.__dict__)
print("\n--------------------\n")
'
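The dates printed this way are raw byte strings. As far as I know they follow git's internal "<unix timestamp> <UTC offset>" form, for example `b"1600000000 +0530"`. The sketch below converts such a value into a timezone-aware `datetime` so it is easier to read; `parse_git_date` is a hypothetical helper, not part of filter-repo.

```python
from datetime import datetime, timedelta, timezone


def parse_git_date(raw):
    """Convert b"<unix timestamp> <+HHMM|-HHMM>" into an aware datetime."""
    timestamp, offset = raw.decode().split()
    sign = -1 if offset[0] == "-" else 1
    hours, minutes = int(offset[1:3]), int(offset[3:5])
    tz = timezone(sign * timedelta(hours=hours, minutes=minutes))
    return datetime.fromtimestamp(int(timestamp), tz)


print(parse_git_date(b"1600000000 +0530").isoformat())
# → 2020-09-13T17:56:40+05:30
```

The same function could be pasted into a callback body to print readable dates during a dry run.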
filter-repo deletes the origin remote from your repository, and there's a good reason for that.
From git-filter-repo docs:
git-filter-repo deletes the "origin" remote to help avoid people accidentally repushing to the same repository, so you’ll need to remind git what origin’s url was.
You can also read the discussion in one of its GitHub issues.
But this can be annoying when you are making a very small change and you know what you are doing. In that case, here's a quick workaround.
Define aliases like this:
alias before-running-filter-repo='git remote rename origin not-origin'
alias after-running-filter-repo='git remote rename not-origin origin'
alias fix-committer-date="git filter-repo --commit-callback '
if commit.original_id == b\"<SHA_of_commit>\":
    commit.committer_date = commit.author_date
'
"
Then you can use them like this:
before-running-filter-repo
fix-committer-date
after-running-filter-repo
Or you could combine all of these into one alias:
alias fix-committer-date="
git remote rename origin not-origin
git filter-repo --commit-callback '
if commit.original_id == b\"<SHA_of_commit>\":
    commit.committer_date = commit.author_date
'
git remote rename not-origin origin
"