@nevsan
Last active June 27, 2018 14:22
git-rebase-submodule
#!/usr/bin/env python
"""Rewrite history for the current branch if a submodule was recently rebased."""
doc = """Rewrite history for the current branch if a submodule was recently rebased.
The Situation
=============
You have made a series of commits on a feature branch, several of which contain updates to a
submodule. As time passes, you realize that you want to rebase this submodule (say, to keep up
with the latest from `develop`). The easy way is to do the rebase, and just commit the result onto
the tip of your branch. However, this will invalidate the submodule revisions in the rest of the
history of your branch. Assuming you only push the rebase, your prior commits will not exist
upstream. If you are happy with this result, you don't need this and can carry on. If this bothers
you, look no further, for we have a solution.
The Solution
============
We assume that you are currently on the branch you would like to fix and that the submodule
in question has just been rebased (so it has the updated/rebased branch currently checked out). We
will then proceed to find all commits starting from the current branch that do not exist in the
submodule's current branch history and generate a mapping between the commits in the top repo that
updated the submodule and the new (rebased) commit in the submodule history that should be used
instead. The end result is a rewritten history of the current branch that contains corrected
references to the corresponding commits in the rebased submodule history.
Limitations
===========
- No attempt was made to get this to work with submodules that don't live at the top of the repo.
- Your mileage may vary if you've done a complex rebase that squashes commits that may have been
used in commits from the top repository.
- The approach uses the authored time of the commit to match commits from the old and new
histories--this works well unless you've actually rewritten these with a commit-filter (unlikely).
- This assumes that the last reference to the submodule before you started your feature branch
exists in the old and rebased history of the submodule--otherwise, you may end up rewriting more
history than you expected.
"""
import argparse
import git
def match_commit(a, b):
    """Heuristic for finding the closest match for a particular commit
    given that it might have been rebased.

    Two commits match when their authored timestamps are identical.
    """
    return a.authored_datetime == b.authored_datetime
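def get_commit_map(repo, submodule):
    """Map each top-level commit that updated the submodule to its old and rebased commits."""
    submodule_repo = submodule.module()
    # All commits that touch the submodule, newest first
    top_commits = repo.iter_commits(paths=submodule.path)
    # All submodule commits reachable from its current (rebased) HEAD, newest first.
    # The iterator is shared across loop iterations on purpose, so each search
    # resumes where the previous one stopped.
    submodule_commits = submodule_repo.iter_commits()
    commit_map = {}
    for commit in top_commits:
        # Get the revision of the submodule at this commit
        sub_commit_id = repo.git.ls_tree(commit, submodule.path).split()[2]
        # Get the commit from the submodule
        sub_commit_from_top = submodule_repo.commit(sub_commit_id)
        # Look for this commit starting from the current branch in the submodule
        sub_commit_from_sub = next(
            (c for c in submodule_commits if match_commit(c, sub_commit_from_top)), None)
        if sub_commit_from_sub is None:
            raise RuntimeError(
                'No rebased commit matches submodule commit {}'.format(sub_commit_from_top))
        if sub_commit_from_sub == sub_commit_from_top:
            # We've found a point in history where top and submodule are consistent
            break
        commit_map[commit] = {'current': sub_commit_from_top, 'new': sub_commit_from_sub}
    return commit_map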
def parse_arguments():
    parser = argparse.ArgumentParser(
        description=doc, formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('submodule', help='The submodule that was rebased.')
    parser.add_argument('-q', '--quiet', action='store_true', help='Suppress some output')
    return parser.parse_args()
if __name__ == '__main__':
    args = parse_arguments()
    # Set up repos
    repo = git.Repo('.')
    submodule = repo.submodule(args.submodule)
    submodule_repo = submodule.module()
    # Get the current=>new commit mapping
    commit_map = get_commit_map(repo, submodule)
    # Print the strategy, oldest commit first
    if not args.quiet:
        for k in reversed(list(commit_map.keys())):
            v = commit_map[k]
            print('{top!s:.{width}}: {top.message:.{m_width}}'
                  '\t- {current!s:.{width}}: {current.message:.{m_width}}'
                  '\t+ {new!s:.{width}}: {new.message:.{m_width}}'.format(
                      top=k, current=v['current'], new=v['new'], width=7, m_width=15))
    # Find the range of commits that we need to modify
    try:
        root_commit = next(reversed(list(commit_map.keys())))
    except StopIteration:
        print('Already up to date. No changes necessary')
        exit(0)
    commit_range = '{!s}^..{!s}'.format(root_commit, repo.head.reference.name)
    re_commits = list(repo.iter_commits(commit_range))
    # Create a full map of the correct submodule commit for each commit in the range.
    # Commits that did not touch the submodule inherit the most recent mapped revision.
    full_commit_map = {}
    current_submodule_commit = None
    for commit in reversed(re_commits):
        if commit in commit_map:
            current_submodule_commit = commit_map[commit]['new']
        full_commit_map[commit] = current_submodule_commit
    if not args.quiet:
        for k, v in full_commit_map.items():
            print('{} => {}'.format(k, v))
    if not input('Proceed? [y/n] ').lower().startswith('y'):
        exit(0)
    # Execute filter-branch. Each --index-info line has the form
    # "<mode> SP <type> SP <sha1> TAB <path>"; a submodule entry uses mode 160000
    # and type "commit".
    bash_map = 'declare -A COMMIT_MAP=( {} )'.format(
        ' '.join('[{}]={}'.format(k, v) for k, v in full_commit_map.items()))
    index_filter = (
        '{}; echo "160000 commit ${{COMMIT_MAP[$GIT_COMMIT]}}\t{}"'
        ' | git update-index --index-info'.format(bash_map, submodule.path))
    out = repo.git.filter_branch('-f', '--index-filter', index_filter, commit_range)
    if not args.quiet:
        print(out)