@piki
Last active March 4, 2024 19:57
Script to push (mirror) a large repo to GitHub
#!/bin/bash -ex
#
# Push the current repository to GitHub, in small enough chunks that it
# won't exceed the pack-size limit

# Commit to start with, counting from the oldest. If the process fails,
# you can change this variable to restart from where it failed.
START_COMMIT=1000

# Number of commits to push at a time, counting from the oldest. If a
# push fails because the pack file is too big, try using a smaller number.
COMMIT_STEP=1000

# List every commit on the current branch, oldest first, one hash per line
git log --reverse --pretty=%H > commits
# Redirect into wc so that only the count is printed, without the filename
COMMIT_COUNT=$(wc -l < commits)

for i in $(seq $START_COMMIT $COMMIT_STEP $COMMIT_COUNT); do
  echo "====== $i"
  # The i-th oldest commit is the tip of this batch
  COMMIT=$(sed -n "${i}p" commits)
  # Move the temporary tag to the batch tip and push everything up to it
  git tag -d foo || true
  git tag foo "$COMMIT"
  git push -f origin foo
done

# The tag does not exist if the loop never ran, so don't fail on deletion
git tag -d foo || true
# Create the branch on the server, then push all remaining refs and objects
git push origin HEAD
git push --mirror
@richban

richban commented Oct 4, 2022

Thanks for this script. Is my understanding correct that the for loop pushes all the commits of the currently checked-out branch, and that git push --mirror then pushes all the remaining missing git objects?

@piki
Author

piki commented Oct 4, 2022

Mostly right. It pushes all the commits from the current branch, in batches, under the tag foo. Then git push origin HEAD creates the default branch on the server, which should just push the ref, no objects, since all the objects already got pushed. Then git push --mirror pushes all other branches, including any objects that are only in those branches, not the checked-out branch.

I think I've only tested it with main checked out, but it should work OK with another branch or even a detached head. ymmv.

It could fail if you have an old branch checked out and main has > 100MB of objects that are not on the old branch. Less likely, it could fail if any of your other branches have > 100MB total of objects that aren't found somewhere on main.
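
To get a feel for whether that second failure case applies before running the script, something like the following might help. It is only a rough sketch: extra_bytes_outside_head is a hypothetical helper name, and it sums uncompressed object sizes, so it will generally overstate what ends up in the pack that git push --mirror sends.

```shell
# Hypothetical helper (not part of the script above): total uncompressed
# size, in bytes, of every object reachable from some ref but NOT from
# the checked-out commit. These are the objects left for the final
# "git push --mirror" to send.
extra_bytes_outside_head() {
  git rev-list --objects --all --not HEAD \
    | cut -d' ' -f1 \
    | git cat-file --batch-check='%(objectsize)' \
    | awk '{ total += $1 } END { print total + 0 }'
}

# Usage, from inside the repository:
#   extra_bytes_outside_head
```

If the number is large, checking out the branch that holds most of the history before running the script should shrink it.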

@richban

richban commented Oct 4, 2022

Thanks for the clarification!

I have just migrated a 10GB repo from GitLab to GitHub. First I had issues with the 100MB file size limit, which I fixed with git lfs. Then I hit the issue with pushes > 2GB, but with your script it worked like a charm ;)
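
For anyone hitting the same file size limit, a sketch like this might help find which files would need git lfs. large_blobs is a hypothetical helper name, and the default threshold of 104857600 bytes (100 MiB) is an assumption standing in for the 100MB limit mentioned above.

```shell
# Hypothetical helper: print "<size-in-bytes> <path>" for every blob in
# the history of any ref that is larger than the given threshold.
# $1 = threshold in bytes (assumed default: 104857600, i.e. 100 MiB)
large_blobs() {
  limit="${1:-104857600}"
  git rev-list --objects --all \
    | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
    | awk -v limit="$limit" '
        $1 == "blob" && $2 > limit {
          size = $2
          sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the full path
          print size, $0
        }'
}

# Usage, from inside the repository:
#   large_blobs              # blobs over the assumed default threshold
#   large_blobs 50000000     # blobs over roughly 50MB
```

This only lists candidates. Moving them into git lfs is a separate step and rewrites history if the files are already committed.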

@nkitagawa-venn

Thank you for this @piki ! I was able to use this to mirror a large repo with a long commit history.

FYI: I did notice what appears to be a small bug in the script - it appears that lines 8 and 12 are reversed w.r.t. their comments. (I saw this because I did have to tune the commit step size.)
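
On tuning the step size: one possible way to guess a workable value up front is to estimate how much data a batch contains. This is a sketch under assumptions, not something the script does. batch_bytes is a hypothetical helper, and it sums uncompressed object sizes, so it is only an approximation and will usually be larger than the actual pack.

```shell
# Hypothetical helper: approximate uncompressed size, in bytes, of the
# objects that a batch adds.
# $1 = tip of the previous batch (excluded), $2 = tip of this batch
batch_bytes() {
  git rev-list --objects "$1..$2" \
    | cut -d' ' -f1 \
    | git cat-file --batch-check='%(objectsize)' \
    | awk '{ total += $1 } END { print total + 0 }'
}

# Usage, with the "commits" file that the script writes (oldest first):
#   batch_bytes "$(sed -n 1000p commits)" "$(sed -n 2000p commits)"
```

If one range comes out far larger than the others, a smaller step around that part of the history may be enough.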

@piki
Author

piki commented Mar 3, 2024

Good catch, @nkitagawa-venn. Fixed it!
