Migrating From SVN to Git

This gist details the following:

  1. Converting a Subversion (SVN) repository into a Git repository
  2. Purging the resultant Git repository of large files

Migrating from SVN to Git is roughly split into three steps:

  1. Retrieve a list of SVN commit usernames
  2. Match SVN usernames to email addresses
  3. Migrate to Git using the git svn clone command

Step 1: Retrieve A List Of SVN Commit Usernames

An SVN commit records only the committer's username. Git records more detail: at the very least, a Git commit author needs both a name and an email address. Since SVN does not store email addresses, each username has to be matched to one manually.

A list of the usernames recorded by SVN is therefore needed for the match. Run from inside a working copy of the SVN repository, the following command creates a file called authors.txt containing one line per SVN username:

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
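To see what the awk filter does, here is a minimal sketch that wraps the same pipeline in a function and feeds it a made-up sample of `svn log -q` output. The function name `extract_authors` and the sample usernames and dates are assumptions for illustration only.

```sh
#!/bin/sh
# Same awk + sort pipeline as above, reading `svn log -q` output on stdin.
extract_authors() {
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

# Made-up sample in the shape `svn log -q` prints: one "r<N> | user | date"
# line per revision, separated by dashed rules.
extract_authors <<'EOF'
------------------------------------------------------------------------
r3 | jwilkins | 2012-03-01 10:00:00 +0000 (Thu, 01 Mar 2012)
------------------------------------------------------------------------
r2 | asmith | 2012-02-28 09:30:00 +0000 (Tue, 28 Feb 2012)
------------------------------------------------------------------------
r1 | jwilkins | 2012-02-27 16:45:00 +0000 (Mon, 27 Feb 2012)
------------------------------------------------------------------------
EOF
# prints:
# asmith = asmith <asmith>
# jwilkins = jwilkins <jwilkins>
```

Only lines starting with "r" are used, the second "|"-separated field is trimmed of its surrounding spaces, and sort -u collapses repeated usernames into a single line each.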

Step 2: Match SVN Usernames To Email Addresses

Each line of authors.txt is in the following format:

jwilkins = jwilkins <jwilkins>

Edit each line so that it contains the author's full name, followed by their email address inside the angle brackets:

jwilkins = John Albin Wilkins <>
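Since the file is edited by hand, it can help to check that no line was left in its generated form before starting the clone. The following is a rough sketch, not part of the original procedure: `check_authors` is a hypothetical helper, and its pattern is only a loose assumption about what an email address looks like.

```sh
#!/bin/sh
# Hypothetical helper: print every line of an authors file that is not in
# the "username = Full Name <user@host>" form.
# Returns 0 when all lines look complete, 1 when some do not,
# and 2 when the file cannot be read.
check_authors() {
  [ -r "$1" ] || return 2
  # grep -v prints the lines that do NOT match the expected shape.
  if grep -Ev '^[^=]+ = [^<>]+ <[^<>@ ]+@[^<>@ ]+>$' "$1"; then
    return 1
  fi
  return 0
}

# Usage:
#   check_authors authors.txt && echo "authors.txt looks complete"
```

Any line it prints, such as an untouched `jwilkins = jwilkins <jwilkins>`, still needs a name and an email address.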

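Step 3: Migrate To Git Using The git svn clone Command

Create a folder where the Git clone is to be stored, change into it, and run the following, replacing <svn_repo> with the URL of the SVN repository. The --stdlayout option assumes the repository uses the standard trunk/branches/tags layout: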

git svn clone --stdlayout --authors-file=path/to/authors.txt <svn_repo>

This last step may take some time on a repository with a long history, but it should result in a Git repo.
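The steps above rely on svn, git, awk, and sort being available, and git-svn is often packaged separately from Git itself. A small pre-flight check along these lines can catch a missing tool before a long-running clone is started. `check_tools` is a hypothetical helper name, not something the tools provide.

```sh
#!/bin/sh
# Hypothetical helper: report each named command that is not on PATH.
# Returns 0 if all were found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_tools svn git awk sort || echo "install the missing tools first"
#   git svn --version >/dev/null 2>&1 || echo "git-svn does not appear to be installed"
```

git-svn is checked through `git svn --version` rather than by name, because it is a Git subcommand and may not be on PATH as a standalone program.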

Find And Purge Large Files From Git History

Git hosting services tend to be stricter than SVN about large files (GitHub, for example, rejects files larger than 100 MB). When migrating an SVN repository to Git, one may therefore need to purge such files from the Git history.

Step 1: Determine The Files That Are Large

Go to the newly created Git repo and do the following:

git rev-list --objects --all | sort -k 2 > allfileshas.txt;git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt

This will result in two files:

  1. allfileshas.txt - a list of all SHAs in the Git repo, each followed by its file path
  2. bigobjects.txt - a list of SHAs for the blob objects in the pack, largest first
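The filtering half of that command can be tried on its own. Below is a minimal sketch using invented `git verify-pack -v` lines, where the columns are SHA, type, size, size in the pack file, and offset. `largest_blobs` is a hypothetical name, and the pattern uses `[0-9a-f]` in place of `\w` purely for portability.

```sh
#!/bin/sh
# Keep only non-delta blob lines from `git verify-pack -v` output on stdin
# and sort them by the size column (field 3), largest first.
largest_blobs() {
  grep -E '^[0-9a-f]+ +blob +[0-9]+ [0-9]+ [0-9]+$' | sort -k3,3nr
}

# Invented sample: two plain blobs, one commit, and one deltified blob
# (which carries two extra trailing columns and so is filtered out).
largest_blobs <<'EOF'
aaa1 blob   100 90 12
bbb2 commit 200 150 102
ccc3 blob   5000 4000 252
ddd4 blob   50 40 4252 1 aaa1
EOF
# prints:
# ccc3 blob   5000 4000 252
# aaa1 blob   100 90 12
```

As in the original pattern, deltified objects are skipped because their lines do not end after the third number.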

To combine these two files into a list of file names sorted by size in descending order:

for SHA in `cut -f 1 -d\  < bigobjects.txt`; do echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print$1,$3,$7}' >> bigtosmall.txt; done

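NOTE: The above script may take a very long time on a large repository, because it searches both files once for every object. The biggest objects are processed first, so it is fine to stop it with Ctrl-C after about 2 minutes.

The resulting file, bigtosmall.txt, will contain one line per object with its SHA, size, and file name, sorted from largest to smallest.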
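If the loop is too slow, the same join can be done in one pass over each file. This is an alternative sketch rather than the method given above: `name_big_objects` is a hypothetical helper that loads the SHA-to-path listing into an awk array and then walks the size-sorted listing once.

```sh
#!/bin/sh
# Hypothetical helper: join the two listings in a single awk pass.
#   $1 = file of "SHA path" lines       (allfileshas.txt)
#   $2 = file of verify-pack blob lines (bigobjects.txt)
name_big_objects() {
  awk '
    # First file: remember the path recorded for each SHA.
    FNR == NR { path[$1] = substr($0, length($1) + 2); next }
    # Second file: print SHA, size and path, keeping the existing order.
    ($1 in path) { print $1, $3, path[$1] }
  ' "$1" "$2"
}

# Usage:
#   name_big_objects allfileshas.txt bigobjects.txt > bigtosmall.txt
```

Because the path is taken as everything after the SHA, file names containing spaces are kept whole, which the field-based loop above does not do.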

Step 2: Purge The Files From The Git History

Select the files (or even a whole directory) from bigtosmall.txt that you want purged. Then run the following for each one, substituting MY-BIG-DIRECTORY-OR-FILE with the path of the directory or file that is to be purged:

git filter-branch -f --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all
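Rewriting history does not shrink the repository by itself: filter-branch keeps backup refs under refs/original/, and the reflog still points at the old commits. The sketch below follows the usual cleanup sequence of deleting the backup refs, expiring the reflog, and repacking. `reclaim_space` is a hypothetical function name, and the sketch assumes the backups are no longer needed.

```sh
#!/bin/sh
# Hypothetical helper: run inside the rewritten repository once the
# result has been checked, since it discards the pre-rewrite backups.
reclaim_space() {
  # Delete the backup refs that filter-branch stored under refs/original/.
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done
  # Drop reflog entries that still reference the old commits.
  git reflog expire --expire=now --all
  # Repack and delete objects that are no longer reachable.
  git gc --prune=now --quiet
}

# Usage (from the top of the repository):
#   reclaim_space
```

Comparing the output of `du -sh .git` before and after gives a rough idea of how much space was recovered.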

madhu-onchip commented Aug 30, 2017

Could you explain this in more detail?


saurabhperiwal commented Sep 1, 2017

Has anyone successfully migrated from SVN to Git using the above process?


pnixon commented Sep 8, 2017

Worked for me. One thing it doesn't mention is that you need to install git-svn: sudo apt install git-svn


edcasillas commented Feb 22, 2018

Tried to follow the guide but got stuck on the first step. While trying to create the list of committers, I get this:

awk : The term 'awk' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the
spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:14
+ svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2);  ...
+              ~~~
    + CategoryInfo          : ObjectNotFound: (awk:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

yagnendra commented Feb 25, 2018

Has anyone successfully migrated from SVN to Git using the above process? Please let me know.


tarrynn commented Mar 8, 2018

Worked by doing the first 3 steps. Nice one!


Mexicoder commented Jul 10, 2018

For anyone with issues with cmd not recognizing "awk", go here:
Download the setup you want and install it.
Now you need to update your PATH variable. The directory you need should be "C:\Program Files (x86)\GnuWin32\bin"
Here is the Stack post I followed to do it:


chanhlt190290 commented Jul 18, 2018

Worked for me. Thanks a lot!


MortInfinite commented Feb 26, 2019

When I run the following command on Windows 10:
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt

I receive the error message:
''' is not recognized as an internal or external command, operable program or batch file.

I have both svn and awk in my PATH variable.
