@mgedmin
Last active December 13, 2015 16:59
Instructions for converting Zope SVN repositories to Git

Converting Zope SVN repositories to Git

This process has one critical flaw and you probably don't want to use it. git-svn is simpler.

You need:

  • svn-all-fast-export (from the Debian/Ubuntu package of the same name; upstream homepage is http://gitorious.org/svn2git, not related to a Ruby tool of the same name)

    To avoid segfaults caused by authz filtering, build your own svn-all-fast-export from https://github.com/mgedmin/svn2git. Four revisions on svn.zope.org are not available to the general public (r129027, r129030, r129031, r129032); curiously, they are visible via the ViewCVS web interfaces, where the filtering is apparently not applied.

  • a copy of the Subversion repository

    Good thing I have one set up, using svnsync:

    svnadmin create /stuff/zope-mirror
    svnadmin setuuid /stuff/zope-mirror 62d5b8a3-27da-0310-9561-8e5933582275
    vi /stuff/zope-mirror/hooks/pre-revprop-change
      # the hook must be executable and exit 0, or svnsync cannot set revision properties
    svnsync init file:///stuff/zope-mirror svn://svn.zope.org/repos/main/
    svnsync sync file:///stuff/zope-mirror
    # repeat last command periodically
    

    It needs about 3.3 gigs of disk space.

  • a copy of authors.txt that maps svn usernames to real names and emails (ask Tres or Jim; Marius has a copy too but he isn't going to share it without explicit permission of the Zope Foundation)

  • an empty repository on Github (ask Tres or Stephan or Marius or Jim to create one at https://github.com/zopefoundation)

    NB: after creating the repository make sure you go to Settings -> Teams, and add zopefoundation/developers and zopefoundation/administrators. And set up e-mail.

https://github.com/zopefoundation/zope.githubsupport can do all that for you!
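
The pre-revprop-change hook edited in the svnsync setup above only has to allow revision property changes, because svnsync copies svn:author, svn:date and svn:log onto the mirror. A minimal sketch of installing such a hook follows. The MIRROR variable and its scratch-directory default are assumptions made for illustration (the setup above uses /stuff/zope-mirror).

```shell
#!/bin/sh
# Sketch: install a permissive pre-revprop-change hook into an svnsync mirror.
# MIRROR is an assumption; it defaults to a scratch directory so the snippet
# can be tried without touching a real mirror.
MIRROR="${MIRROR:-$(mktemp -d)}"
mkdir -p "$MIRROR/hooks"

cat > "$MIRROR/hooks/pre-revprop-change" <<'EOF'
#!/bin/sh
# Allow every revision property change. Subversion refuses revprop
# changes unless this hook exists and exits 0.
exit 0
EOF
chmod +x "$MIRROR/hooks/pre-revprop-change"
```

Run it as `MIRROR=/stuff/zope-mirror sh install-hook.sh` (the script name is hypothetical) after svnadmin create and before svnsync init.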

The conversion process goes like this:

  • write a rules.txt like this one I used for zope.dottedname:

    create repository zope.dottedname
    end repository
    
    # feel free to create multiple repositories in one go
    
    # order of matches matters in this file
    # trailing slashes in match rules are very important
    
    match /(zope\.dottedname)/trunk/
      repository \1
      branch master
    end match
    
    match /(zope\.dottedname)/branches/([^/]+)/
      repository \1
      branch \2
    end match
    
    match /(zope\.dottedname)/tags/([^/]+)/
      repository \1
      branch refs/tags/\2
    end match
    
    match /
      # ignore all other projects
    end match
    
  • run svn-all-fast-export --identity-map=authors.txt --rules=rules.txt --stats /path/to/your/zope-svn-mirror

    You can also pass --svn-branches for a slightly more accurate conversion (branch merge commits do not go away, even when the diff is empty), if I understand it correctly.

    And if you pass --add-metadata-notes, you'll get to see svn path and revno attached to a note on each commit. These are shown by git log.

    These notes are easy to lose (git push --all/--tags doesn't push them; git clone doesn't fetch them). Read more about them at http://git-scm.com/2010/08/25/notes.html

    The notes are shown on Github like this: https://github.com/zopefoundation/zope.traversing/commit/c10f103#gitnotes

  • wait a bit

    The first time I ran it, it took ~18 wall clock minutes (~4 CPU minutes) and ended in a segfault.

    The second run took 12 wall clock minutes (hot disk cache, I suppose) and also ended in a segfault.

    Then I discovered that if I don't remove the git repository, svn-all-fast-export will resume the process (a few thousand revisions before it crashed, or maybe that just happened to be the last successfully converted commit before the crash), which is considerably faster than starting from scratch.

    I tried to add some min-revision/max-revision based rules to skip the broken commits in my svn mirror, but that didn't fix the segfaults. Luckily, conversion succeeds if I just run the tool twice (without removing intermediate results).

  • inspect ./zope.dottedname for sanity

    I recommend tig as a very nice console-mode interactive git history viewer. Try tig --all. Or, if you prefer a GUI, try gitk --all.

    As an example of things to check: a 3.4.1 tag was deleted in http://zope3.pov.lt/trac/changeset/80495 but shouldn't have been, according to http://zope3.pov.lt/trac/changeset/80499, so I re-created it from refs/backups/r80495/tags/3.4.1, which the conversion tool left behind:

    git tag 3.4.1 refs/backups/r80495/tags/3.4.1
    

    Sometimes the conversion tool produces strands of unrelated history. tig --all interleaves them, which makes this hard to notice; gitk --all shows them separately.

    You can identify all the root commits with

    git log --all --oneline --decorate --max-parents=0
    

    then see which branches began with these with

    git branch --contains $commit_id
    

    and then see what the parent revision of each of these ought to be by looking at the commit note of $commit_id, getting svn path and revno, then looking at http://zope3.pov.lt/trac/log/{PATH}?rev={REVNO}

    If you identify a missing commit parent, you can fix it by creating a grafts file (info/grafts; each line contains "$commit_id $parent_id ..."). I don't think the grafts file survives a git push, so make the connections permanent by running git filter-branch. As far as I can tell, with no arguments it rewrites only the current branch; pass -- --all to rewrite every ref.

    A good way to check if your authors.txt was complete and correct is to run 'git shortlog --all -s' on the result.

  • if you want to dig deeper, add some more rules:

    match /Zope3/trunk/src/zope/dottedname/
      repository zope.dottedname
      branch monolithic-zope3
    end match
    
    match /Zope3/branches/([^/]+)/src/zope/dottedname/
      repository zope.dottedname
      branch monolithic-zope3-\1
    end match
    
    # Zope/DottedName never existed, this applies to other packages
    match /Zope3/trunk/lib/python/Zope/DottedName/
      repository zope.dottedname
      branch ancient-zope3
    end match
    
    match /Zope3/branches/([^/]+)/lib/python/Zope/DottedName/
      repository zope.dottedname
      branch ancient-zope3-\1
    end match
    

    Then nuke the old repository (and log-zope.dottedname*), re-run the conversion tool, inspect the new repo, and don't forget the tag resurrection:

    git tag 3.4.1 refs/backups/r80495/tags/3.4.1
    
  • upload to github:

    git remote add origin git@github.com:zopefoundation/zope.dottedname.git
    git push -u origin --mirror
    
  • remove old code from Subversion:

    svn rm *
    echo 'See https://github.com/zopefoundation/zope.dottedname' > MOVED_TO_GITHUB
    svn add MOVED_TO_GITHUB
    svn ci -m "Moved to github"
    
  • update any buildouts that used to check code out from svn, e.g. wineggbuilder:

    svn co svn+ssh://svn.zope.org/repos/main/zope.wineggbuilder/trunk
    cd zope.wineggbuilder
    vim project-list.cfg
      replace
        zope.dottedname,svn://svn.zope.org/repos/main/
      with
        zope.dottedname,git://github.com/zopefoundation/zope.dottedname.git
    svn ci -m "zope.dottedname moved to github"
    
  • update zopetoolkit too:

    svn co svn+ssh://svn.zope.org/repos/main/zopetoolkit/trunk
    cd zopetoolkit
    vim ztk-sources.cfg
      replace
        zope.dottedname = svn ${buildout:svn-zope-org}/zope.dottedname/trunk
      with
        zope.dottedname = git ${buildout:github}/zope.dottedname
    svn ci -m "zope.dottedname moved to github"
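
The "just run the tool twice" workaround from the conversion step can be wrapped in a small retry loop. This is a sketch only: retry_until_ok is a hypothetical helper name, not part of svn2git, and it relies on the observation above that svn-all-fast-export resumes when the intermediate git repository is left in place.

```shell
#!/bin/sh
# Sketch: re-run a command until it exits 0, up to MAX_TRIES attempts.
# retry_until_ok is a hypothetical helper, not something svn2git ships.
retry_until_ok() {
    max_tries=$1
    shift
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            echo "succeeded on attempt $try"
            return 0
        else
            status=$?
        fi
        echo "attempt $try exited with status $status" >&2
        try=$((try + 1))
    done
    return 1
}

# Usage (not executed here; do not delete the half-converted repo between runs):
#   retry_until_ok 3 svn-all-fast-export --identity-map=authors.txt \
#       --rules=rules.txt --stats /path/to/your/zope-svn-mirror
```

A segfault shows up as a non-zero exit status in the shell, so the loop treats it like any other failure.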
    

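The root-commit inspection from the sanity-check step (list the parentless commits, then ask which branches contain each one) can be combined into one helper. A minimal sketch, assuming a reasonably recent git that supports -C; list_roots is a hypothetical name.

```shell
#!/bin/sh
# Sketch: for every parentless commit in REPO, print the commit id
# followed by the branches that contain it.
# list_roots is a hypothetical helper name.
list_roots() {
    git -C "$1" log --all --format=%H --max-parents=0 |
    while read -r commit_id; do
        echo "root $commit_id"
        git -C "$1" branch --contains "$commit_id"
    done
}

# Usage (not executed here):
#   list_roots ./zope.dottedname
```

More than one "root" line for a single package is the hint that some strand of history lost its parent and may need a graft.
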
Advantages of svn-all-fast-export

  • it's fast (<5 CPU minutes for 129128 svn revisions; I wish I had an SSD on the server that hosts my svn mirror -- speaking of which, I do have an SSD on my laptop, where the conversion takes <4 wall clock minutes!)
  • it can simultaneously convert multiple packages (add more 'create repository/end repository' statements to rules.txt, and extend the match regexps to catch the packages you're interested in)
  • it's very flexible and can handle gnarly repository history, if you write rules for it
  • it was written for and used by the KDE project to convert and explode their humongous svn repository into a multitude of git projects, so it's been stress-tested rather well
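
For the multi-package case, the rules file can be generated rather than written by hand. The sketch below prints rules in the same layout as the zope.dottedname example earlier; make_rules is a hypothetical helper, and it assumes every package follows the standard trunk/branches/tags layout at the top level of the repository.

```shell
#!/bin/sh
# Sketch: print svn-all-fast-export rules for several top-level packages.
# make_rules is a hypothetical helper; the rule layout is copied from the
# zope.dottedname example in this document.
make_rules() {
    # One "create repository" block per package.
    for pkg in "$@"; do
        printf 'create repository %s\nend repository\n\n' "$pkg"
    done
    # trunk, branches and tags rules per package, with dots escaped
    # so they match literally in the regexps.
    for pkg in "$@"; do
        re=$(printf '%s' "$pkg" | sed 's/\./\\./g')
        cat <<EOF
match /($re)/trunk/
  repository \\1
  branch master
end match

match /($re)/branches/([^/]+)/
  repository \\1
  branch \\2
end match

match /($re)/tags/([^/]+)/
  repository \\1
  branch refs/tags/\\2
end match

EOF
    done
    # The catch-all goes last because the order of matches matters.
    printf 'match /\n  # ignore all other projects\nend match\n'
}

# Usage (not executed here):
#   make_rules zope.dottedname zope.traversing > rules.txt
```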

Disadvantages of svn-all-fast-export

  • it requires a local copy of the entire subversion repository
  • it doesn't produce error messages if something's wrong; instead, it segfaults
  • it doesn't notice commits that copy or move parts of the tree from outside of your rules; this means you may be missing commits, or entire files added in those commits, if those files weren't modified since. (This is the critical flaw mentioned at the top.)
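
One way to get a rough idea of whether a package is affected by the last point is to look for copies whose source lies outside the package in the verbose svn log. This is only a sketch: copies_from_outside is a hypothetical helper, it relies on the "(from PATH:REV)" annotations that svn log -v prints for copied paths, and it assumes paths contain no spaces.

```shell
#!/bin/sh
# Sketch: read `svn log -v` output on stdin and print
#   rREV SOURCE -> DEST
# for every path that was copied from somewhere outside PREFIX.
# copies_from_outside is a hypothetical helper; paths with spaces
# are not handled.
copies_from_outside() {
    awk -v prefix="$1" '
        # Revision header line, e.g. "r80495 | user | date | 1 line"
        /^r[0-9]+ \| / { rev = $1 }
        # Changed-path line with copy information
        /^   [AR] .* \(from .*:[0-9]+\)$/ {
            dest = $2
            src = $4
            sub(/:[0-9]+\)$/, "", src)
            if (index(src, prefix) != 1) print rev, src, "->", dest
        }
    '
}

# Usage (not executed here):
#   svn log -v file:///stuff/zope-mirror/zope.dottedname |
#       copies_from_outside /zope.dottedname/
```

Every line it prints names a source path that needs its own match rule, or at least a manual look.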

mgedmin commented Apr 14, 2015

Alternative method:
git svn clone file://$PWD/zope-mirror/$package $package --stdlayout -A authors.txt
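
The -A file for git svn uses lines of the form "svnuser = Real Name <email>". A quick format check before a long clone can save a wasted run. This is a sketch: check_authors is a hypothetical helper and its pattern is only an approximation of what git svn accepts.

```shell
#!/bin/sh
# Sketch: print (with line numbers) every non-blank line of FILE that
# does not look like "svnuser = Real Name <email>", and return non-zero
# if any such line was found.
# check_authors is a hypothetical helper; the pattern approximates the
# authors-file format rather than reproducing git svn's own parser.
check_authors() {
    bad=$(grep -n -v -E '^[[:space:]]*$|^[^=]+=.+<[^<>]*>[[:space:]]*$' "$1" || true)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Usage (not executed here):
#   check_authors authors.txt && git svn clone ... -A authors.txt
```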
