@ryfactor
Created April 27, 2022 03:33
Port CVS repositories to git with reposurgeon

In a rush to change your old CVS repositories to git? Don't be. Take your time and get it right with Eric Raymond's reposurgeon. (But make haste while SourceForge is still up...!)

Some folk believe that reposurgeon is complicated, and that a simple invocation of cvs-fast-export will do. I would argue otherwise. cvs-fast-export is itself one of the tools in the reposurgeon toolset, and while it can be used on its own, it works better as part of the wider suite. Using cvssync, cvs-fast-export and git fast-import by hand leaves you juggling a jumble of command-line steps, whereas reposurgeon (built on those tools) only asks you to set up a few config files and let it rip. Reposurgeon also has options for updating author info, timezones, character encoding, and more. So reposurgeon is more straightforward than driving cvs-fast-export directly, offers more features, and will do a better job overall.

The docs may look intimidating, so here is a brief guide for using reposurgeon to preserve important old open-source CVS/SVN repositories by converting them to git.

Install reposurgeon

You can do this on your favourite Linux distribution; on Windows, a minimal Linux distro under WSL works. The workflow also relies on make, so install that too. The commands below assume a Debian-based system with apt-get.

$ sudo apt-get install reposurgeon
$ sudo apt-get install make
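
Before going further, it can help to confirm that the tools this guide leans on are actually on your PATH. The following is a minimal sketch, not part of reposurgeon: missing_tools is a hypothetical helper name, and the tool list is simply the set of programs mentioned in this guide, on the assumption that your packages install all of them (some distributions ship cvs-fast-export as a separate package).

```shell
# Print the name of every listed command that is not on PATH.
# Returns 0 if all are present, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed tool list: adjust it to whatever your distribution actually ships.
if missing_tools reposurgeon repotool cvs-fast-export cvssync make; then
    echo "all tools found"
else
    echo "install the tools listed above before continuing" >&2
fi
```

Any name it prints is a tool you still need to install.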

Locate your CVS repository

For this example, we are going to port the source code of a Finnish national treasure, "Iter Vehemens ad Necem". IVAN's original sources are located at https://sourceforge.net/p/ivan/code/

Following the link, we land on the usual SourceForge page, with a message stating that SourceForge now supports only read-only access to CVS. It also suggests looking at the names of the modules in the CVS repository at http://ivan.cvs.sourceforge.net/, which gives us the following page:

The IVAN project's CVS data is in read-only mode, so the project may have switched over to another source-code-management system. To check, visit the Project Summary Page for ivan and see if the menubar lists a newer code repository, such as SVN or Git.

The CVS data can be accessed as follows. You can run a per-module CVS checkout via pserver protocol:

cvs -z3 -d:pserver:anonymous@a.cvs.sourceforge.net:/cvsroot/ivan co -P Docs
cvs -z3 -d:pserver:anonymous@a.cvs.sourceforge.net:/cvsroot/ivan co -P igor
cvs -z3 -d:pserver:anonymous@a.cvs.sourceforge.net:/cvsroot/ivan co -P ivan
cvs -z3 -d:pserver:anonymous@a.cvs.sourceforge.net:/cvsroot/ivan co -P testi

You can view a list of files or copy all the CVS repository data via rsync (the 1st command lists the files, the 2nd copies):

rsync -a a.cvs.sourceforge.net::cvsroot/ivan/
rsync -ai a.cvs.sourceforge.net::cvsroot/ivan/ /my/local/dest/dir/

If you are a project admin for ivan, you can request that this page redirect to another repo on your project by submitting a support request.

There it is, the entirety of what the original IVAN devs uploaded. For now, we're interested in the ivan CVS repo. It's worth mentioning at this point that SourceForge has this guide for converting a CVS repo to git: https://sourceforge.net/p/forge/documentation/CVS/

It's nice to compare those steps with what we are going to do here.

Begin by using repotool

The entry point for using reposurgeon is actually repotool, which is a wrapper around the reposurgeon tool. It helps you configure the reposurgeon environment for doing the conversion by initialising a scratch folder with stub files, essentially providing you with a template. Of course, you can write up a .lift file by hand and use reposurgeon directly, but using repotool will save you a lot of time.

First make a scratch folder in which you will carry out the operation:

user@system:~$ mkdir ivan-reposurgeon-scratch
user@system:~$ cd ivan-reposurgeon-scratch
user@system:~/ivan-reposurgeon-scratch$ repotool initialize ivan

You'll be prompted for the source version control system (VCS), so type cvs; for the destination VCS, type git. Run ls and you will find the directory populated with a handful of stub files. You'll open these to set some configuration.

user@system:~/ivan-reposurgeon-scratch$ repotool initialize ivan
repotool: what VCS do you want to convert from? cvs
repotool: what VCS do you want to convert to? git
repotool: generating Makefile, some variables in it need to be set.
repotool: generating a stub options file.
repotool: generating a stub lift file.
repotool: generating a stub map file.
user@system:~/ivan-reposurgeon-scratch$ ls
ivan.lift  ivan.map  ivan.opts  Makefile
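
If you script this step, a quick check that all four stub files landed where expected can save confusion later. This is a small sketch only: check_stubs is a hypothetical helper, and the file names are taken from the ls listing above.

```shell
# Report which of the files repotool is expected to generate are missing
# from the current directory. Returns 0 if all are present, 1 otherwise.
check_stubs() {
    project=$1
    missing=0
    for f in "$project.lift" "$project.map" "$project.opts" Makefile; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: run from inside the scratch folder.
check_stubs ivan || echo "re-run 'repotool initialize ivan'" >&2
```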

Edit the stub files

Open Makefile and you'll find a bunch of helpful instructions. This is where you set the CVS host and repository details, which correspond to the SourceForge CVS repository identified earlier.

For the IVAN project, I edited the following config variables in Makefile. Don't forget to uncomment the REMOTE_URL variable for CVS, which is needed to override the SVN one above it.

CVS_HOST = a.cvs.sourceforge.net
REMOTE_URL = cvs://$(CVS_HOST)/ivan\#$(CVS_MODULE)
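
To see what that REMOTE_URL expands to, here is a rough shell imitation of the Makefile variables. It is a sketch under assumptions: remote_url is a hypothetical helper, and the module value ivan is inferred from the URL that make later prints rather than from a setting shown here. The backslash before # in the Makefile is there only to stop make from treating the rest of the line as a comment.

```shell
# Mimic the Makefile's expansion of REMOTE_URL using shell variables.
cvs_host="a.cvs.sourceforge.net"
project="ivan"      # the SourceForge project, i.e. /cvsroot/ivan
cvs_module="ivan"   # the module to convert (Docs, igor, ivan, or testi)

# Build a URL of the form cvs://HOST/PROJECT#MODULE.
remote_url() {
    printf 'cvs://%s/%s#%s\n' "$1" "$2" "$3"
}

remote_url "$cvs_host" "$project" "$cvs_module"
# prints cvs://a.cvs.sourceforge.net/ivan#ivan
```

Swapping cvs_module for igor or testi would point the same recipe at one of the other modules listed on the SourceForge page.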

Save and exit. Next you can open ivan.lift to add in some additional useful commands.

The IVAN CVS repository is known to have some Latin-1 characters in certain commit messages. We can port these to UTF-8 by adding this command to ivan.lift:

=I transcode latin1

Other encodings are available too. If you want a high-quality conversion, you cannot do this sort of thing without reposurgeon. While many other useful options are enabled by default in Makefile, I also added lint and gitify for good measure. Putting it all together gives us the following ivan.lift:

# Lift commands for ivan

# Check for and report glitches such as timestamp collisions,
# ill-formed committer/author IDs, multiple roots, etc.
lint

# Massage comments into Git-like form (with a topic sentence and a
# spacer line after it if there is following running text). Only
# done when the first line is syntactically recognizable as a whole
# sentence.
gitify

# In all commit comments containing non-ASCII bytes, transcode from Latin-1.
=I transcode latin1
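
To get a feel for what the transcode step does to a single commit message, the same byte-level conversion can be tried by hand with iconv. This is only an illustration of Latin-1 to UTF-8 conversion, not reposurgeon's implementation; latin1_to_utf8 is a hypothetical helper and the sample word is made up rather than taken from the IVAN history.

```shell
# Convert Latin-1 bytes on stdin to UTF-8 on stdout.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# "\344" is the octal escape for byte 0xE4, which is "ä" in Latin-1.
# In UTF-8 the same character becomes the two bytes 0xC3 0xA4.
printf 'p\344iv\344\n' | latin1_to_utf8
```

Plain ASCII text passes through unchanged, which is why only messages with non-ASCII bytes need attention.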

Next, fetch the CVS repository and map the authors. Start by running:

$ make stubmap

This does a lot: it creates a mirror folder (ivan-mirror) of the CVS repository, exports its history into a file (in our case ivan.cvs), and extracts the author names into ivan.map, amongst other things.

user@system:~/ivan-reposurgeon-scratch$ make stubmap
repotool mirror cvs://a.cvs.sourceforge.net/ivan#ivan ivan-mirror
(cd ivan-mirror/ >/dev/null; repotool export) | cat >ivan.cvs
cvs-fast-export: warning - branch point V0-310 -> import-1.1.1 matched by date
cvs-fast-export: no commitids before 2006-09-22T10:08:09Z.
reposurgeon "set progress" "read  <ivan.cvs" 'authors write >ivan.map'
17977 cvs events

The author map, in ivan.map, can now be edited. I added e-mail addresses next to the names but preserved the contributor nicknames. I think this is more faithful to the history, and it pays homage to their humour. If you really want to look up their names, they're in the repository under AUTHORS. I also added the timezone, which I estimated to be Europe/Helsinki. This feature of reposurgeon is extremely useful.

So, combining usernames, e-mail addresses, and location, we can form the patch for the attributions by editing ivan.map as follows:

# Author map for ivan
holybanana = holybanana <holybanana@users.sourceforge.net> Europe/Helsinki
hejosa = hejosa <first.last@tut.fi> Europe/Helsinki
kahvi = kahvi <13334086+TuukkaVirtaperko@users.noreply.github.com> Europe/Helsinki
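
A typo in the author map is easy to make, so a quick sanity check before converting can be worthwhile. The sketch below flags lines that do not look like the entries above. It rests on assumptions: bad_map_lines is a hypothetical helper, and its pattern (local name, equals sign, full name, e-mail in angle brackets, optional timezone) is a loose approximation of the format shown here, not reposurgeon's own parser.

```shell
# Print every line of an author map that is neither blank, a comment,
# nor of the form:  local = Full Name <email> [Timezone]
bad_map_lines() {
    grep -Ev '^[[:space:]]*(#.*)?$|^[^=[:space:]]+[[:space:]]*=[[:space:]]*[^<>]+<[^<>[:space:]]+@[^<>[:space:]]+>([[:space:]]+[^[:space:]]+)?[[:space:]]*$' "$1"
}

# Usage: grep exits non-zero when it finds no suspicious lines.
if [ -f ivan.map ]; then
    bad_map_lines ivan.map || echo "ivan.map looks well-formed"
fi
```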

Do the conversion

All it takes is:

$ make

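This runs the conversion itself: as the output shows, reposurgeon reads ivan.cvs, applies the author map and the lift script, and rebuilds the result as a git repository in ivan-git.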

user@system:~/ivan-reposurgeon-scratch$ make
reposurgeon "set progress" 'logfile conversion.log' 'script ivan.opts' "read  <ivan.cvs" 'authors read <ivan.map' 'sourcetype cvs' 'prefer git' 'script ivan.lift' 'legacy write >ivan.fo' 'rebuild ivan-git'
17977 cvs events
reposurgeon: These 1 timestamps have multiple commits: 2001-07-28T21:26:09Z
reposurgeon: All commit stamps in this repository are unique.

Afterwards, your scratch folder contains a sparkling new git repository, converted from CVS: in this case, the folder ivan-git.

user@system:~/ivan-reposurgeon-scratch/ivan-git$ ls
acinclude.m4  config.guess  depcomp    Graphics    IGOR.dsw    IVAN.dsw     Main.dsp      MIHAIL.dsp     NEWS
aclocal.m4    config.sub    Doc        igor        INSTALL     ivanmgw.mak  Makefile.am   MIHAIL.dsw     README
AUTHORS       configure.in  FeLib      igordj.mak  install-sh  LICENSING    mihail        missing        Save
ChangeLog     COPYING       FeLib.dsp  IGOR.dsp    ivandj.mak  Main         mihaildj.mak  mkinstalldirs  Script

Is it all there? Use git log to read a selection of early commit logs:

$ git log --after="2001-08-26" --before="2001-08-30" --oneline
1acf64f9 corrected minor bug in abuseuser...
f0644621 throwing Perttu with lamp crashes game.
ca48db59 Added opening doors by kicking them
dfccea33 stringquestion over message bug.
f2e9c0e3 Hex's changes committed.
6eac010b The Great Wandy Bug killed.
a4a68650 Enemies will be hostile if kicked.
bf35054e Dungeon linking tweaked.
f869f0cb Added kicking enemies.
886bdceb Script changes.

From the prose it certainly sounds like the original IVAN devs wrote this. Huzzah! Our good work is done.
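
For a slightly more rigorous check than reading a few log entries, git itself can inspect the converted repository. This is a rough sketch, assuming the repository sits in ./ivan-git as above; count_commits is a hypothetical helper name.

```shell
# Print the number of commits reachable from any ref in the given repository.
count_commits() {
    git -C "$1" rev-list --count --all
}

# Usage, run from the scratch folder:
if [ -d ivan-git/.git ]; then
    # Integrity check of the object store.
    git -C ivan-git fsck --no-progress
    # Total commits across all branches and tags.
    count_commits ivan-git
    # Commits per author identity, to confirm the author map was applied.
    git -C ivan-git log --all --format='%an <%ae>' | sort | uniq -c
fi
```

If the author map was applied, the last command should show the names and e-mail addresses from ivan.map rather than bare CVS usernames.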

Conclusion

And so we have used reposurgeon, with some basic configuration changes, to convert a CVS repository to a git repository. This is but a primer for what the tool can do. To do a really bang-up job you'll want to go into further depth, especially if you're curating some important old open-source CVS or SVN repository. You can read more in the reposurgeon docs at http://www.catb.org/~esr/reposurgeon/repository-editing.html
