
@schacon
Created February 11, 2010 18:44

Git Automatic Mirroring

Git automatic mirroring is a system for distributing Git repositories automatically across mirrors. It is the first stage of a planned set of more general P2P features.

Client-side Mirroring

With this feature, the git client can be configured to fetch from a (more local, faster, more lightly loaded) URL first. It checks with the real server at the end of the fetch to verify and/or complete the fetch.

The patch works using the refs/mirror/* refs namespace as a temporary fetch area; this space is never resolved automatically, unless you use an explicit name. As no other porcelain will generally show refs in this space, they are a good working space and should provide quite transparent roll-out.

When a mirror is contacted, the refs it exposes whose names match the fetch spec of the original repository are fetched to refs/mirror/hostname.com/XXX, where XXX might be heads/master or tags/v0.04. Only refs which are not known locally are created.
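The mapping into the staging namespace can be sketched as follows. This is a minimal illustration, not the patch itself: the function names are hypothetical, fetch-spec patterns are treated as simple trailing-`*` prefixes, and "not known locally" is assumed here to mean that the ref's target object is not already present in the local repository.

```python
def matches_fetch_spec(ref, patterns):
    """Return True if ref matches one of the source patterns, e.g. 'refs/heads/*'."""
    for pattern in patterns:
        if pattern.endswith("*"):
            if ref.startswith(pattern[:-1]):
                return True
        elif ref == pattern:
            return True
    return False


def mirror_ref_name(mirror_host, ref):
    """Map 'refs/heads/master' to 'refs/mirror/<host>/heads/master'."""
    if not ref.startswith("refs/"):
        raise ValueError("not a full ref name: " + ref)
    return "refs/mirror/" + mirror_host + "/" + ref[len("refs/"):]


def refs_to_create(mirror_host, advertised, patterns, known_objects):
    """Pick the staging refs to create for one mirror.

    advertised:    dict of ref name -> object id exposed by the mirror
    patterns:      source side of the original repository's fetch spec
    known_objects: set of object ids already present locally (assumption)
    """
    return {
        mirror_ref_name(mirror_host, ref): oid
        for ref, oid in advertised.items()
        if matches_fetch_spec(ref, patterns) and oid not in known_objects
    }
```

For example, with `refs/heads/*` and `refs/tags/*` as patterns, a mirror at hostname.com advertising a new `refs/heads/master` yields a single staging ref, `refs/mirror/hostname.com/heads/master`.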

TODO/Wishlist:

• enhance the previous submission

• if a fetch from one mirror fails, it should try the next one; perhaps the same if SIGINT is received (Ctrl-C)

• more documentation

• Shawn wanted the fetch to use the same URL config as the one for 'git pull'; however, this may not be sensible, as the expectation is normally that you do not wish to push to all mirrors

• the clean-up function could use the revision walker to decide whether to remove refs/mirror/* refs. It can do this by marking all of the refs/mirror/* refs for that remote as "interesting", then marking the refs/remotes/* and refs/heads/* refs as "uninteresting". After the revision walk, any refs/mirror/* ref which was marked "uninteresting" can be removed.
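The clean-up idea can be sketched on a toy commit graph. This is only an illustration under stated assumptions: the real implementation would use Git's C revision walker, whereas here the history is a plain dict of commit id to parent ids, and the function names are hypothetical.

```python
def reachable(tips, parents):
    """Return every commit reachable from the given tips (the tips included)."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def removable_mirror_refs(refs, parents, remote):
    """List staging refs whose tips are already covered by real refs.

    refs:    dict of ref name -> commit id
    parents: dict of commit id -> list of parent commit ids
    remote:  remote name used under the staging namespace
    """
    mirror_prefix = "refs/mirror/" + remote + "/"
    # Everything reachable from real branches counts as "uninteresting".
    real_tips = [
        oid for name, oid in refs.items()
        if name.startswith(("refs/remotes/", "refs/heads/"))
    ]
    uninteresting = reachable(real_tips, parents)
    return sorted(
        name for name, oid in refs.items()
        if name.startswith(mirror_prefix) and oid in uninteresting
    )
```

A staging ref whose tip is not reachable from any real ref stays in place, since it may still hold objects that the final fetch from the real server has not confirmed.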

fetch extension

The main thing here is to make sure that a typical clone can happily present the refs of the repository it is tracking, i.e. a fairly complete set of refs/remotes/origin/* refs exists. If the local set of heads is not the same as upstream, this feature will do what the user wanted and get the upstream versions only.

protocol

‣ on request: url=git://blah/blah.git

‣ server then pretends to be git://blah/blah.git, by reversing the 'fetch' spec from the remote config

‣ if url given, remap URLs according to remote spec only if matching, otherwise just return valid URLs

• if url not found, return a list of valid remote URLs
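A rough sketch of how a mirror might answer such a request, assuming its remotes are available as a dict shaped like the `[remote]` sections of git config. The function names and the response shape are hypothetical, and only simple refspecs (an optional leading `+` and a trailing `*` on both sides) are handled.

```python
def reverse_refspec(refspec, local_ref):
    """Map a local tracking ref back to its upstream name, or return None.

    With '+refs/heads/*:refs/remotes/origin/*', the local ref
    'refs/remotes/origin/master' maps back to 'refs/heads/master'.
    """
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)
    if src.endswith("*") and dst.endswith("*"):
        prefix = dst[:-1]
        if local_ref.startswith(prefix):
            return src[:-1] + local_ref[len(prefix):]
        return None
    return src if local_ref == dst else None


def advertise(url, remotes, local_refs):
    """Answer a 'url=' request on the mirror side.

    remotes:    dict of remote name -> {"url": str, "fetch": [refspec, ...]}
    local_refs: dict of local ref name -> object id
    """
    for remote in remotes.values():
        if remote["url"] != url:
            continue
        refs = {}
        for ref, oid in local_refs.items():
            for spec in remote["fetch"]:
                upstream_name = reverse_refspec(spec, ref)
                if upstream_name is not None:
                    refs[upstream_name] = oid
                    break
        return {"refs": refs}
    # Unknown URL: tell the client which upstreams this mirror does track.
    return {"valid_urls": sorted(r["url"] for r in remotes.values())}
```

Refs that do not fall under the remote's fetch spec, such as the mirror's own local heads, are never shown to the client in this sketch.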

Server Mirroring Support - Mirror List

The first stage of this will not use a signature system, as the above client-side mirroring approach will never result in the wrong data being moved to the final refs/ space.

State Location

Extensions to git-config, and relevant C files etc.

Whether a remote/URL is a mirror: stored as a property of the mirror entry in git config, or as a property of a remote section.

Request Protocol

client advertised feature: mirror

server responds by including a list of known mirrors of this repository in the response after refs, if enabled in config.

server action:

‣ with 'mirror' capability, always return alternate canonical URLs for this repo, first N mirrors and number of more mirrors

‣ the mirror URLs returned may be:

• URLs listed in config file

• fixed, random, GeoIP policies

• 'try mirrors first' flag: temporary or permanent indication to compliant clients to prefer better mirrors.
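The "first N mirrors and number of more mirrors" response could look roughly like this. The function name, the dict keys, and the `policy` parameter are assumptions made for illustration; only the fixed and random policies are shown, since a GeoIP policy would need an external database.

```python
import random


def mirror_response(mirrors, limit, policy="fixed", rng=None):
    """Build the mirror part of a server response.

    mirrors: mirror URLs as listed in the server's config file
    limit:   N, the maximum number of mirrors to return
    policy:  'fixed' keeps config order, 'random' shuffles it
    rng:     optional random.Random instance, useful for reproducibility
    """
    ordered = list(mirrors)
    if policy == "random":
        (rng or random).shuffle(ordered)
    elif policy != "fixed":
        raise ValueError("unsupported policy: " + policy)
    shown = ordered[:limit]
    # The client learns how many mirrors exist beyond the ones returned.
    return {"mirror": shown, "more_mirrors": len(ordered) - len(shown)}
```

With three configured mirrors and a limit of two, the response carries two URLs and `more_mirrors` set to 1, which lets a client decide whether asking for more is worthwhile.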

client logic:

‣ send "mirror" capability to server with request to gather info

‣ store 'mirror =' lines and 'more_mirrors = ' lines in config based on the server response, but leave the preferred mirror untouched. Set "prefer_mirror" in config if the permanent "try mirrors first" flag is set

‣ mirror selection:

• use mirror if indicated by server, command-line or config

• if preferred mirror selected in config, try that (or those) first,

• otherwise take first/next mirror, but allow user to interrupt

∘ integrate with geoip, netselect for building preferred mirrors

• config to disallow protocol eg git://

• in general try git:// first

• fetch more mirrors if 'more_mirrors = ' is set but the known mirrors were exhausted during the fetch

‣ contact mirror, pass "url=" string to indicate desired resource

• if the mirror has the mirror capability and returns not found, check its list of URLs against the known upstream URLs and try again

• if the mirror has no mirror capability, fetch heads anyway (they are all placed under refs/mirror/remotename, so it doesn't matter that the server isn't 'pretending' to be the correct upstream)

• tags always go under refs/mirror/remotename/tags

‣ contact the master URL if it has not been contacted in this 'fetch' or if the fetch returned different heads; fetch refs and update the real remote refs, then clean up known refs from refs/mirror/remotename
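The mirror selection and fall-through behaviour in the client logic above can be sketched as below. Everything here is hypothetical naming: `fetch_one` stands in for whatever actually performs the fetch, failures are assumed to surface as `OSError`, and GeoIP/netselect ranking and user interruption are left out.

```python
def url_scheme(url):
    """Return the scheme part of a URL, e.g. 'git' for 'git://host/x.git'."""
    return url.split("://", 1)[0]


def candidate_mirrors(known, preferred=(), disallowed_schemes=()):
    """Order mirrors: preferred ones first, then git:// URLs, then the rest."""
    def allowed(url):
        return url_scheme(url) not in disallowed_schemes

    first = [u for u in preferred if allowed(u)]
    rest = [u for u in known if allowed(u) and u not in first]
    # sort() is stable, so config order is kept within each group.
    rest.sort(key=lambda u: url_scheme(u) != "git")
    return first + rest


def fetch_via_mirrors(candidates, fetch_one):
    """Try each mirror in turn; return (url, result) for the first success."""
    for url in candidates:
        try:
            return url, fetch_one(url)
        except OSError:
            continue  # this mirror failed, fall through to the next one
    return None, None
```

Whichever mirror succeeds, the client would still contact the master URL afterwards to verify the heads and update the real remote refs, as described in the last step above.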
