@kohsuke
Last active August 29, 2015 14:20
Mirrorbrain not serving from fallback

With MirrorBrainDebug On, this is an example of successful redirection:

[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] MirrorBrainEngine On, mirror_base '/srv/releases/jenkins/'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] URI: '/plugins/ldap/1.11/ldap.hpi'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] filename: '/srv/releases/jenkins/plugins/ldap/1.11/ldap.hpi'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] clientip: 85.90.76.97
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Country 'NL', Continent 'EU'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] AS '--', Prefix '--', lat/lng 0.000000,0.000000 state id (null), state '(null)'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Canonicalized file on disk: /srv/releases/hudson/plugins/ldap/1.11/ldap.hpi
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] SQL file to look up: plugins/ldap/1.11/ldap.hpi
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Successfully acquired database connection.
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Found 8 mirrors
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] [mod_mirrorbrain] no distance data - using rank selection
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] same country: ftp.nluug.nl                   (score  100) (rank    2196786) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] same region:  mirrors.clinkerhq.com          (score  100) (rank    1169352) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] same region:  icm.edu.pl                     (score  100) (rank    3925962) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] same region:  jenkins.mirror.isppower.de     (score  100) (rank    7597518) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] elsewhere:    mirror.xmission.com            (score  100) (rank    9489213) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] elsewhere:    ftp-nyc.osuosl.org             (score  100) (rank    4649613) (dist 9999999)
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] classifying 8 mirrors: 0 prefix, 0 AS, 1 country, 3 region, 2 elsewhere
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Chose server ftp.nluug.nl
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Redirect to 'http://ftp.nluug.nl/programming/jenkins/plugins/ldap/1.11/ldap.hpi'

This is an example of a failed redirect, where cucumber ends up serving the request itself:

[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] MirrorBrainEngine On, mirror_base '/srv/releases/jenkins/'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] URI: '/opensuse/jenkins-1.611-1.2.noarch.rpm'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] filename: '/srv/releases/jenkins/opensuse/jenkins-1.611-1.2.noarch.rpm'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] clientip: 192.168.160.72
[Mon Apr 27 12:35:55 2015] [error] [client 127.0.0.1] [mod_mirrorbrain] could not resolve country
[Mon Apr 27 12:35:55 2015] [error] [client 127.0.0.1] [mod_mirrorbrain] could not resolve continent
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Country '--', Continent '--'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] AS '--', Prefix '--', lat/lng 0.000000,0.000000 state id (null), state '(null)'
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Canonicalized file on disk: /srv/releases/hudson/opensuse/jenkins-1.611-1.2.noarch.rpm
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] SQL file to look up: opensuse/jenkins-1.611-1.2.noarch.rpm
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Successfully acquired database connection.
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] Found 4 mirrors
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] [mod_mirrorbrain] no distance data - using rank selection
[Mon Apr 27 12:35:55 2015] [warn] [client 127.0.0.1] [mod_mirrorbrain] classifying 4 mirrors: 0 prefix, 0 AS, 0 country, 0 region, 0 elsewhere
[Mon Apr 27 12:35:55 2015] [notice] [client 127.0.0.1] [mod_mirrorbrain] 'opensuse/jenkins-1.611-1.2.noarch.rpm': no usable mirrors after classification. Have to deliver directly.

The version of mod_mirrorbrain we are currently running is 2.15.0-1, according to dpkg -l:

root@cucumber:/var/log/apache2/mirrors.jenkins-ci.org# dpkg -l | grep mirrorbrain
ii  libapache2-mod-mirrorbrain                              2.15.0-1                                        MirrorBrain is a scalable download redirecto
ii  mirrorbrain                                             2.15.0-1                                        MirrorBrain is a scalable download redirecto
ii  mirrorbrain-scanner                                     2.15.0-1                                        MirrorBrain is a scalable download redirecto
ii  mirrorbrain-tools                                       2.15.0-1                                        MirrorBrain is a scalable download redirecto

In the failed case, notice that mod_mirrorbrain reports that it found 4 mirrors. If you follow the source code, the message comes from L1990:

    if (mirror_cnt > 0) {
        debugLog(r, cfg, "Found %d mirror%s", mirror_cnt,
                (mirror_cnt == 1) ? "" : "s");
    }

From the psql command line, I can confirm that these are the four mirrors it finds:

mirrorbrain=> select * FROM filearr WHERE path='opensuse/jenkins-1.611-1.2.noarch.rpm';
  id   |                 path                  |    mirrors    
-------+---------------------------------------+---------------
 27494 | opensuse/jenkins-1.611-1.2.noarch.rpm | {10,13,16,17}
(1 row)

Their details are here:

mirrorbrain=> select * FROM server WHERE id=10 OR id=13 OR id=16 OR id=17;
 id |     identifier     |                     baseurl                     |                 baseurl_ftp                 | baseurl_rsync | enabled | status_baseurl | region | country | asn | prefix | score | scan_fpm |           last_scan           |             comment              | operator_name | operator_url | public_notes |      admin       |       admin_email        |  lat   |   lng   | country_only | region_only | as_only | prefix_only | other_countries | file_maxsize 
----+--------------------+-------------------------------------------------+---------------------------------------------+---------------+---------+----------------+--------+---------+-----+--------+-------+----------+-------------------------------+----------------------------------+---------------+--------------+--------------+------------------+--------------------------+--------+---------+--------------+-------------+---------+-------------+-----------------+--------------
 13 | ftp.nluug.nl       | http://ftp.nluug.nl/programming/jenkins/        | ftp://ftp.nluug.nl/pub/programming/jenkins/ |               | t       | t              | eu     | nl      |   0 |        |   100 |      205 | 2015-04-27 12:40:49.692619-04 | Added - Fri Feb 22 12:26:01 2013 |               |              |              | Mike Hulsman     | mike@hulsman.net         | 52.500 |   5.750 | f            | t           | f       | f           |                 |            0
 17 | tsukuba.wide.ad.jp | http://ftp.tsukuba.wide.ad.jp/software/jenkins/ |                                             |               | t       | t              | AP     | JP      |   0 |        |   100 |      146 | 2015-04-27 12:57:48.090618-04 | Added - Fri Dec 26 15:31:52 2014 |               |              |              | Kohei Takahashi  | flast@tsukuba.wide.ad.jp | 36.083 | 140.117 | f            | t           | f       | f           |                 |            0
 10 | icm.edu.pl         | http://ftp.icm.edu.pl/packages/jenkins/         | ftp://ftp.icm.edu.pl/pub/java/jenkins/      |               | t       | t              | eu     | pl      |   0 |        |   100 |      168 | 2015-04-27 13:10:00.209833-04 | Added - Fri Feb 10 14:28:04 2012 |               |              |              | Rafal Maszkowski | rzm@icm.edu.pl           | 52.250 |  21.000 | f            | t           | f       | f           |                 |            0
 16 | mirror.yandex.ru   | http://mirror.yandex.ru/mirrors/jenkins/        |                                             |               | t       | t              | EU     | ru      |   0 |        |   100 |      166 | 2015-04-27 11:49:25.006435-04 | Added - Sat Jun 21 19:44:55 2014 |               |              |              | Arkady L. Shane  | atigro@ya.ru             | 55.752 |  37.616 | t            | f           | f       | f           |                 |            0
(4 rows)

If you follow the source code, mod_mirrorbrain then classifies these mirrors into five groups in decreasing order of preference: "prefix", "AS", "country", "region", and "elsewhere". In this case, none of those four mirrors is usable, because each has a restriction set (country_only or region_only) and the client's country and continent could not be resolved, so no restricted mirror can match. At the end of the classification, no mirror is left in any group. This is consistent with the "classifying 4 mirrors: 0 prefix, 0 AS, 0 country, 0 region, 0 elsewhere" message.
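
The classification outcome can be reproduced with a small model. The following C sketch is a simplified illustration, not the actual mod_mirrorbrain code: it ignores the prefix and AS groups, the names (struct mirror, classify, count_usable) are hypothetical, and the matching rules are an assumption inferred from the log output and the country_only / region_only columns. The mirror data is taken from the server table above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mirror record; fields mirror the server table columns. */
struct mirror {
    const char *identifier;
    const char *region;
    const char *country;
    bool country_only;
    bool region_only;
};

enum mirror_class { SAME_COUNTRY, SAME_REGION, ELSEWHERE, UNUSABLE };

/* Case-insensitive comparison, since the table mixes "eu" and "EU". */
static bool same_code(const char *a, const char *b)
{
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
    return *a == *b;
}

/* Assumed rules: a country_only mirror serves only its own country,
 * a region_only mirror serves only its own region. */
enum mirror_class classify(const struct mirror *m,
                           const char *client_country,
                           const char *client_region)
{
    if (same_code(m->country, client_country))
        return SAME_COUNTRY;
    if (same_code(m->region, client_region) && !m->country_only)
        return SAME_REGION;
    if (!m->country_only && !m->region_only)
        return ELSEWHERE;
    return UNUSABLE;
}

/* The four mirrors that have opensuse/jenkins-1.611-1.2.noarch.rpm. */
static const struct mirror mirrors[] = {
    {"ftp.nluug.nl",       "eu", "nl", false, true},
    {"tsukuba.wide.ad.jp", "AP", "JP", false, true},
    {"icm.edu.pl",         "eu", "pl", false, true},
    {"mirror.yandex.ru",   "EU", "ru", true,  false},
};

/* Number of mirrors that end up in any usable group for this client. */
int count_usable(const char *client_country, const char *client_region)
{
    int usable = 0;
    for (size_t i = 0; i < sizeof mirrors / sizeof mirrors[0]; i++)
        if (classify(&mirrors[i], client_country, client_region) != UNUSABLE)
            usable++;
    return usable;
}
```

Under these assumptions, count_usable("--", "--") yields 0 for the unresolved client in the failed request, while a client resolved to NL/EU would still be offered ftp.nluug.nl and icm.edu.pl.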

MirrorBrain considers using the fallback mirrors at L2306, but that check only tests whether the mirrors array is empty. Here the array holds the 4 mirrors found in the database, so the check fails to take into account that none of them is actually usable.

    /* 3rd pass */
    if (apr_is_empty_array(mirrors) && ! apr_is_empty_array(cfg->fallbacks)) {

        debugLog(r, cfg, "ok, need to add fallback mirrors (%d configured)", 
                 cfg->fallbacks->nelts);

So the fallback configuration simply doesn't work in this case. Then, at L3503, no mirror has been chosen (!chosen is true), so it decides to handle the request by itself.

    if (!chosen) {
        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, 0, r, 
            "[mod_mirrorbrain] '%s': no usable mirrors after classification. Have to deliver directly.",
            filename);
        setenv_give(r, "file");
        return DECLINED;
    }
    debugLog(r, cfg, "Chose server %s", chosen->identifier);

An important observation here is that if absolutely no mirrors are found for a given path (such as the very old /war/1.400/jenkins.war), then the fallback kicks in and the redirect happens successfully. The problem only manifests itself when some of the local mirrors have a more up-to-date file while our global mirror fails to serve it. This is presumably why we only started seeing this problem now: our OSUOSL mirrors had always been up to date, which masked the problem until http://ftp-chi.osuosl.org/ went down today.

Solution

The latest version of mod_mirrorbrain still has the same problem.

I think one relatively easy fix is to ensure there's always one global mirror that can serve any request. So I propose we add archives.jenkins-ci.org as a global mirror with low priority, as this mirror always has the files.

Or, to fix this problem properly, we can patch L2306:

    /* 3rd pass */
    if (apr_is_empty_array(mirrors) && ! apr_is_empty_array(cfg->fallbacks)) {

        debugLog(r, cfg, "ok, need to add fallback mirrors (%d configured)", 
                 cfg->fallbacks->nelts);

Instead of apr_is_empty_array(mirrors), we should check the number of classified mirrors. Assuming that this Debian package is built from the regular sources, we should be able to make this change relatively easily.
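
To make the difference between the two conditions concrete, here is a minimal C sketch. It is not a patch against mod_mirrorbrain: the struct and function names are hypothetical, and it assumes the per-group counts printed in the "classifying N mirrors" log line are available at the point where the fallback decision is made.

```c
#include <stdbool.h>

/* Hypothetical summary of the state after classification. */
struct classification {
    int found;         /* mirrors returned by the database, e.g. "Found 4 mirrors" */
    int same_prefix;
    int same_as;
    int same_country;
    int same_region;
    int elsewhere;
};

/* Mirrors that survived classification and could actually be chosen. */
int usable_mirrors(const struct classification *c)
{
    return c->same_prefix + c->same_as + c->same_country
         + c->same_region + c->elsewhere;
}

/* Models the existing check: fall back only if the database found nothing. */
bool needs_fallback_current(const struct classification *c, int fallbacks_configured)
{
    return c->found == 0 && fallbacks_configured > 0;
}

/* Models the proposed check: fall back whenever nothing is usable. */
bool needs_fallback_proposed(const struct classification *c, int fallbacks_configured)
{
    return usable_mirrors(c) == 0 && fallbacks_configured > 0;
}
```

For the failed request above (4 found, 0 in every group), the current check returns false and the proposed one returns true. For a path with no mirrors at all, both return true, and for the successful request (8 found, 1 country, 3 region, 2 elsewhere) both return false, so existing behaviour would be unchanged in those cases.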
