Skip to content

Instantly share code, notes, and snippets.

@natanfelles
Forked from artoodetoo/nogap.php
Created July 29, 2017 16:41
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save natanfelles/65027ae229fc3dda8bf760833ea80bc4 to your computer and use it in GitHub Desktop.
Save natanfelles/65027ae229fc3dda8bf760833ea80bc4 to your computer and use it in GitHub Desktop.
Efficient Geo IP location in MySQL database

Efficient Geo IP location in MySQL database

Inspired by Jeremy Cole and Andy Skelton articles about efficient MySQL GeoIP implementation.

Obvious but Bad query:

SELECT `code`, `country`
FROM `geoip_countries`
WHERE INET_ATON('4.2.2.1') BETWEEN `ip_from` AND `ip_to`

As rule MySQL cannot use index for such request.

Good query:

SELECT `code`, `country`
FROM `geoip_countries`
WHERE `ip_to` >= INET_ATON('4.2.2.1')
ORDER BY `ip_to` ASC
LIMIT 1

(Alternatively you can use ip_from <= addr and ORDER BY ip_from DESC. But I prefer ascending sort :) )

Ok then. But there is a nuance! GeoIP table had gaps between some ranges and decided to rectify this condition by filling in the gaps with "no country" rows, ensuring that the query would return "no country" instead of a wrong country.

Another bad thing is some tables has not unicode characters. So before import we should convert it. This instruction helps you to fix both

Download and unzip free Maxmind GeoIP files:

  • GeoLite Country,
  • GeoLite City and
  • GeoLite ASN.

Then run from console:

 iconv -f ISO-8859-15 -t UTF-8 GeoLiteCity-Location.csv > GeoLiteCity-Location-UTF8.csv
 iconv -f ISO-8859-15 -t UTF-8 GeoIPASNum2.csv > GeoIPASNum2-UTF8.csv
 
 php nogap.php country GeoIPCountryWhois.csv > GeoIPCountryWhois-nogaps.csv
 php nogap.php blocks GeoLiteCity-Blocks.csv > GeoLiteCity-Blocks-nogaps.csv
 php nogap.php asnum GeoIPASNum2-UTF8.csv > GeoIPASNum2-UTF8-nogaps.csv

Now you are ready to import it into MySQL. There are scheme and data load files for you.

Regards,
Alexandr a.k.a. artoodetoo

#!/usr/bin/env php
<?php
/*
* Filter to fill the IP gaps in a MaxMind GeoLite tables.
*
* For every missing range in the file it puts a dummy one.
*/
$types = [
'asnum' => [0, 0, 1, "%s,%s,\"-\"\n"],
'blocks' => [2, 0, 1, "\"%s\",\"%s\",\"1\"\n"],
'country' => [0, 2, 3, "\"%3\$s\",\"%4\$s\",\"%1\$s\",\"%2\$s\",\"-\",\"-\"\n"]
];
if (!function_exists('bcadd') || !function_exists('long2ip')) {
echo "*** Enable bcadd and long2ip functions\n";
exit(1);
}
if ($argc <= 1 || !isset($types[$argv[1]])) {
echo "Usage: php nogap.php type original-file > new-file\n".
"Where type is one of: 'asnum', 'blocks', 'country'\n";
exit(0);
}
list($skip, $ipFrom, $ipTo, $dummy) = $types[$argv[1]];
if (($in = fopen($argv[2], 'r')) === false) {;
echo "*** Error in open file\n";
exit(1);
}
$out = fopen('php://output', 'w');
// Skip header
for ($i = 0; $i < $skip; ++$i) {
$s = fgets($in);
fputs($out, $s);
}
$prev = '-1';
while (($s = fgets($in)) !== false) {
$data = str_getcsv($s);
$nextFrom = bcadd($prev, '1');
if ($nextFrom !== $data[$ipFrom]) {
$nextTo = bcadd($data[$ipFrom], '-1');
fprintf(
$out,
$dummy,
$nextFrom,
$nextTo,
long2ip($nextFrom),
long2ip($nextTo)
);
}
fputs($out, $s);
$prev = $data[$ipTo];
}
fclose($in);
fclose($out);
exit(0);
DROP TABLE IF EXISTS `geoip_blocks`;
CREATE TABLE `geoip_blocks` (
`ip_from` INT(10) UNSIGNED NOT NULL,
`ip_to` INT(10) UNSIGNED NOT NULL,
`loc_id` INT(10) NOT NULL,
PRIMARY KEY (`ip_from`),
KEY (`ip_to`)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
DROP TABLE IF EXISTS `geoip_locations`;
CREATE TABLE `geoip_locations` (
`id` INT(10) NOT NULL,
`cnt_code` CHAR(2) NOT NULL,
`reg_code` VARCHAR(2) NOT NULL,
`city` VARCHAR(64) NOT NULL,
`zip` VARCHAR(16) NOT NULL,
`latitude` DECIMAL(7,4) NOT NULL,
`longitude` DECIMAL(7,4) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `geoip_countries`;
CREATE TABLE `geoip_countries` (
`ip_from` INT UNSIGNED NOT NULL,
`ip_to` INT UNSIGNED NOT NULL,
`code` CHAR(2) NOT NULL,
`country` VARCHAR(64) NOT NULL,
PRIMARY KEY (`ip_from`),
KEY (`ip_to`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `geoip_asnum`;
CREATE TABLE `geoip_asnum` (
`ip_from` INT UNSIGNED NOT NULL,
`ip_to` INT UNSIGNED NOT NULL,
`asnum` VARCHAR(64) NOT NULL,
PRIMARY KEY (`ip_from`),
KEY (`ip_to`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
LOAD DATA INFILE '/path/to/GeoLiteCity-Blocks-nogaps.csv'
INTO TABLE `geoip_blocks` CHARACTER SET 'UTF8' FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' IGNORE 2 LINES
(`ip_from`, `ip_to`, `loc_id`);
LOAD DATA INFILE '/path/to/GeoLiteCity-Location-UTF8.csv'
INTO TABLE `geoip_locations` CHARACTER SET 'UTF8' FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' IGNORE 2 LINES
(`id`, `cnt_code`, `reg_code`, `city`, `zip`, `latitude`, `longitude`, @ignored, @ignored);
LOAD DATA INFILE '/path/to/GeoIPCountryWhois-nogaps.csv'
INTO TABLE `geoip_countries` CHARACTER SET 'UTF8' FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
(@ignored, @ignored, `ip_from`, `ip_to`, `code`, `country`);
LOAD DATA INFILE '/path/to/GeoIPASNum2-UTF8-nogaps.csv'
INTO TABLE `geoip_asnum` CHARACTER SET 'UTF8' FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
(`ip_from`, `ip_to`, `asnum`);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment