Skip to content

Instantly share code, notes, and snippets.

@bindiego
Forked from matsukaz/01.README.md
Last active January 29, 2021 03:46
Show Gist options
  • Save bindiego/d65a1ea324151e46f4c212e3128d95e0 to your computer and use it in GitHub Desktop.
Save bindiego/d65a1ea324151e46f4c212e3128d95e0 to your computer and use it in GitHub Desktop.
Find a geolocation of an IP address in BigQuery

This query is to find geolocation of an IP address including latitude, longitude, city and country.

Legacy SQL doesn't support range conditions such as BETWEEN when using JOIN, so we need to filter data by WHERE. This means if IP address does not match any of the data inside geolite_city_bq_b2b, records will not be able to receive.

Use Standard SQL if you want to receive records no matter you succeed to find geolocation or not.

Please refer to the following post for more detail.

https://cloudplatform.googleblog.com/2014/03/geoip-geolocation-with-google-bigquery.html

https://cloud.google.com/blog/products/data-analytics/geolocation-with-bigquery-de-identify-76-million-ip-addresses-in-20-seconds

1 192.188.171.9
2 202.41.146.198
3 103.83.231.23
4 202.64.101.91
5 43.240.52.112
6 103.12.247.112
7 217.228.79.15
8 35.10.42.69
9 86.40.207.143
10 66.71.60.27
11 72.136.125.125
12 68.144.88.194
13 194.197.79.18
14 69.119.90.108
15 83.235.250.67
bq mk <dataset>.log id:integer,ip:string
$ bq load --source_format=CSV <dataset>.log test_data.csv
#standardSQL
SELECT
id,
IFNULL(city, 'Other') AS city,
IFNULL(countryLabel, 'Other') AS countryLabel,
latitude,
longitude
FROM (
SELECT
id,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(ip)) AS clientIpNum,
TRUNC(NET.IPV4_TO_INT64(NET.IP_FROM_STRING(ip))/(256*256)) AS classB
FROM
`<project>.<dataset>.log` ) AS a
LEFT OUTER JOIN
`fh-bigquery.geocode.geolite_city_bq_b2b` AS b
ON
a.classB = b.classB
AND a.clientIpNum BETWEEN b.startIpNum AND b.endIpNum
ORDER BY
id ASC
Row id city countryLabel latitude longitude
1 1 Other Other null null
2 2 Taiwan 23.5000 121.0000
3 3 Other Other null null
4 4 Hong Kong 22.2500 114.1667
5 5 Japan 35.6900 139.6900
6 6 Bangladesh 24.0000 90.0000
7 7 Heiden Germany 51.8333 6.9333
8 8 East Lansing United States 42.7283 -84.4882
9 9 Buncrana Ireland 55.1333 -7.4500
10 10 Collegeville United States 40.1879 -75.4254
11 11 Canada 60.0000 -95.0000
12 12 Calgary Canada 51.0833 -114.0833
13 13 Helsinki Finland 60.1756 24.9342
14 14 Bellmore United States 40.6726 -73.5364
15 15 Greece 39.0000 22.0000
#legacySQL
SELECT
id,
city,
countryLabel,
latitude,
longitude
FROM (
SELECT
id,
INTEGER(PARSE_IP(ip)) AS clientIpNum,
INTEGER(PARSE_IP(ip)/(256*256)) AS classB
FROM
[<project>:<dataset>.log] ) AS a
LEFT OUTER JOIN
[fh-bigquery:geocode.geolite_city_bq_b2b] AS b
ON
a.classB = b.classB
WHERE
a.clientIpNum BETWEEN b.startIpNum AND b.endIpNum
ORDER BY
id ASC
Row id city countryLabel latitude longitude
1 2 Taiwan 23.5000 121.0000
2 4 Hong Kong 22.2500 114.1667
3 5 Japan 35.6900 139.6900
4 6 Bangladesh 24.0000 90.0000
5 7 Heiden Germany 51.8333 6.9333
6 8 East Lansing United States 42.7283 -84.4882
7 9 Buncrana Ireland 55.1333 -7.4500
8 10 Collegeville United States 40.1879 -75.4254
9 11 Canada 60.0000 -95.0000
10 12 Calgary Canada 51.0833 -114.0833
11 13 Helsinki Finland 60.1756 24.9342
12 14 Bellmore United States 40.6726 -73.5364
13 15 Greece 39.0000 22.0000
14 16 Somerset United States 40.4976 -74.4885
15 17 United Kingdom 51.5000 -0.1300
#standardSQL
SELECT
id,
geoip.Country_Name AS country
FROM (
SELECT
id,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(ip)) AS ip,
TRUNC(NET.IPV4_TO_INT64(NET.IP_FROM_STRING(ip))/(256*256*256)) AS ip_class_a
FROM
`<project>.<dataset>.log` ) AS main
LEFT OUTER JOIN
`fh-bigquery.geocode.geolite_country_bq` AS geoip
ON
main.ip_class_a = geoip.classA
AND main.ip BETWEEN geoip.From_IP_Code
AND geoip.To_IP_Code
ORDER BY
id
Row id country
1 1 NA
2 2 Taiwan
3 3 NA
4 4 Hong Kong
5 5 Japan
6 6 Bangladesh
7 7 Germany
8 8 United States
9 9 Ireland
10 10 United States
11 11 Canada
12 12 Canada
13 13 Finland
14 14 United States
15 15 Greece
SELECT
id,
geoip.Country_Name AS country
FROM (
SELECT
id,
INTEGER(PARSE_IP(ip)) AS ip,
INTEGER(PARSE_IP(ip)/(256*256*256)) AS ip_class_a
FROM
[<project>:<dataset>.log] ) AS main
JOIN EACH [fh-bigquery:geocode.geolite_country_bq] AS geoip
ON
main.ip_class_a = geoip.classA
WHERE
main.ip BETWEEN geoip.From_IP_Code
AND geoip.To_IP_Code
ORDER BY
id
Row id country
1 1 NA
2 2 Taiwan
3 3 NA
4 4 Hong Kong
5 5 Japan
6 6 Bangladesh
7 7 Germany
8 8 United States
9 9 Ireland
10 10 United States
11 11 Canada
12 12 Canada
13 13 Finland
14 14 United States
15 15 Greece
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment