+-----+-----------+--------+-----------+--------+----------+--------+-----------+
|  asn|       cidr|  ip_max| ip_max_str|  ip_min|ip_min_str| network|network_str|
+-----+-----------+--------+-----------+--------+----------+--------+-----------+
|56203| 1.0.4.0/24|16778495|  1.0.4.255|16778240|   1.0.4.0|16778240|    1.0.4.0|
|56203| 1.0.5.0/24|16778751|  1.0.5.255|16778496|   1.0.5.0|16778496|    1.0.5.0|
|56203| 1.0.6.0/24|16779007|  1.0.6.255|16778752|   1.0.6.0|16778752|    1.0.6.0|
|56203| 1.0.7.0/24|16779263|  1.0.7.255|16779008|   1.0.7.0|16779008|    1.0.7.0|
| 2519|1.0.20.0/23|16782847| 1.0.21.255|16782336|  1.0.20.0|16782336|   1.0.20.0|
| 2519|1.0.20.0/23|16782847| 1.0.21.255|16782336|  1.0.20.0|16782592|   1.0.21.0|
| 2519|1.0.22.0/23|16783359| 1.0.23.255|16782848|  1.0.22.0|16782848|   1.0.22.0|
+-----+-----------+--------+-----------+--------+----------+--------+-----------+
spark.sql('''
SELECT ips.ip, ips.ip_int, asns.asn
FROM ips INNER JOIN asns
ON ips.ip_network = asns.network
WHERE ips.ip_int > asns.ip_min AND ips.ip_int < asns.ip_max
''').count()
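The effect of this query — an equality join on the exploded /24 network key, narrowed by the range filter — can be illustrated in plain Python with toy rows mirroring the tables above (names and sample IPs here are illustrative, not from the original dataset):

```python
# Each exploded ASN row carries one /24 network key plus the full range.
asn_rows = [
    # (asn, network, ip_min, ip_max) as integers, mirroring the exploded table
    (56203, 16778240, 16778240, 16778495),  # 1.0.4.0/24
    (2519, 16782336, 16782336, 16782847),   # 1.0.20.0/23 via network 1.0.20.0
    (2519, 16782592, 16782336, 16782847),   # 1.0.20.0/23 via network 1.0.21.0
]
ip_rows = [
    # (ip_int, ip_network), where ip_network = ip_int & 0xFFFFFF00
    (16778250, 16778240),  # 1.0.4.10
    (16782600, 16782592),  # 1.0.21.8
]

# Equi-join on the network key, then apply the range predicate.
matches = [(ip, asn)
           for ip, net in ip_rows
           for asn, network, lo, hi in asn_rows
           if net == network and lo < ip < hi]
print(matches)  # [(16778250, 56203), (16782600, 2519)]
```

The equality condition is what lets Spark plan this as a hash join instead of a Cartesian product; the range predicate then only has to filter within each matched /24 bucket.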
def get_ip_network(ip):
    # ip is an integer, not a dotted-quad string
    mask = 4294967040  # int(ipaddress.ip_address(u'255.255.255.0')), the /24 netmask
    return ip & mask
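A quick sanity check of the mask (the constant 4294967040 is 0xFFFFFF00, i.e. 255.255.255.0), using an arbitrary example address:

```python
import ipaddress

def get_ip_network(ip):
    # ip is an integer, not a dotted-quad string
    mask = 4294967040  # int(ipaddress.ip_address(u'255.255.255.0'))
    return ip & mask

# Masking 1.0.4.200 should yield the /24 base address 1.0.4.0.
ip_int = int(ipaddress.ip_address(u'1.0.4.200'))
net_int = get_ip_network(ip_int)
print(ipaddress.ip_address(net_int))  # 1.0.4.0
```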
+-----+------------+--------+-----------+--------+----------+
|  asn|        cidr|  ip_max| ip_max_str|  ip_min|ip_min_str|
+-----+------------+--------+-----------+--------+----------+
|15169|  1.0.0.0/24|16777471|  1.0.0.255|16777216|   1.0.0.0|
|56203|  1.0.4.0/24|16778495|  1.0.4.255|16778240|   1.0.4.0|
|56203|  1.0.5.0/24|16778751|  1.0.5.255|16778496|   1.0.5.0|
|56203|  1.0.6.0/24|16779007|  1.0.6.255|16778752|   1.0.6.0|
|56203|  1.0.7.0/24|16779263|  1.0.7.255|16779008|   1.0.7.0|
| 2519| 1.0.20.0/23|16782847| 1.0.21.255|16782336|  1.0.20.0|
| 2519| 1.0.22.0/23|16783359| 1.0.23.255|16782848|  1.0.22.0|
+-----+------------+--------+-----------+--------+----------+
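The integer bounds in this table can be reproduced from the CIDR string with the standard-library ipaddress module (a sketch — the original ETL step that built the table isn't shown here):

```python
import ipaddress

# For 1.0.20.0/23, the network address is ip_min and the
# broadcast address is ip_max, matching the table row for ASN 2519.
net = ipaddress.ip_network(u'1.0.20.0/23')
ip_min = int(net.network_address)
ip_max = int(net.broadcast_address)
print(ip_min, ip_max)                              # 16782336 16782847
print(net.network_address, net.broadcast_address)  # 1.0.20.0 1.0.21.255
```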
import ipaddress

from pyspark.sql.functions import udf, explode
from pyspark.sql.types import ArrayType, StringType

def get_networks(low, high):
    # Enumerate the /24 network base addresses between low and high
    # (dotted-quad strings), stepping 256 addresses at a time.
    return [str(ipaddress.ip_address(x))
            for x in range(int(ipaddress.ip_address(low)),
                           int(ipaddress.ip_address(high)), 256)]

get_networks_udf = udf(get_networks, ArrayType(StringType()))
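get_networks can be exercised locally before wrapping it in a UDF. For the 1.0.20.0/23 row it yields exactly the two /24 keys that appear in the exploded table above:

```python
import ipaddress

def get_networks(low, high):
    # Enumerate /24 network base addresses from low up to (but not
    # including) high, stepping 256 addresses at a time.
    return [str(ipaddress.ip_address(x))
            for x in range(int(ipaddress.ip_address(low)),
                           int(ipaddress.ip_address(high)), 256)]

# The /23 range 1.0.20.0 - 1.0.21.255 contains two /24 blocks.
print(get_networks(u'1.0.20.0', u'1.0.21.255'))  # ['1.0.20.0', '1.0.21.0']
```

In Spark this would then be combined with explode to fan each ASN row out to one row per /24 key, along the lines of `asns.withColumn('network_str', explode(get_networks_udf('ip_min_str', 'ip_max_str')))` — column names assumed from the tables above.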
In [1]: case class Circle(rad:Float)

In [2]: val rdd = sc.parallelize(1 to 10000).map(i=>Circle(i.toFloat))
Out[2]: MappedRDD[1] at map at <console>:11

In [3]: rdd.take(10)
14/11/11 13:03:35 ERROR TaskResultGetter: Exception while getting task result
com.esotericsoftware.kryo.KryoException: Unable to find class: [L$line5.$read$$iwC$$iwC$Circle;