@sid321axn
Created August 18, 2020 16:09
import re

# Flag whether the URL uses an IP address in place of a domain name
def having_ip_address(url):
    match = re.search(
        '(([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.'
        '([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\/)|'  # IPv4
        '((0x[0-9a-fA-F]{1,2})\\.(0x[0-9a-fA-F]{1,2})\\.(0x[0-9a-fA-F]{1,2})\\.(0x[0-9a-fA-F]{1,2})\\/)|'  # IPv4 in hexadecimal
        '(?:[a-fA-F0-9]{1,4}:){7}[a-fA-F0-9]{1,4}', url)  # IPv6
    if match:
        return 1  # an IP-address pattern was found
    else:
        return 0  # no matching pattern found

# df is assumed to be an existing pandas DataFrame with a 'url' column
df['use_of_ip'] = df['url'].apply(having_ip_address)
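A quick way to sanity-check the feature without a DataFrame is to run the check on a few hand-picked URLs. The sketch below is an assumption-laden illustration, not part of the gist: it restates the pattern in a condensed but equivalent form using raw strings, and `IP_PATTERN` and `sample_urls` are hypothetical names introduced only for this example.

```python
import re

# Condensed restatement of the three alternatives (illustrative, not the gist's literal pattern).
IP_PATTERN = re.compile(
    r'(([01]?\d\d?|2[0-4]\d|25[0-5])\.){3}([01]?\d\d?|2[0-4]\d|25[0-5])/'  # IPv4 followed by "/"
    r'|((0x[0-9a-fA-F]{1,2})\.){3}(0x[0-9a-fA-F]{1,2})/'                   # hexadecimal IPv4 followed by "/"
    r'|(?:[a-fA-F0-9]{1,4}:){7}[a-fA-F0-9]{1,4}'                           # full (uncompressed) IPv6
)

def having_ip_address(url):
    """Return 1 if the URL contains an IP-address pattern, else 0."""
    return 1 if IP_PATTERN.search(url) else 0

# Hypothetical sample URLs chosen to exercise each alternative.
sample_urls = [
    "http://192.168.0.1/login",                            # dotted-decimal IPv4
    "http://0x7f.0x0.0x0.0x1/index",                       # hexadecimal IPv4
    "http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]/",   # full IPv6
    "https://www.example.com/index.html",                  # ordinary domain name
    "http://192.168.0.1",                                  # IPv4 with no trailing "/"
]

flags = [having_ip_address(u) for u in sample_urls]
print(flags)
```

Note that the IPv4 alternatives require a `/` right after the address, so a bare `http://192.168.0.1` is not flagged, and the pattern is searched across the whole URL rather than only the host part.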