@stympy
Last active May 2, 2021
Ruby class to check URL validity
require "ipaddr"
require "resolv"
require "uri"
class UrlChecker
  # Anchored at both ends so that only "http" and "https" match (not, e.g., "httpx").
  SCHEME_REGEX = Regexp.new(/\Ahttps?\z/)
  HOST_REGEX = Regexp.new(/.+\..+/)
  IPV4_REGEX = Regexp.new(/(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}/)
  IPV6_REGEX = Regexp.new(/(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))/)
  PRIVATE_RANGES = %w[127.0.0.0/8 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8 100.64.0.0/10 fc00::/7 169.254.169.254].map { |r| IPAddr.new(r) }
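  def initialize(url)
    @url = url
  end

  # Parsed URI; raises URI::InvalidURIError for malformed input.
  def uri
    @uri ||= URI.parse(@url)
  end

  # True when the host looks like an IPv4 or IPv6 literal.
  def is_ip?
    return @is_ip unless @is_ip.nil?
    @is_ip = uri.host.match?(IPV4_REGEX) || uri.host.match?(IPV6_REGEX)
  end

  # True for an http(s) URL whose host is not a literal address in PRIVATE_RANGES.
  def valid?
    return false unless uri.scheme&.match?(SCHEME_REGEX)
    return false unless uri.host&.match?(HOST_REGEX) || uri.host&.match?(IPV6_REGEX)
    if is_ip?
      ip = IPAddr.new(uri.host)
      return false if PRIVATE_RANGES.any? { |r| r.include?(ip) }
    end
    true
  rescue URI::InvalidURIError, IPAddr::InvalidAddressError
    # The IP regexes are unanchored, so a host such as "999.1.1.1" can match
    # them yet fail to parse as an address; treat that as invalid.
    false
  end

  # True when the host's address (looked up via DNS if needed) is outside PRIVATE_RANGES.
  def resolves?
    return false unless valid?
    ip = if is_ip?
      IPAddr.new(uri.host)
    else
      IPAddr.new(Resolv.getaddress(uri.host))
    end
    PRIVATE_RANGES.none? { |r| r.include?(ip) }
  rescue Resolv::ResolvError, IPAddr::InvalidAddressError
    false
  end
end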
@stympy commented May 16, 2020

The goal of this class is to check that a given URL points at a publicly accessible address; that is, it rejects URLs like http://localhost, http://192.168.1.1, etc. It does not verify that the URL is actually reachable.
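
The range check that does this rejection can be tried on its own. The following is a minimal sketch, not the gist's code: `SAMPLE_RANGES` is a trimmed, illustrative copy of the range list and `private_address?` is a hypothetical helper introduced only for this example.

```ruby
require "ipaddr"

# Assumption: a shortened copy of the range list, for illustration only.
SAMPLE_RANGES = %w[127.0.0.0/8 192.168.0.0/16 10.0.0.0/8 fc00::/7].map { |r| IPAddr.new(r) }

# Hypothetical helper: true when the address falls inside any sample range.
# IPAddr#include? returns false when the address families differ, so IPv4
# and IPv6 ranges can share one list.
def private_address?(address)
  ip = IPAddr.new(address)
  SAMPLE_RANGES.any? { |range| range.include?(ip) }
end

puts private_address?("192.168.1.1") # prints "true"
puts private_address?("8.8.8.8")     # prints "false"
```

A hostname has to be resolved to an address before this kind of check means anything, which is presumably why the class separates `valid?` from `resolves?`.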