@mmacedo
Last active December 31, 2015 19:48
Script I was using to test a regex that matches URLs
#!/usr/bin/env ruby
require 'awesome_print'
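r = %r{ \A
  (?:(?<protocol>http)://)?                  # optional scheme, plain http only
  (?<domain>(?:[a-z\d-]+\.)+[a-z\d]{2,})     # dotted labels ending in a TLD of 2+ characters
  (?:
    /
    (?<path>(?:[a-z\d._\-+,%&()!'~=:]+\/)*[a-z\d._\-+,%&()!'~=:]+)
  )?
  /?                                         # optional trailing slash
  (?:
    \?
    (?:
      (?<query>
        (?:[^=?#&]+(?:=[^#&]*)?&)*           # leading key or key=value pairs, each followed by an ampersand
        [^=?#&]+(?:=[^#&]*)?                 # last key or key=value pair
      )
      &?                                     # optional trailing ampersand
    )?
  )?
  (?:
    \#
    (?<fragment>
      [^=?#&]*
    )?
  )?
\z
}ix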
path = ARGV[0] || "urls.txt"
unless File.exist?(path)
  STDERR.puts "File not found."
  exit 1
end
# Skip blank lines and lines commented out with '#'.
lines = File.readlines(path)
            .map(&:strip)
            .reject { |line| line.empty? || line.start_with?('#') }
x = lines.map { |s| s.match(r) || "No match for '#{s}'" }
# Each part is prefixed with blanks as wide as the parts before it,
# so the fields appear staggered in the printed output.
x = x.map do |m|
  if m.is_a? String
    m
  else
    {
      original: m[0],
      domain: " " + m[:domain],
      path: (" " + m[:domain].gsub(/./, ' ') + " " + (m[:path]||"")).rstrip,
      query: (" " + m[:domain].gsub(/./, ' ') + " " + (m[:path]||"").gsub(/./, ' ') + " " + (m[:query]||"")).rstrip,
      fragment: (" " + m[:domain].gsub(/./, ' ') + " " + (m[:path]||"").gsub(/./, ' ') + " " + (m[:query]||"").gsub(/./, ' ') + " " + (m[:fragment]||"")).rstrip
    }
  end
end
# ap x.select {|s| s.is_a? String } # Errors only
ap x
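
A minimal sketch of how the pattern could be checked without a urls.txt file or the awesome_print gem. The sample URLs and the `parse_url` helper are hypothetical, not part of the script. The pattern is restated so the snippet runs on its own; a repeated `&` inside a character class is redundant, so it appears once here.

```ruby
# Same pattern as the script, restated so this snippet is self-contained.
URL_RE = %r{ \A
  (?:(?<protocol>http)://)?
  (?<domain>(?:[a-z\d-]+\.)+[a-z\d]{2,})
  (?:
    /
    (?<path>(?:[a-z\d._\-+,%&()!'~=:]+/)*[a-z\d._\-+,%&()!'~=:]+)
  )?
  /?
  (?:
    \?
    (?:
      (?<query>
        (?:[^=?#&]+(?:=[^#&]*)?&)*
        [^=?#&]+(?:=[^#&]*)?
      )
      &?
    )?
  )?
  (?:
    \#
    (?<fragment>[^=?#&]*)?
  )?
\z }ix

# Hypothetical helper: returns a hash of named captures, or nil when
# the string does not match. Groups that did not take part are nil.
def parse_url(s)
  m = URL_RE.match(s)
  m && m.named_captures
end

# Hypothetical sample inputs, chosen only for illustration.
["http://example.com/a/b?x=1&y=2#top", "example.com", "not a url"].each do |s|
  puts "#{s.inspect} => #{parse_url(s).inspect}"
end
```

Returning a plain hash keeps the check independent of how the results are later pretty-printed.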