@kch
Created November 29, 2009 16:52
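#!/opt/local/bin/ruby1.9
# encoding: utf-8
require 'yaml'

# Test cases, one per line, as a YAML mapping:
#   input text : URL expected to be extracted from it
urls = YAML.load <<-STR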
http://foo.com/blah_blah : http://foo.com/blah_blah
http://foo.com/blah_blah/ : http://foo.com/blah_blah/
(Something like http://foo.com/blah_blah) : http://foo.com/blah_blah
http://foo.com/blah_blah_(wikipedia) : http://foo.com/blah_blah_(wikipedia)
http://foo.com/blah-blah-(wiki-pedia) : http://foo.com/blah-blah-(wiki-pedia)
(Something like http://foo.com/blah_blah_(wikipedia)) : http://foo.com/blah_blah_(wikipedia)
(Something like http://foo.com/blah_-blah_(wikip-edia)) : http://foo.com/blah_-blah_(wikip-edia)
http://foo.com/blah_blah. : http://foo.com/blah_blah
http://foo.com/blah_blah/. : http://foo.com/blah_blah/
<http://foo.com/blah_blah> : http://foo.com/blah_blah
<http://foo.com/blah_blah/> : http://foo.com/blah_blah/
http://foo.com/blah_blah, : http://foo.com/blah_blah
http://www.example.com/wpstyle/?p=364. : http://www.example.com/wpstyle/?p=364
http://www.example.com/wpstyle/?p=364;asdf=123&qewr=78. : http://www.example.com/wpstyle/?p=364;asdf=123&qewr=78
http://✪df.ws/123 : http://✪df.ws/123
rdar://1234 : rdar://1234
rdar:/1234 : rdar:/1234
http://userid:password@example.com:8080 : http://userid:password@example.com:8080
http://userid@example.com : http://userid@example.com
http://userid@example.com:8080 : http://userid@example.com:8080
http://userid:password@example.com : http://userid:password@example.com
http://example.com:8080 : http://example.com:8080
http://exa-mple.com:8080 : http://exa-mple.com:8080
x-yojimbo-item://6303E4C1-xxxx-45A6-AB9D-3A908F59AE0E : x-yojimbo-item://6303E4C1-xxxx-45A6-AB9D-3A908F59AE0E
message://%3c330e7f8409726r6a4ba78dkf1fd71420c1bf6ff@mail.gmail.com%3e : message://%3c330e7f8409726r6a4ba78dkf1fd71420c1bf6ff@mail.gmail.com%3e
http://➡.ws/䨹 : http://➡.ws/䨹
www.➡.ws/䨹 : www.➡.ws/䨹
<tag>http://example.com</tag> : http://example.com
Just a www.example.com link. : www.example.com
STR
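# Candidate pattern: a scheme (or "www.") followed by characters that are not
# whitespace or brackets, ending in a "(...)" group, a non-punctuation
# character, or a slash.
RX = %r{\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))}
# Variant with two fixes: hyphens are allowed inside the trailing "(...)"
# group, and a match can no longer end on < or >.
# RX = %r{\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w-]+\)|([^[:punct:]\s<>]|/)))}
# FIXES ::                                     ^^^^^                  ^^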
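# Run RX against every input and print, as YAML, the [expected, actual] pairs
# where the match differs from the expected URL. An empty list means all pass.
failures = urls.map { |text, expected| [expected, text[RX]] }
               .select { |expected, actual| expected != actual }
puts failures.to_yaml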
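
A minimal sketch of how the commented-out variant of the pattern might be used to pull URLs out of free text with `String#scan`. The names `FIXED_RX` and `extract_urls` and the sample string are hypothetical and not part of the gist; only the pattern itself is copied from the commented line above.

```ruby
# Hypothetical names: FIXED_RX and extract_urls do not appear in the gist.
# The pattern is the commented-out variant (hyphens allowed inside the
# trailing parentheses, and a match may not end on < or >).
FIXED_RX = %r{\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w-]+\)|([^[:punct:]\s<>]|/)))}

# Return every URL-like substring of +text+, in order of appearance.
# String#scan yields one array of capture groups per match; group 1 wraps
# the whole pattern after \b, so its value is the full match.
def extract_urls(text)
  text.scan(FIXED_RX).map(&:first)
end

sample = "See http://foo.com/blah_blah, <http://example.com:8080> and www.example.com."
p extract_urls(sample)
# => ["http://foo.com/blah_blah", "http://example.com:8080", "www.example.com"]
```

Trailing commas, periods, and angle brackets are dropped because the pattern must end on a non-punctuation character, a slash, or a parenthesized group.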