@ajinabraham
Forked from uogbuji/gruber_urlintext.py
Last active September 21, 2015 15:23
import re, urllib
PAT = re.compile(ur'(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?\xab\xbb\u201c\u201d\u2018\u2019]))')
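# Print the URLs found on each line of the Daring Fireball test data (Python 2).
for line in urllib.urlopen("http://daringfireball.net/misc/2010/07/url-matching-regex-test-data.text"):
    # findall() returns one tuple per match; element 0 is the whole URL.
    print [ mgroups[0] for mgroups in PAT.findall(line) ]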
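
The script above targets Python 2 and needs network access. As a rough sketch of how the same pattern might be tried offline under Python 3, the following applies it to an in-memory string. The names `URL_PATTERN` and `find_urls` and the sample text are assumptions made for illustration; they are not part of the original gist.

```python
import re

# Same pattern as in the script above, written as a plain raw string:
# the Python 2 ur'' prefix is not valid in Python 3, and the re module
# itself interprets the \xab and \u201c style escapes.
URL_PATTERN = re.compile(r'(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?\xab\xbb\u201c\u201d\u2018\u2019]))')


def find_urls(text):
    """Return the URL-like substrings of text, in order (hypothetical helper)."""
    # The pattern has several groups, so findall() yields one tuple per
    # match; element 0 is the outermost group, i.e. the whole URL.
    return [groups[0] for groups in URL_PATTERN.findall(text)]


if __name__ == "__main__":
    # Made-up sample text: trailing punctuation should be left out of the match.
    sample = "See http://example.com/a/b?x=1, then www.example.org."
    print(find_urls(sample))
```

Wrapping the pattern in a helper keeps the tuple-unpacking detail in one place. To run it against the test data itself, `urllib.request.urlopen` is the Python 3 counterpart of `urllib.urlopen`, and each line it yields would need decoding from bytes first.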