@rymcol
Created May 20, 2016 22:40
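import re
import urllib.request  # Python 3 replacement for urllib2

# Fetch the sitemap and print every URL found inside a <loc> element.
with urllib.request.urlopen('http://website.com/SiteMap.xml') as f:
    for line in f:
        # Lines arrive as bytes, so decode them before matching.
        # The non-greedy match keeps several <loc> tags on one line separate.
        for url in re.findall(r'<loc>(https?://.+?)</loc>', line.decode('utf-8')):
            print(url)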
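
The line-by-line regex scan above depends on each `<loc>` tag fitting the pattern exactly. As a hedged alternative, here is a minimal sketch that parses the sitemap with the standard library's XML parser instead. The helper name `extract_locs` and the `sample` document are illustrative assumptions, not part of the original gist, and the sketch assumes the sitemap declares the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for <urlset>, <url> and <loc>.
# Assumption: the target sitemap declares this namespace.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text):
    """Return the text of every <loc> element in a sitemap document.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


# Made-up sample document so the sketch runs without network access.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://website.com/</loc></url>
  <url><loc>https://website.com/about</loc></url>
</urlset>"""

for url in extract_locs(sample):
    print(url)
```

To use it on a live sitemap, pass the decoded response body to `extract_locs` in place of `sample`.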