@olivergeorge
Created September 17, 2011 04:54
Read search rank information for keywords in Google referrer URLs
import re
import sys
from urllib.parse import urlparse, parse_qs

# Matches the first http(s) URL found on an input line.
RE_URL = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')

for line in sys.stdin:
    try:
        match = RE_URL.search(line)
        if not match:
            continue
        referrer = match.group(0)
        url = urlparse(referrer)
        # Only Google redirect URLs (path /url) on these hosts are considered.
        if url.netloc not in ('www.google.com', 'www.google.com.au'):
            continue
        if url.path != '/url':
            continue
        qs = parse_qs(url.query)
        # q holds the search keywords, cd the search rank.
        if 'q' not in qs or 'cd' not in qs:
            continue
        q = qs['q'].pop().lower()
        cd = qs['cd'].pop()
        print(cd, q)
    except Exception:
        # Report the offending line, then re-raise so the failure is not hidden.
        sys.stderr.write("Unable to parse %s" % line)
        raise
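
The same filtering steps can be tried on a single referrer without piping a log file through stdin. The following is a minimal Python 3 sketch, not part of the original gist: the helper name `extract_rank`, the constant `GOOGLE_HOSTS`, and the sample URL are all made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hosts accepted by the script above (assumed list; extend as needed).
GOOGLE_HOSTS = ('www.google.com', 'www.google.com.au')


def extract_rank(referrer):
    """Return (cd, q) for a Google /url referrer, or None if it does not qualify."""
    url = urlparse(referrer)
    if url.netloc not in GOOGLE_HOSTS or url.path != '/url':
        return None
    qs = parse_qs(url.query)
    if 'q' not in qs or 'cd' not in qs:
        return None
    # Take the last value of each parameter, as the script does with pop().
    return qs['cd'][-1], qs['q'][-1].lower()


# Hypothetical referrer, invented for this example.
sample = 'http://www.google.com.au/url?sa=t&q=Python+Gist&cd=3'
print(extract_rank(sample))  # prints ('3', 'python gist')
```

Returning `None` for non-matching referrers mirrors the `continue` branches in the script, so a caller can skip those lines with a simple `if result is None` check.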