@dirkgr
Created May 15, 2015 20:52
Parse queries out of query logs
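import re
import sys
from urllib.parse import unquote_to_bytes

# The "?" is escaped to match literally; q= must begin a query parameter.
QUERY_RE = re.compile(r"GET /search\?(?:\S*&)?q=([^& ]*)")

for line in sys.stdin:
    m = QUERY_RE.search(line)
    if m:
        # Percent-decode to raw bytes; "+" stands for a space in query strings.
        raw = unquote_to_bytes(m.group(1).replace("+", " "))
        try:
            print(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8: fall back to ISO-8859-1, which accepts any byte.
            print(raw.decode("iso-8859-1"))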
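
To check the extraction without piping a real log through stdin, the same logic can be wrapped in a function. This is only a sketch in Python 3: the name `extract_query` and the sample log lines are made up for illustration, and the pattern assumes `q=` sits at the start of the query string or right after an `&`.

```python
import re
from urllib.parse import unquote_to_bytes

# Assumed pattern: a literal "?" after /search, then q= at a parameter boundary.
QUERY_RE = re.compile(r"GET /search\?(?:\S*&)?q=([^& ]*)")


def extract_query(line):
    """Return the decoded q= value from one log line, or None if there is none."""
    m = QUERY_RE.search(line)
    if m is None:
        return None
    # "+" encodes a space; replace it before percent-decoding so %2B stays "+".
    raw = unquote_to_bytes(m.group(1).replace("+", " "))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return raw.decode("iso-8859-1")


if __name__ == "__main__":
    # Hypothetical log lines, not taken from any real server.
    samples = [
        '1.2.3.4 - - "GET /search?page=2&q=caf%C3%A9+au+lait HTTP/1.1" 200',
        '1.2.3.4 - - "GET /search?q=caf%E9 HTTP/1.1" 200',
        '1.2.3.4 - - "GET /index.html HTTP/1.1" 200',
    ]
    for sample in samples:
        print(repr(extract_query(sample)))
```

Returning `None` for lines without a query keeps the "no match" case separate from an empty `q=` value, which comes back as an empty string.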