@tclancy
Created September 24, 2014 17:55
Storing regular expression for parsing our Nginx log format after a bunch of iterations
nginx_line = r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?(?P<ip2>, \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?-? - ?\S* \[(?P<timestamp>\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} (\+|\-)\d{4})\]\s+\"(?P<method>\S{3,10}) (?P<path>\S+) HTTP\/1\.\d" (?P<response_status>\d{3}) (?P<bytes>\d+) "(?P<referer>(\-)|(.+))?" "(?P<useragent>.+)'
"""
Matches request lines from Nginx log format
'$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$request_time" "$upstream_response_time" "$http_x_forwarded_for" '
'"$http_client_ip" "$http_x_real_ip"';
"""

tclancy commented Sep 24, 2014

Note that the useragent group at the end actually captures everything through the end of the line, because I did not care about the UAs and wanted to be liberal in what I matched. If you need the user agent itself, end that group at the closing quote instead.
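One possible way to stop the user agent at its closing quote is sketched below. This is an assumption about how you might tighten the pattern, not something from the original gist: `strict_line` is a hypothetical name, and the sample line is invented for illustration.

```python
import re

# Pattern copied verbatim from the gist.
nginx_line = r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?(?P<ip2>, \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?-? - ?\S* \[(?P<timestamp>\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} (\+|\-)\d{4})\]\s+\"(?P<method>\S{3,10}) (?P<path>\S+) HTTP\/1\.\d" (?P<response_status>\d{3}) (?P<bytes>\d+) "(?P<referer>(\-)|(.+))?" "(?P<useragent>.+)'

# Hypothetical variant: swap the greedy ".+" for "anything but a quote",
# followed by the closing quote itself.
strict_line = nginx_line.replace(
    '(?P<useragent>.+)', '(?P<useragent>[^"]*)"'
)

# Hypothetical log line, invented for this example.
sample = (
    '203.0.113.7 - - [12/Aug/2014:13:03:16 +0000] '
    '"GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0" '
    '"0.130" "-" "-" "-" "-"'
)

loose_ua = re.match(nginx_line, sample).group("useragent")
strict_ua = re.match(strict_line, sample).group("useragent")

print(repr(loose_ua))   # runs on past the closing quote to the end of the line
print(repr(strict_ua))  # stops at the closing quote
```

The trailing fields after the user agent are simply left unmatched by the strict variant, since `re.match` only anchors at the start of the line.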


tclancy commented Sep 24, 2014

Doesn't match quit requests, but I don't care.

- - - [12/Aug/2014:13:03:16 +0000]  "quit" 400 172 "-" "-" 0.130 - .
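The line above fails because its request is the bare word `quit`, with no path or `HTTP/1.x` version for the pattern to consume. A rough sketch of setting such lines aside rather than crashing on them follows; `split_matches` is a hypothetical helper and the second sample line is invented.

```python
import re

# Pattern copied verbatim from the gist.
nginx_line = r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?(?P<ip2>, \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})?-? - ?\S* \[(?P<timestamp>\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} (\+|\-)\d{4})\]\s+\"(?P<method>\S{3,10}) (?P<path>\S+) HTTP\/1\.\d" (?P<response_status>\d{3}) (?P<bytes>\d+) "(?P<referer>(\-)|(.+))?" "(?P<useragent>.+)'

pattern = re.compile(nginx_line)


def split_matches(lines):
    """Hypothetical helper: return (parsed dicts, lines that did not match)."""
    parsed, skipped = [], []
    for line in lines:
        m = pattern.match(line)
        if m:
            parsed.append(m.groupdict())
        else:
            skipped.append(line)
    return parsed, skipped


# The quit line from the comment above.
quit_line = '- - - [12/Aug/2014:13:03:16 +0000]  "quit" 400 172 "-" "-" 0.130 - .'
# Hypothetical ordinary request, invented for this example.
get_line = (
    '203.0.113.7 - - [12/Aug/2014:13:03:16 +0000] '
    '"GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"'
)

parsed, skipped = split_matches([quit_line, get_line])
print(len(parsed), "parsed;", len(skipped), "skipped")
```

Keeping the skipped lines around makes it easy to check later whether anything other than `quit` requests is slipping through.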
