Skip to content

Instantly share code, notes, and snippets.

@streeter
Forked from simonw/heroku-nginx-logging.md
Created February 9, 2016 14:56
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save streeter/57ae30a7c41bac3b1972 to your computer and use it in GitHub Desktop.
Save streeter/57ae30a7c41bac3b1972 to your computer and use it in GitHub Desktop.
How to get Heroku to log full user_agent strings using nginx

How to get Heroku to log full user_agent strings using nginx

Heroku's default logging format omits user agent and referrer strings - but these are useful to have! Especially if you want to be able to run your own analysis on what kind of browsers are using your service.

A default Heroku log line looks like this:

heroku/router:  at=info method=GET path="/page-on-your-site/"
    host=your-site.herokuapp.com
    request_id=76fe0891-38cd-4293-a082-ac075d744197
    fwd="10.10.10.10" dyno=web.1 connect=1ms service=961ms
    status=200 bytes=1274

It isn't possible to change the format output by the Heroku router. But it IS possible to write your own lines to the heroku logs. You can do this from your own application, but it turns out there are some solid reasons to run your own nginx server inside your Heroku dyno (for buffering large requests). If you're doing that already, it's easy to configure that nginx instance to use whatever custom logging format you like.

If you use the Heroku nginx buildpack (or one of its forks), the buildpack will install a nginx configuration file in the following location:

config/nginx.conf.erb

This happens before your app is copied from your git checkout, which means you can replace the nginx config file with your own just by storing your own file in the same location within your application's repository.

This means you can customise the log_format line from that file. Here's the format I'm using, designed to use the semi-standard key="optionally quoted value" format used by other parts of the Heroku stack:

log_format custom 'measure#nginx.service=$request_time request="$request" '
    'status_code=$status request_id=$http_x_request_id '
    'remote_addr="$remote_addr" forwarded_for="$http_x_forwarded_for" '
    'forwarded_proto="$http_x_forwarded_proto" via="$http_via" '
    'body_bytes_sent=$body_bytes_sent referer="$http_referer" '
    'user_agent="$http_user_agent"';
access_log logs/nginx/access.log custom;

Just droy in a copy of the https://github.com/ryandotsmith/nginx-buildpack/blob/master/config/nginx.conf.erb file and make the above alteration to get completely custom logging in your Heroku logs, which looks something like thys:

Feb 05 13:31:55 possum-playground app/web.1:  measure#nginx.service=0.023
request="GET /blah/ HTTP/1.1" status_code=200 request_id=8ff12bfba-2136
-472b-8b73-e8b5bee4296d remote_addr="10.10.10.10" forwarded_for="1.1.1.1"
forwarded_proto="https" via="1.1 vegur" body_bytes_sent=919 referer="-"
user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"

What about escaping special characters?

You might be questioning how reliable this key="quoted value" format style of logging actually is. What if a double quote shows up in the content of the value, for example in the user-agent string? Would that break a log parser that expects to be able to read these key value pairs?

It turns out nginx has you covered: it automatically escapes double quotes (and some other characters) when it interpolates values in the log_format string. Here's an example of a request I generated using a faked user-agent:

requests.get('http://example.com/test/', headers={
    'User-Agent': 'This user-agent has "double quotes" in it'
})

The resulting nginx logline looked like this:

Feb 05 13:25:22 my-heroku-app app/web.1:  measure#nginx.service=0.009
request="GET /test/ HTTP/1.1" status_code=404 request_id=8433b02c-307c-
4d07-b2c6-8d5b37c3e82b remote_addr="10.0.0.0" forwarded_for="1.1.1.1"
forwarded_proto="http" via="1.1 vegur" body_bytes_sent=90 referer="-"
user_agent="This user-agent has \x22double quotes\x22 in it"

The double quotes were escaped as \x22. I also tested special characters in the path and they get the same treatment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment