Skip to content

Instantly share code, notes, and snippets.

@markstgodard
Last active August 29, 2015 14:27
Show Gist options
  • Save markstgodard/18fcb762a60f7d2ea791 to your computer and use it in GitHub Desktop.
Save markstgodard/18fcb762a60f7d2ea791 to your computer and use it in GitHub Desktop.
See: https://github.com/cloudfoundry/gorouter/pull/92
@markstgodard
Copy link
Author

GoRouter Access Log Record format and parsability

See: cloudfoundry/gorouter#92

Taking example access log records with new optional headers (embedded JSON vs. existing format) and then using Logstash GROK patterns to parse elements. Also this example is just using Logstash as an example log parsing technology.

In this example: we want to also optionally log the following headers: Cache-Control, Span-Id, Trace-Id

  • Example access log record (w/ JSON)
api.10.244.0.34.xip.io - [19/08/2015:18:43:46 +0000] "GET /v2/syslog_drain_urls?batch_size=1000 HTTP/1.1" 200 0 42 "-" "Go 1.1 package http" 10.0.2.15:49617 x_forwarded_for:"10.0.2.15" vcap_request_id:7151bca9-dafa-4d8b-712e-693819bb3411 response_time:0.006796115 app_id: headers:{"Cache-Control":"private","Span-Id":"3888dcb6-34f4-11e5-a151-feff819cdc9f","Trace-Id":"7b28f886-fe5f-41af-8182-f88c8fec9e1c"}
  • Example Logstash grok pattern
%{HOSTNAME:requestHost} - %{SYSLOG5424SD:startedAt} "%{DATA:requestMethod} %{DATA:requestUri} %{DATA:requestProto}" %{NUMBER:statusCode} %{NUMBER:bytesReceived} %{NUMBER:bytesSent} %{QS:referrer} %{QS:userAgent} %{IP:remoteAddressIP}:%{NUMBER:remoteAddressPort} x_forwarded_for:"%{IP:x_forwarded_for}" vcap_request_id:%{DATA:vcap_request_id} response_time:%{DATA:response_time} app_id:%{DATA:app_id} headers:%{GREEDYDATA:headers}

Note: The above pattern was just hacked together, to get an idea of the tokens, etc. This is not an official GoRouter Access Log Logstash Grok pattern :)

  • Parsed elements:
  "requestHost":      "api.10.244.0.34.xip.io"
"startedAt":        "[19/08/2015:18:43:46 +0000]"
"requestMethod":    "GET"
"requestUri":       "/v2/syslog_drain_urls?batch_size=1000"
"requestProto":     "HTTP/1.1"
"statusCode":       "200"
"bytesReceived":    "0"
"bytesSent":        "42"
"referrer":         ""-""
"userAgent":        ""Go 1.1 package http""
"remoteAddressIP":  "10.0.2.15"
"remoteAddressPort": "49617"
"x_forwarded_for":  "10.0.2.15"
"vcap_request_id":  "7151bca9-dafa-4d8b-712e-693819bb3411"
"response_time":    "0.006796115"
"app_id":           ""
"headers":          "{"Cache-Control":"private","Span-Id":"3888dcb6-34f4-11e5-a151-feff819cdc9f","Trace-Id":"7b28f886-fe5f-41af-8182-f88c8fec9e1c"}"

In this case I just consumed the last element as GREEDYDATA pattern, since I didnt see any nice way of having an existing GROK pattern and then also parsing JSON. Logstash supports parsing JSON log statements, however I didnt see a way to cleanly combine GROK and JSON.
It is likely possible to just create a custom GROK / regex pattern to include, however this would add complexity.


  • Example access log record (no JSON)
api.10.244.0.34.xip.io - [19/08/2015:18:43:46 +0000] "GET /v2/syslog_drain_urls?batch_size=1000 HTTP/1.1" 200 0 42 "-" "Go 1.1 package http" 10.0.2.15:49617 x_forwarded_for:"10.0.2.15" vcap_request_id:7151bca9-dafa-4d8b-712e-693819bb3411 response_time:0.006796115 app_id: Cache-Control:"private" Span-Id:"3888dcb6-34f4-11e5-a151-feff819cdc9f" Trace-Id:"7b28f886-fe5f-41af-8182-f88c8fec9e1c"
  • Example Logstash grok pattern
%{HOSTNAME:requestHost} - %{SYSLOG5424SD:startedAt} "%{DATA:requestMethod} %{DATA:requestUri} %{DATA:requestProto}" %{NUMBER:statusCode} %{NUMBER:bytesReceived} %{NUMBER:bytesSent} %{QS:referrer} %{QS:userAgent} %{IP:remoteAddressIP}:%{NUMBER:remoteAddressPort} x_forwarded_for:"%{IP:x_forwarded_for}" vcap_request_id:%{DATA:vcap_request_id} response_time:%{DATA:response_time} app_id:%{DATA:app_id} Cache-Control:%{QS:cacheControl} Span-Id:%{QS:spanId} Trace-Id:%{QS:traceId}

Note: The above can continue to parse the optional header elements included at the END of the existing access log record

  • Parsed elements:
  "requestHost":      "api.10.244.0.34.xip.io"
"startedAt":        "[19/08/2015:18:43:46 +0000]"
"requestMethod":    "GET"
"requestUri":       "/v2/syslog_drain_urls?batch_size=1000"
"requestProto":     "HTTP/1.1"
"statusCode":       "200"
"bytesReceived":    "0"
"bytesSent":        "42"
"referrer":         ""-""
"userAgent":        ""Go 1.1 package http""
"remoteAddressIP":  "10.0.2.15"
"remoteAddressPort": "49617"
"x_forwarded_for":  "10.0.2.15"
"vcap_request_id":  "7151bca9-dafa-4d8b-712e-693819bb3411"
"response_time":    "0.006796115"
"app_id":           ""
"Cache-Control":    "private"
"Span-Id":          "3888dcb6-34f4-11e5-a151-feff819cdc9f"
"Trace-Id":         "7b28f886-fe5f-41af-8182-f88c8fec9e1c"

In this case where the access log format uses key:"value(s)" for the optional headers, it would be simpler to parse and would keep things consistent.

@shalako
Copy link

shalako commented Aug 21, 2015

Thank you for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment