@vmadman
Created April 27, 2013 06:59
An Apache log format that allows access logs (but not error logs) to be output in JSON format. I found this here: http://untergeek.com/2012/10/11/getting-apache-to-output-json-for-logstash/ -- but modified it a good bit for my purposes.
# Access Logs
LogFormat "{ \
\"@vips\":[\"%v\"], \
\"@source\":\"%v%U%q\", \
\"@source_host\": \"%v\", \
\"@source_path\": \"%f\", \
\"@tags\":[\"Apache\",\"Access\"], \
\"@message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
\"@fields\": { \
\"timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"request\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"user-agent\": \"%{User-agent}i\", \
\"bytes\": %B \
} \
}" ls_apache_json
# The catch-all
CustomLog "||/usr/local/bin/udpclient.pl 127.0.0.1 5001" ls_apache_json
@pacohope

pacohope commented Jun 7, 2019

Small correction: the sed CustomLog command needs to be spawned with a shell (|$ vs |) for the >> redirect to have any meaning.

It seems to me that parts of the access_log contain user input from untrusted sources on the Internet. Is it really safe to pipe that through sed on the web server? The sed command runs as the web server user, which might create opportunities for command injection. Likewise that's like one sed invocation per web request. Surely this will perform badly on a highly loaded server. Apache goes to great lengths to be scalable, but if we invoke a whole Unix process for each and every log line, I think that would significantly hurt scalability.
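
To make the untrusted-input concern concrete on the consuming side: the mod_log_config documentation says that non-printable characters in `%r`, `%i` and `%o` values are written as `\xhh`, which is not a valid JSON escape, so a hostile User-Agent can produce a log line that a strict JSON parser rejects. The following is a minimal Python sketch of a tolerant parser, not part of the gist; the names `parse_log_line` and `_to_json_escape` are hypothetical, and it only handles the `\xhh` case.

```python
import json
import re
from typing import Optional

# Match a backslash plus either "xHH" or any single character, so that an
# escaped backslash ("\\") is consumed as a pair and never misread as "\x..".
_ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{2}|.)")


def _to_json_escape(match):
    """Rewrite Apache's \\xhh escape as the JSON escape \\u00hh."""
    body = match.group(1)
    if len(body) == 3:  # 'x' followed by two hex digits
        return "\\u00" + body[1:]
    return match.group(0)  # leave every other escape untouched


def parse_log_line(line: str) -> Optional[dict]:
    """Return the parsed record, or None if the line is still not valid JSON."""
    normalized = _ESCAPE.sub(_to_json_escape, line.strip())
    try:
        record = json.loads(normalized)
    except ValueError:
        return None
    return record if isinstance(record, dict) else None
```

Returning `None` instead of raising lets the caller count or quarantine bad lines rather than crash on input that an attacker controls.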

@jonjensen

> Likewise that's like one sed invocation per web request. Surely this will perform badly on a highly loaded server.

@pacohope Luckily, Apache 2.4 (and < 2.4 when using the || form) starts the filtering process once at Apache startup time and pushes data through that one always-running process. So no, it doesn't respawn anew for each request, and it doesn't add much overhead, assuming the filter program is efficient:

https://httpd.apache.org/docs/2.4/logs.html#piped
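
As a rough illustration of that long-running model, here is a minimal Python sketch of a piped-log program that reads log lines from stdin for as long as Apache keeps the pipe open and sends each one as a UDP datagram. It is a hypothetical stand-in, not the `udpclient.pl` referenced in the gist (whose source is not shown here); the function name `forward_lines` and the host/port argument order are assumptions modeled on the CustomLog line.

```python
import socket
import sys
from typing import Iterable, Tuple


def forward_lines(lines: Iterable[str], addr: Tuple[str, int], sock) -> int:
    """Send each non-empty line as one UDP datagram; return how many were sent."""
    sent = 0
    for line in lines:
        payload = line.rstrip("\n").encode("utf-8", errors="replace")
        if not payload:
            continue  # skip blank lines
        # A line larger than the UDP datagram limit raises OSError here;
        # a real forwarder would need to truncate or drop such lines.
        sock.sendto(payload, addr)
        sent += 1
    return sent


def main(argv) -> int:
    host, port = argv[1], int(argv[2])
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Iterating over sys.stdin blocks until the next line arrives and
        # ends only when the server closes the pipe, so one process
        # handles every request's log line.
        forward_lines(sys.stdin, (host, port), sock)
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

Saved under a hypothetical path such as `/usr/local/bin/udpclient.py`, it could be wired up the same way as the Perl script: `CustomLog "||/usr/local/bin/udpclient.py 127.0.0.1 5001" ls_apache_json`.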
