Nginx weekly reporting w/ GoAccess

Adding hostname to nginx logs

By default, nginx's access log does not record the requested hostname. Since I serve multiple domains with a single nginx, I want the hostname to appear in the access logs.

When no format is specified, nginx uses the predefined combined format:

log_format combined '$remote_addr - $remote_user [$time_local] '
    '"$request" $status $body_bytes_sent '
    '"$http_referer" "$http_user_agent"';

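The new log_format, named combined_with_host, replaces $request with a rebuilt request line that carries the full URL ($scheme://$host$request_uri). The field layout stays the same as combined, so GoAccess can still parse it: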

log_format combined_with_host '$remote_addr - $remote_user [$time_local] '
    '"$request_method $scheme://$host$request_uri $server_protocol" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent"';

Add the lines above to the http { ... } block of your nginx.conf.

Then set the access_log directive to use combined_with_host:

access_log /var/log/nginx/access.log combined_with_host;

Save nginx.conf and reload nginx.
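
After the reload, new log entries should carry the full URL in the request field. A minimal sketch for checking this, assuming the combined_with_host format above; extract_hosts is a hypothetical helper and the sample line is made up:

```shell
#!/usr/bin/env bash
# Sketch: print the host of every request found in a combined_with_host log.
# extract_hosts is a hypothetical helper; it reads log lines on stdin.

extract_hosts() {
    # Splitting on double quotes makes field 2 the request line:
    #   METHOD scheme://host/uri PROTOCOL
    awk -F'"' '{
        split($2, request, " ")            # request[2] is the URL part
        if (request[2] ~ /^[a-z]+:\/\//) { # skip old lines without a scheme
            split(request[2], url, "/")    # pieces: scheme, empty, host, ...
            print url[3]
        }
    }'
}

# Made-up sample line in the combined_with_host format
sample='203.0.113.7 - - [11/Mar/2024:10:15:32 +0000] "GET https://example.com/about?x=1 HTTP/1.1" 200 512 "-" "curl/8.0"'

printf '%s\n' "$sample" | extract_hosts
```

Against a real log, something like extract_hosts < /var/log/nginx/access.log | sort | uniq -c | sort -rn would give a rough request count per host. Lines written before the format change have no scheme in the request field and are skipped.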

Add a cron job

Switch to the root user and add the following cron job with crontab -e. It runs the report script every Monday at 00:00 and appends its output to a log file:

0 0 * * 1 /bin/bash -c "/some/path/to/report.sh" >> /some/path/to/report.log

Serve the reports from nginx w/ basic auth

server {
        # ...

        location /nginx-weekly-reports {
                alias /some/path/to/reports;
                auth_basic "Restricted Content";
                auth_basic_user_file /some/path/to/.htpasswd;
        }
}
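
The auth_basic_user_file above has to exist before nginx can authenticate anyone. A minimal sketch for producing an entry, assuming the openssl command-line tool is installed; htpasswd_line is a hypothetical helper and the credentials are placeholders:

```shell
#!/usr/bin/env bash
# Sketch: print one "user:hash" line suitable for auth_basic_user_file.
# htpasswd_line is a hypothetical helper; it needs the openssl CLI.

htpasswd_line() {
    local user="$1" password="$2"
    local hash
    # -apr1 selects the Apache MD5 variant, which nginx accepts.
    # The password goes through stdin so it does not show up in the
    # process list.
    hash="$(printf '%s\n' "$password" | openssl passwd -apr1 -stdin)"
    printf '%s:%s\n' "$user" "$hash"
}

# Example with made-up credentials
htpasswd_line reports 'change-me'
```

Redirecting the output, for example htpasswd_line reports 'change-me' > /some/path/to/.htpasswd, creates the file. The htpasswd utility from the Apache distribution is an alternative if it is available.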

nginx.conf (logging section)

http {
    ##
    # Logging Settings
    ##

    # Add the host to logged URLs.
    #
    # The predefined default format is combined:
    # log_format combined '$remote_addr - $remote_user [$time_local] '
    #     '"$request" $status $body_bytes_sent '
    #     '"$http_referer" "$http_user_agent"';
    #
    # This is the format I want, adapted from
    # https://stackoverflow.com/a/37877244
    # with the timing fields at the end removed. It has to keep the
    # combined layout because GoAccess parses these logs.
    log_format combined_with_host '$remote_addr - $remote_user [$time_local] '
        '"$request_method $scheme://$host$request_uri $server_protocol" '
        '$status $body_bytes_sent "$http_referer" '
        '"$http_user_agent"';

    # Use the newly defined log format
    access_log /var/log/nginx/access.log combined_with_host;
    error_log /var/log/nginx/error.log;
}

report.sh

#!/usr/bin/env bash
set -e
# Check if running as root; exit with a non-zero status otherwise
if [ "$EUID" -ne 0 ]; then
    echo "Please run as root" >&2
    exit 1
fi
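# Dates of the previous seven days, formatted like nginx's $time_local
# (e.g. 10/Mar/2024). LC_ALL=C keeps the month abbreviation in English.
dates=(
    "$(LC_ALL=C date '+%d/%b/%Y' -d '1 day ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '2 days ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '3 days ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '4 days ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '5 days ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '6 days ago')"
    "$(LC_ALL=C date '+%d/%b/%Y' -d '7 days ago')"
)
# Join the dates with "|" to form an alternation for grep -E
dates_regex=$(IFS="|" ; echo "${dates[*]}")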
echo "==> Reporting process started on $(date)"
report_folder="/some/path/to/reports"
backup_file_path="$report_folder/$(date '+%Y-%m-%d' -d '1 day ago').html"
index_file_path="$report_folder/index.html"
# When piping data into goaccess you need to use `-`
# https://stackoverflow.com/a/64016265
zcat -f /var/log/nginx/access.log* |
    grep -E "$dates_regex" |
    goaccess - --log-format=COMBINED -a -o "$backup_file_path"
# Overwrite index.html with the latest report
rm -f "$index_file_path"
cp "$backup_file_path" "$index_file_path"
echo "Finished at $(date)"
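
The seven hand-written entries of the dates array in report.sh could also be generated in a loop. The following is a minimal sketch, not part of the original script: build_dates_regex is a hypothetical helper, and its two optional parameters (a reference date and a day count) are assumptions added for illustration. Like the script, it relies on GNU date for the -d option. It emits plain / separators, which grep -E matches literally, and pins LC_ALL=C so that %b yields the English month abbreviations nginx writes.

```shell
#!/usr/bin/env bash
# Sketch: build the "d1|d2|...|dN" alternation used to filter the access log.
# build_dates_regex is a hypothetical helper; it requires GNU date (-d).

build_dates_regex() {
    local reference="${1:-today}"   # optional reference date, e.g. 2024-03-11
    local days="${2:-7}"            # how many previous days to include
    local dates=()
    local i
    for ((i = 1; i <= days; i++)); do
        # LC_ALL=C keeps %b in English (Jan, Feb, ...) regardless of locale
        dates+=("$(LC_ALL=C date -d "$reference $i day ago" '+%d/%b/%Y')")
    done
    # Join the array elements with "|"
    local IFS='|'
    echo "${dates[*]}"
}

# Prints the alternation for the seven days before today
build_dates_regex
```

In the script, dates_regex="$(build_dates_regex)" would then take the place of the dates array and the join line.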