@jaygooby
Last active November 22, 2022 14:52
Find interesting referers in access.log - see https://jay.gooby.org/2021/11/30/find-interesting-referers-in-access-log for more details
#!/bin/bash
#
# MIT License
#
# Copyright (c) 2021 Jay Caines-Gooby, @jaygooby, jay@gooby.org
#
# The one-liner from https://jay.gooby.org/2021/11/30/find-interesting-referers-in-access-log
#
# Usage: interesting-referers /path/to/access.log
#        interesting-referers /path/to/access.log example.com
#
# The first form ignores any referer containing the default
# host set below; the second lets you pass a different host
# to ignore.
#
# Edit the jay.gooby.org below to reflect your own web host
log="$1"
host=${2:-jay.gooby.org}
# Leave the rest as is
[ -z "$log" ] && echo "Usage: $(basename "$0") /path/to/access.log [hostname]" >&2 && exit 1
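# Keep requests whose path has no file extension or ends in .htm/.html,
# and whose referer is set and does not match $host, then count each
# path/referer pair, least frequent first.
# (The original test /\.[^html]/ is a negated character class: it matches
# a dot followed by any single character other than h, t, m or l, so
# paths such as /robots.txt slipped through.)
zcat -f "$log" | awk -v host="$host" '{if (($7 !~ /\.[A-Za-z0-9]+(\?|$)/ || $7 ~ /\.html?(\?|$)/) && $11 != "\"-\"" && $11 !~ host) {print $7 "\t" $11}}' | sort | uniq -c | sort -n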
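
The one-liner relies on awk's default whitespace splitting: in a combined-format access log, `$7` is the request path and `$11` is the quoted referer. Here is a minimal sketch of just the referer conditions, run against a throwaway sample log. The three log lines, the IP addresses and `news.example.org` are made up for illustration, and the file-extension check is left out to keep the example short.

```shell
#!/bin/bash
# Hypothetical sample data: three made-up requests in combined log format.
log="$(mktemp)"
cat > "$log" <<'EOF'
203.0.113.5 - - [30/Nov/2021:10:00:00 +0000] "GET /post HTTP/1.1" 200 1234 "https://news.example.org/item" "Mozilla/5.0"
203.0.113.6 - - [30/Nov/2021:10:00:01 +0000] "GET /post HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
203.0.113.7 - - [30/Nov/2021:10:00:02 +0000] "GET /about HTTP/1.1" 200 1234 "https://jay.gooby.org/post" "Mozilla/5.0"
EOF

# awk splits on whitespace, so $7 is the request path and $11 is the quoted referer.
# Keep only lines whose referer is neither "-" (empty) nor our own host.
result="$(awk -v host="jay.gooby.org" '$11 != "\"-\"" && $11 !~ host {print $7 "\t" $11}' "$log")"
echo "$result"

rm -f "$log"
```

Only the first request should survive: the second has no referer and the third was referred from the host being ignored. Note that `host` is used as a regular expression, so the dots in it match any character; for a quick log trawl that is usually close enough.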