@janl
Created September 25, 2012 17:11
Bash script to parse Apache log for a count of RSS subscribers and email it to you
#!/bin/sh -e
# sh for portability, -e for halt on errors
# Schedule this to run once a day with cron. Doesn't matter what time since it parses yesterday's hits (by default).
# I only tested this on the Marco.org server, which runs CentOS (RHEL). No idea how it'll work on other distributions, but it's pretty basic.
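# Example crontab entry for the daily schedule described above. This is a sketch:
# the script path is hypothetical, and 00:05 is an arbitrary choice of time.
#   5 0 * * * /usr/local/bin/rss-subscribers.sh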
# Required variables:
RSS_URI="/rss"
MAIL_TO="your@email.com"
LOG_FILE="/var/log/httpd/access_log"
# --- Optional customization ---
MAIL_SUBJECT="RSS feed subscribers"
# Date expression for yesterday
DATE="-1 day"
# Locale for printf number formatting (e.g. "10000" => "10,000")
LANG=en_US
# Date format in Apache log (note: "date -d" is a GNU date option)
LOG_FDATE=$(date -d "$DATE" '+%d/%b/%Y')
# Date format for display in emails
HUMAN_FDATE=$(date -d "$DATE" '+%F')
# --- The actual log parsing ---
# Unique IPs requesting RSS, except those reporting "subscribers":
IPSUBS=$(grep -F "$LOG_FDATE" "$LOG_FILE" | grep -F " $RSS_URI" | grep -Ev '[0-9]+ subscribers' | cut -d' ' -f 1 | sort -u | wc -l)
# Google Reader subscribers and other user-agents reporting "subscribers" and using the "feed-id" parameter for uniqueness:
# ("print s+0" makes awk print 0 rather than an empty string when nothing matches)
GRSUBS=$(grep -F "$LOG_FDATE" "$LOG_FILE" | grep -F " $RSS_URI" | grep -Eo '[0-9]+ subscribers; feed-id=[0-9]+' | sort -u | cut -d' ' -f 1 | awk '{s+=$1} END {print s+0}')
# Other user-agents reporting "subscribers", for which we'll use the entire user-agent string for uniqueness:
# ("print s+0" makes awk print 0 rather than an empty string when nothing matches)
OTHERSUBS=$(grep -F "$LOG_FDATE" "$LOG_FILE" | grep -F " $RSS_URI" | grep -Fv 'subscribers; feed-id=' | grep -E '[0-9]+ subscribers' | grep -Eo '"[^"]+"$' | sort -u | grep -Eo '[0-9]+ subscribers' | awk '{s+=$1} END {print s+0}')
REPORT=$(
printf "Feed stats for %s:\n\n" "$HUMAN_FDATE"
printf "%'8d Google Reader subscribers\n" $GRSUBS
printf "%'8d subscribers from other aggregators\n" $OTHERSUBS
printf "%'8d direct subscribers\n" $IPSUBS
echo "--------"
printf "%'8d total subscribers\n" $((GRSUBS + OTHERSUBS + IPSUBS))
)
echo "$REPORT"
echo ""
echo "Also emailed to $MAIL_TO."
echo "$REPORT " | mail -s "[$HUMAN_FDATE] $MAIL_SUBJECT" "$MAIL_TO"
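
To see how the three counts are derived, the sketch below runs the same pipelines against a few made-up log lines. Everything in the sample data is an assumption for illustration: the IP addresses, the "ExampleReader" user-agent, the subscriber numbers, and the date. The `rss_hits` helper is also hypothetical; it only factors out the two filters that every count starts with. The sketch assumes the Apache combined log format, where the user-agent is the last quoted field on the line.

```shell
#!/bin/sh
# Made-up sample log in Apache combined format (not real traffic).
LOG_FILE=$(mktemp)
cat > "$LOG_FILE" <<'EOF'
203.0.113.5 - - [24/Sep/2012:10:00:00 +0000] "GET /rss HTTP/1.1" 200 5120 "-" "NetNewsWire/3.3"
203.0.113.5 - - [24/Sep/2012:11:00:00 +0000] "GET /rss HTTP/1.1" 200 5120 "-" "NetNewsWire/3.3"
203.0.113.6 - - [24/Sep/2012:11:30:00 +0000] "GET /rss HTTP/1.1" 200 5120 "-" "curl/7.26.0"
198.51.100.7 - - [24/Sep/2012:12:00:00 +0000] "GET /rss HTTP/1.1" 200 5120 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1200 subscribers; feed-id=12345)"
198.51.100.7 - - [24/Sep/2012:18:00:00 +0000] "GET /rss HTTP/1.1" 200 5120 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1200 subscribers; feed-id=12345)"
192.0.2.9 - - [24/Sep/2012:13:00:00 +0000] "GET /rss HTTP/1.1" 304 - "-" "ExampleReader/1.0 (35 subscribers)"
EOF
LOG_FDATE="24/Sep/2012"
RSS_URI="/rss"

# Hypothetical helper: lines for the chosen date that request the feed URI.
rss_hits() {
    grep -F "$LOG_FDATE" "$LOG_FILE" | grep -F " $RSS_URI"
}

# Direct subscribers: unique client IPs whose user-agent reports no subscriber count.
IPSUBS=$(rss_hits | grep -Ev '[0-9]+ subscribers' | cut -d' ' -f 1 | sort -u | wc -l)

# Aggregators sending a feed-id: de-duplicate on "N subscribers; feed-id=M", then sum N.
GRSUBS=$(rss_hits | grep -Eo '[0-9]+ subscribers; feed-id=[0-9]+' | sort -u \
    | cut -d' ' -f 1 | awk '{s+=$1} END {print s+0}')

# Other aggregators: de-duplicate on the whole user-agent string, then sum the counts.
OTHERSUBS=$(rss_hits | grep -Fv 'subscribers; feed-id=' | grep -E '[0-9]+ subscribers' \
    | grep -Eo '"[^"]+"$' | sort -u | grep -Eo '[0-9]+ subscribers' \
    | awk '{s+=$1} END {print s+0}')

TOTAL=$((IPSUBS + GRSUBS + OTHERSUBS))
printf 'direct=%d feed-id=%d other=%d total=%d\n' "$IPSUBS" "$GRSUBS" "$OTHERSUBS" "$TOTAL"
# prints: direct=2 feed-id=1200 other=35 total=1237

rm -f "$LOG_FILE"
```

The two requests from 203.0.113.5 collapse into one direct subscriber, and the two Feedfetcher requests collapse into a single 1200-subscriber report, which is why de-duplication happens before the sums.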