@colrichie
Last active February 23, 2020 01:56
APALOGNORM is a normalizer for logs in the Apache combined format. For example, the following command gives you the referer field:
$ apalognorm http-access.log | awk '{print $8}'
#! /bin/sh
######################################################################
#
# APALOGNORM : a normalizer for logs of the Apache combined format
#
# Written by Rich Mikan (richmikan[at]richlab.org) on 2014/01/04
#
# Usage : apalognorm [-s string] <logfile>
#         <logfile> must be in the Apache combined log format.
#         -s  Set the string that replaces each space character
#             (default is "_").
#
# * This filter replaces every space inside quoted ("...") and
#   bracketed ([...]) fields with another string, so that each field
#   becomes a single whitespace-delimited column that ordinary Unix
#   commands can handle.
# * e.g. the following command gives you the referer field.
#   $ apalognorm http-access.log | awk '{print $8}'
#
######################################################################
# definition: print the usage and exit
print_usage_and_exit () {
cat <<-__USAGE 1>&2
Usage : ${0##*/} [-s string] <logfile>
        <logfile> must be in the Apache combined log format.
        -s  Set the string that replaces each space character
            (default is "_").
Version : Mon May 5 11:28:22 JST 2014
__USAGE
exit 1
}
# parse the arguments
s_opt='_'
file=''
i=0
optmode=''
for arg in "$@"; do
  i=$((i+1))
  if [ -z "$optmode" ]; then
    case "$arg" in
      -s*)
        # "-sSTR" carries the string inline; a bare "-s" takes the next argument
        ret=$(echo "_${arg#-s}" | sed '1s/^_//')
        if [ -n "$ret" ]; then
          s_opt=$ret
        else
          optmode='s'
        fi
        ;;
      *)
        # the log file is accepted only once, and only as the last argument
        if [ -z "$file" ]; then
          [ $i -eq $# ] || print_usage_and_exit
          file=$arg
        else
          print_usage_and_exit
        fi
        ;;
    esac
  elif [ "$optmode" = 's' ]; then
    s_opt=$arg
    optmode=''
  else
    print_usage_and_exit
  fi
done
# accept a regular file, a character device, a named pipe, "-" (stdin),
# or no file at all (which also means stdin)
if [ -n "$file" ] && [ "_$file" != '_-' ] &&
   [ ! -f "$file" ] && [ ! -c "$file" ] && [ ! -p "$file" ]; then
  echo "${0##*/}: No such file found" 1>&2
  print_usage_and_exit
elif [ -z "$file" ]; then
  file='-'
fi
# escape "\", "&" and "/" in the substitute string so that it can be
# used safely as the replacement part of a sed "s" command
sub=$(echo "_$s_opt"           |
      sed '1s/^_//'            |
      sed 's/\([\&/]\)/\\\1/g' )
# Define the marks used during the conversion
RS=$(printf '\036') # a mark for the real newlines
LF=$(printf '\\\n_');LF=${LF%_} # backslash + newline: a line break in a sed replacement
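# Convert the file
if [ "_$file" = '_-' ]; then             # 1) append RS to every real line
  sed 's/^\(.*\)$/\1'"$RS"'/'            #    (reading from stdin ...
else                                     #
  sed 's/^\(.*\)$/\1'"$RS"'/' "$file"    #     ... or from the named file)
fi                                          |
sed 's/"\([^"]*\)"/'"$LF"'"\1"'"$LF"'/g'    | # 2) put each "..." field on its own line
sed 's/\[\([^]]*\)\]/'"$LF"'[\1]'"$LF"'/g'  | # 3) do the same for each [...] field
sed '/^["[]/s/[[:blank:]]/'"$sub"'/g'       | # 4) replace blanks on those lines only
tr -d '\n'                                  | # 5) join everything back together
tr "$RS" '\n'                                 # 6) restore the real line breaks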
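
As a rough illustration of what the pipeline does to a single record, here is a minimal sketch that runs without installing the script. The log line is made up for the example, and `normalize` is a hypothetical helper that inlines the same sed/tr stages; unlike the script, it assumes the substitute string contains no `\`, `&` or `/`, so it skips the escaping step.

```sh
# Sketch only: "normalize" is a hypothetical helper, not part of apalognorm.
normalize() {
  sub=$1                              # replacement for blanks (assumed free of \, & and /)
  RS=$(printf '\036')                 # marks the end of each real line
  LF=$(printf '\\\n_'); LF=${LF%_}    # backslash + newline for sed replacements
  sed 's/$/'"$RS"'/'                          |  # append RS to every line
  sed 's/"\([^"]*\)"/'"$LF"'"\1"'"$LF"'/g'    |  # isolate "..." fields
  sed 's/\[\([^]]*\)\]/'"$LF"'[\1]'"$LF"'/g'  |  # isolate [...] fields
  sed '/^["[]/s/[[:blank:]]/'"$sub"'/g'       |  # replace blanks inside them
  tr -d '\n'                                  |  # join the pieces again
  tr "$RS" '\n'                                  # restore real line breaks
}

# A made-up record in the Apache combined format
sample='192.0.2.1 - - [05/May/2014:11:28:22 +0900] "GET /index.html HTTP/1.1" 200 1234 "http://example.com/start" "Mozilla/5.0 (X11; Linux)"'

# Field 8 of the raw line is a fragment of the request, not the referer
raw=$(printf '%s\n' "$sample" | awk '{print $8}')

# After normalization every field is exactly one column
normalized=$(printf '%s\n' "$sample" | normalize '_')
referer=$(printf '%s\n' "$normalized" | awk '{print $8}')

printf 'raw $8        : %s\n' "$raw"
printf 'normalized    : %s\n' "$normalized"
printf 'normalized $8 : %s\n' "$referer"
```

On the raw line, `$8` is `HTTP/1.1"`, because the timestamp and the request are each split across several columns. After normalization the record reads `192.0.2.1 - - [05/May/2014:11:28:22_+0900] "GET_/index.html_HTTP/1.1" 200 1234 "http://example.com/start" "Mozilla/5.0_(X11;_Linux)"`, and `$8` is the quoted referer.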
@colrichie

2014/05/05: fixed a serious bug
2014/01/04: support the pseudo file "-"
2013/11/03: improved (faster processing)
2013/11/01: first release
