@j05h
Created July 9, 2010 16:27
awk script to parse access logs
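awk '{
  # Collapse the first run of digits (optionally dot-separated) in the URL field to ":id".
  sub(/([0-9]+\.?)+/, ":id", $3);
  # Collapse the next remaining run of digits as well.
  sub(/[0-9]+/, ":id", $3);
  # Replace query-string values with ":param".
  sub(/=.*&/, "=:param\\&", $3);
  sub(/=.*$/, "=:param", $3);
  # Drop a trailing bare query string such as "?:id".
  sub(/\?[:a-z0-9]+$/, "", $3);
  print $1, $2, $3, $4
}' | sort | uniq -c | sort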

j05h commented Jul 9, 2010

An awk script that parses access logs and replaces IDs and query parameters with placeholders, so that URLs become generic and can be sorted and counted with sort and uniq.
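As a rough usage sketch, the same pipeline can be wrapped in a shell function and fed a few sample lines. The function name `normalize_urls` and the sample log lines are made up for illustration, and the four-field layout (host, method, URL, status) is an assumption; the only thing the script itself requires is that the URL is the third whitespace-separated field.

```shell
# Hypothetical wrapper: normalize_urls is an illustrative name, not part of the gist.
normalize_urls() {
  awk '{ sub(/([0-9]+\.?)+/, ":id", $3);
         sub(/[0-9]+/, ":id", $3);
         sub(/=.*&/, "=:param\\&", $3);
         sub(/=.*$/, "=:param", $3);
         sub(/\?[:a-z0-9]+$/, "", $3);
         print $1, $2, $3, $4
       }' | sort | uniq -c | sort
}

# Made-up sample lines; the field layout (host, method, URL, status) is an assumption.
normalize_urls <<'EOF'
web1 GET /users/42 200
web1 GET /users/57 200
web1 GET /users/42/posts/7 200
web1 GET /search?q=awk&page=2 200
web1 GET /search?q=sed 200
EOF
```

With this input, the two `/users/N` requests collapse into a single `/users/:id` entry counted twice, the nested path becomes `/users/:id/posts/:id`, and both search requests become `/search?q=:param`. The exact padding of the counts depends on the local `uniq` implementation.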
