Skip to content

Instantly share code, notes, and snippets.

@chr15m
Last active May 13, 2022 23:46
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save chr15m/7ec8988164ddc68dd624ec9c2d95c840 to your computer and use it in GitHub Desktop.
Save chr15m/7ec8988164ddc68dd624ec9c2d95c840 to your computer and use it in GitHub Desktop.
Extract only the latest log lines from apache/nginx logs
#!/usr/bin/awk -f
# Usage:
# zcat -f access.log.* | ./extract-latest-log-date.awk > latest-log.txt
BEGIN {
# pre-compute months field lookup
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
months[d[o]]=sprintf("%02d",o)
}
# date field separators
FS="[]]|[[]"
}
# for each line
{
# break the date field into components
split($2, dt, "[: /]")
# build a new comparable date sequence
datestamp = (dt[3] " " months[dt[2]] " " dt[1] " " dt[4] " " dt[5] " " dt[6])
# find if this line has a more recent date than the latest known one
if (datestamp > latest) {
latest = datestamp;
latest_stamp = $2;
}
}
# once we're done, output the result
END {
print latest_stamp
}
#!/usr/bin/awk -f
# Usage:
# (once you've got a date from parsing some previous logs above)
# zcat -f access.log.* | ./log-lines-after-date.awk -vcheck="`cat latest-log.txt`"
BEGIN {
if (check == "" ) {
print "Usage: log-lines-after-date.awk -vcheck='DATE'"
print "Where DATE is e.g. 04/Jun/2017:07:44:10"
exit 1
}
# pre-compute months field lookup
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
months[d[o]]=sprintf("%02d",o)
}
# pre-compute our check date in a comparable format
split(check, dti, "[: /]")
dcheck = (dti[3] " " months[dti[2]] " " dti[1] " " dti[4] " " dti[5] " " dti[6])
# date field separators
FS="[]]|[[]"
}
# for each line
{
# break the date field into components
split($2, dt, "[: /]")
# build a new comparable date sequence
datestamp = (dt[3] " " months[dt[2]] " " dt[1] " " dt[4] " " dt[5] " " dt[6])
# find if this line has a more recent date than the check date
if (datestamp > dcheck) {
print $0
}
}
@chr15m
Copy link
Author

chr15m commented Jun 22, 2017

If you have a bunch of log files (access.log.*) and you want to only process newly added/changed lines when they are updated you can use these scripts to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment