@ildar-shaimordanov
Last active August 6, 2023 18:55
Print last starting entries found in all available log files

There are two kinds of magic: black and white. Pick your spell:

  • Gandalf's formulae use the white magic of awk or perl
  • Saruman's formulae use black magic, trickily blending grep, sed, awk, tac and sort

Description

  1. process the files matching FILE_PATTERN
  2. select the lines matching START_PATTERN
  3. from the lines found in step 2, take the last entry in each file
  4. from the entries found in step 3, keep those dated DAYS_AGO days ago or later
  5. print everything found in step 4
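
The five steps above can be sketched in plain POSIX awk, without the gawk extensions used by the awk script in this gist. This is only a minimal sketch: `last_start_entries` is a hypothetical helper name, the cutoff is passed in as YYYYMMDD, and every entry is assumed to start with a zero-padded MM/DD/YYYY or MM-DD-YYYY date.

```shell
#!/bin/sh
# Sketch only: last_start_entries is a hypothetical helper, not part of the gist.
# usage: last_start_entries CHECK_DATE START_PATTERN FILE...
#   CHECK_DATE is YYYYMMDD; entries are assumed to start with a zero-padded
#   MM/DD/YYYY or MM-DD-YYYY date.
last_start_entries() {
    check_date=$1
    start_pattern=$2
    shift 2
    awk -v start_pattern="$start_pattern" -v check_date="$check_date" '
    # Turn the leading MM/DD/YYYY (or MM-DD-YYYY) into YYYYMMDD.
    function ymd(line,    p) {
        if (split(substr(line, 1, 10), p, "[/-]") != 3) return ""
        return p[3] p[1] p[2]
    }
    # Steps 4-5: print the remembered entry if it is recent enough.
    function report() {
        if (last != "" && ymd(last) >= check_date) print file, last
    }
    # Step 1: a new file begins; report the previous one first.
    FNR == 1 { if (NR > 1) report(); file = FILENAME; last = "" }
    # Steps 2-3: remember the latest matching line of the current file.
    $0 ~ start_pattern { last = $0 }
    END { if (NR > 0) report() }
    ' "$@"
}

# Demo on a throwaway log file: only the 07/17 entry is the last match.
demo_dir=$(mktemp -d)
printf '%s\n' \
    '07/11/2023 17:25:49.685961 CST ==== test-2 Starting ====' \
    '07/17/2023 15:25:39.537704 CST ==== test-2 Starting ====' \
    '07/18/2023 17:00:01.231223 CST do something' > "$demo_dir/test-2.log"
last_start_entries 20230717 '====' "$demo_dir"/*.log
rm -rf "$demo_dir"
```

Each output line is the file name followed by the last matching entry. A file whose last matching entry is older than the cutoff prints nothing, even if it contains more recent lines that do not match.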

Usage

  1. put it somewhere in the system
  2. adjust it according to your requirements (at least set DAYS_AGO, and possibly FILE_PATTERN)
  3. make it executable
  4. run it

The test log files are provided.
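
Since DAYS_AGO is the one setting every variant needs, it may help to see how it becomes the YYYYMMDD cutoff the scripts compare against. The sketch below wraps the same arithmetic in a function; `cutoff_date` and its optional second argument (the current time in epoch seconds) are hypothetical names added for illustration, and it assumes a `date` that accepts `-d @SECONDS`.

```shell
#!/bin/sh
# Sketch only: cutoff_date is a hypothetical helper name.
# usage: cutoff_date DAYS_AGO [NOW]
#   NOW is in seconds since the epoch and defaults to the current time.
cutoff_date() {
    days_ago=$1
    now=${2:-$(date +%s)}
    # Round NOW down to a whole number of days, step back DAYS_AGO days,
    # and convert back to seconds.
    ts=$(( (now / 86400 - days_ago) * 86400 ))
    # "-d @SECONDS" is understood by GNU and BusyBox date (an assumption
    # about the target system; BSD date uses -r instead).
    date -d "@$ts" '+%Y%m%d'
}

# 1689642000 is 2023-07-18 01:00:00 UTC, so one day back is 2023-07-17.
( TZ=UTC; export TZ; cutoff_date 1 1689642000 )   # prints 20230717
cutoff_date 1                                      # DAYS_AGO=1 relative to now
```

The timestamp is rounded to midnight UTC but printed in local time, so in time zones west of UTC the resulting cutoff falls one calendar day earlier than the UTC date.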

#!/bin/sh
# =========================================================================
DAYS_AGO=1
FILE_PATTERN='*.log'
START_PATTERN='===='
# =========================================================================
main() {
    # workaround for BusyBox not supporting -d 'N days ago' format
    CHECK_DATE="$(
        ts="$(( ($( date +%s ) / 86400 - DAYS_AGO) * 86400 ))"
        date '+%Y%m%d' -d "@$ts"
    )"
    # shellcheck disable=SC2086
    print_last_start_entry $FILE_PATTERN
}
# =========================================================================
# NOTE: BEGINFILE/ENDFILE and the three-argument match() are GNU awk
# extensions, so "awk" must be gawk here.
print_last_start_entry() {
    awk -v start_pattern="$START_PATTERN" -v check_date="$CHECK_DATE" '
    # reset the remembered entry at the start of every file
    BEGINFILE {
        last_start_entry = "";
    }
    # remember the most recent matching line
    $0 ~ start_pattern {
        last_start_entry = $0;
    }
    ENDFILE {
        if ( check_last_start_entry() ) print FILENAME, last_start_entry;
    }
    # true if the last entry exists and is dated check_date or later
    function check_last_start_entry() {
        if ( last_start_entry == "" ) return 0;
        match(last_start_entry, /([0-9]+)[\/-]([0-9]+)[\/-]([0-9]+)/, res);
        # MM/DD/YYYY -> YYYYMMDD
        entry_date = res[3]res[1]res[2];
        return entry_date >= check_date;
    }' "$@"
}
# =========================================================================
main "$@"
# =========================================================================
# EOF
#!/usr/bin/env perl
=pod

=head1 SYNOPSIS

    print_last_start_entry --help
    print_last_start_entry [OPTIONS] [FILES]

=head1 DESCRIPTION

This is a C<grep>-like script that finds and displays the last starting
entry in each log file.

By default, it scans the log files for lines containing C<Starting> or
C<Started> and prints the last one in each file, provided it is dated
yesterday or later.

It is assumed that each log entry starts with a date in one of the
following formats: C<MM-DD-YYYY> or C<MM/DD/YYYY>.

=head1 OPTIONS

=over

=item B<--help>

Print this help and exit.

=item B<-d> C<DAYS>, B<--days-ago>=C<DAYS>

Print the latest starting entries dated C<DAYS> days ago or later. The
default is 1.

=item B<-D> C<DATE>, B<--since-date>=C<DATE>

Print the latest starting entries since C<DATE>. The C<DATE> can be one
of C<YYYYMMDD>, C<YYYY-MM-DD> or C<YYYY/MM/DD>.

=item B<-e> C<REGEXP>, B<--regexp>=C<REGEXP>

Pattern to match the starting entry. The default matches C<Starting> or
C<Started>.

=item B<-i>, B<--ignore-case>

Case insensitive search.

=item B<-w>, B<--word-regexp>

Match whole words only.

=back

=cut
# =========================================================================
use strict;
use warnings;
no warnings "utf8";
use open qw( :std :utf8 );
use Pod::Usage;
use Getopt::Long qw( :config no_ignore_case bundling auto_version );
my $ONE_DAY = 24 * 60 * 60;
my $days_ago = 1;
my $check_date;
my $pattern = "Start(ing|ed)";
my $ignore_case;
my $word_regexp;
my $start_pattern;
# =========================================================================
pod2usage unless @ARGV and GetOptions(
    "help" => sub { pod2usage({ -verbose => 2, -noperldoc => 1 }); },
    "d|days-ago=i" => \$days_ago,
    "D|since-date=s" => sub {
        $_[1] =~ m#^ (\d{4}) ([-/]?) (\d\d) \2 (\d\d) $#x
            or die "Unable to recognize date: $_[1]\n";
        $check_date = "$1$3$4";
    },
    "e|regexp=s" => \$pattern,
    "i|ignore-case" => \$ignore_case,
    "w|word-regexp" => \$word_regexp,
);
# =========================================================================
$check_date or $check_date = do {
    use integer;
    $days_ago > 0 or die "Positive integer required: $days_ago\n";
    my $t = (time / $ONE_DAY - $days_ago) * $ONE_DAY;
    my @t = localtime($t);
    sprintf "%04d%02d%02d", $t[5] + 1900, $t[4] + 1, $t[3];
};
$start_pattern = do {
    my $re = $pattern;
    # group the pattern so that \b applies to every alternative
    $re = "\\b(?:$re)\\b" if $word_regexp;
    eval { $ignore_case ? qr/$re/i : qr/$re/ }
        or die "Bad regexp: $pattern: $@";
};
# =========================================================================
sub print_last_start_entry {
    my $filename = $_[0];
    # three-argument open: the name is never interpreted as a mode or a pipe
    open my $FILE, "<", $filename or do {
        warn "Unable to read the file: $filename: $!\n";
        return;
    };
    my $last_start_entry;
    while ( <$FILE> ) {
        next unless m#$start_pattern#;
        $last_start_entry = $_;
    }
    close $FILE;
    return unless $last_start_entry;
    # skip entries that do not start with MM/DD/YYYY or MM-DD-YYYY
    $last_start_entry =~ m#^(\d+)[/-](\d+)[/-](\d+)# or return;
    my $entry_date = "$3$1$2";
    return unless $entry_date ge $check_date;
    print "$filename $last_start_entry";
}
# =========================================================================
print_last_start_entry($_) for @ARGV;
# =========================================================================
# EOF
#!/bin/sh
# =========================================================================
DAYS_AGO=1
FILE_PATTERN='*.log'
START_PATTERN='===='
# =========================================================================
# workaround for BusyBox not supporting -d 'N days ago' format
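CHECK_DATE="$(
    ts="$(( ($( date +%s ) / 86400 - DAYS_AGO) * 86400 ))"
    date '+%Y%m%d' -d "@$ts"
)"
# grep prefixes every matching line with "FILE:"; sed turns it into a
# three-line record (FILE, YYYYMMDD, original entry) ended by an empty line;
# awk keeps, per file, the last entry dated check_date or later.
# shellcheck disable=SC2086
grep -H -e "$START_PATTERN" $FILE_PATTERN \
    | sed -r 's/:(([0-9]+)[/-]([0-9]+)[/-]([0-9]+))/\n\4\2\3\n\1/; s/$/\n/' \
    | awk -v RS="\n\n" -v FS="\n" -v check_date="$CHECK_DATE" \
        '$2 >= check_date { s[$1] = $3 } END { for (p in s) print p, s[p] }' \
    | sort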
# =========================================================================
# EOF
#!/bin/sh
# =========================================================================
DAYS_AGO=1
FILE_PATTERN='*.log'
START_PATTERN='===='
# =========================================================================
# workaround for BusyBox not supporting -d 'N days ago' format
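CHECK_DATE="$(
    ts="$(( ($( date +%s ) / 86400 - DAYS_AGO) * 86400 ))"
    date '+%Y%m%d' -d "@$ts"
)"
# found here: https://stackoverflow.com/a/60019351/3627676
# tac reverses the grep output so the last entry of each file comes first;
# sort -u on the file name keeps one line per file (the trick relies on it
# being the first one seen); awk then drops entries older than check_date.
# The three-argument match() requires GNU awk.
# shellcheck disable=SC2086
grep -H -e "$START_PATTERN" $FILE_PATTERN \
    | tac \
    | sort -u -t: -k1,1 \
    | awk -F: -v check_date="$CHECK_DATE" \
        'match($2, /([0-9]+)[\/-]([0-9]+)[\/-]([0-9]+)/, r) && r[3]r[1]r[2] >= check_date'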
# =========================================================================
# EOF
07/18/2023 01:00:01.231223 CST do something
07/18/2023 01:10:01.231223 CST do something
07/18/2023 01:25:39.537704 CST ==== test-1 Starting ====
07/18/2023 02:00:01.231223 CST do something
07/11/2023 16:00:01.231223 CST do something
07/11/2023 17:00:01.231223 CST do something
07/11/2023 17:10:01.231223 CST do something
07/11/2023 17:25:49.685961 CST ==== test-2 Starting ====
07/11/2023 18:00:01.231223 CST do something
07/11/2023 19:00:01.231223 CST do something
07/11/2023 19:10:01.231223 CST do something
07/12/2023 11:41:57.476841 CST ==== test-2 Starting ====
07/12/2023 11:50:01.231223 CST do something
07/17/2023 14:00:01.231223 CST do something
07/17/2023 14:10:01.231223 CST do something
07/17/2023 15:25:39.537704 CST ==== test-2 Starting ====
07/17/2023 16:00:01.231223 CST do something
07/18/2023 17:00:01.231223 CST do something
07/18/2023 19:10:01.231223 CST do something