
@yesnault
Created January 26, 2014 16:18
Logstash - Query Sincedb for Processed Files
#!/bin/bash
#
#
# Usage
# This script changes nothing. It only shows which files have been scanned
# by the Logstash file input plugin and which have not. The output can be filtered:
#
# ./utilSinceDB.sh
# ./utilSinceDB.sh | grep "not found in sincedb"
# ./utilSinceDB.sh | grep "to delete"
# ./utilSinceDB.sh | grep "to NOT delete"
#
# See CONFIGURATION below to match your Logstash setup.
#
# File not yet parsed by Logstash.
# Example output:
# /var/log//com.apple.launchd/launchd-shutdown.system.log 61200886 not found in sincedb file
#
# File already parsed by Logstash, but not yet fully scanned:
# Example output:
# /var/log//appstore.log 55517923 found in sincedb file actual_size:13629291 scanned_size:16141036
#
# File already parsed by Logstash and fully scanned, last modified within the past $DAY_TO_KEEP_FILE_SCANNED days:
# Example output:
# /var/log//appstore.log 55517923 found in sincedb file actual_size:13629291 scanned_size:13629291 [fully scanned since 201401261538] to NOT delete
#
# File already parsed by Logstash and fully scanned, not modified for more than $DAY_TO_KEEP_FILE_SCANNED days:
# Example output:
# /var/log//hdiejectd.log 58505230 found in sincedb file actual_size:1147 scanned_size:1147 [fully scanned since 201401192118] to delete
#
## ----------------------------------------------
## CONFIGURATION
## ----------------------------------------------
# Path of the log files scanned by Logstash
PATH_FILES="/var/log/"
# Name pattern of the log files scanned by Logstash
FILE_PATTERN="*.log"
# Glob matching the sincedb files.
PATH_SINCE_DB=~/.sincedb*
# Display "to delete" if the file is fully scanned and was last
# modified more than $DAY_TO_KEEP_FILE_SCANNED days ago.
# Display "to NOT delete" if the file is fully scanned and was
# modified within the last $DAY_TO_KEEP_FILE_SCANNED days.
DAY_TO_KEEP_FILE_SCANNED=1
## END OF CONFIGURATION
## ----------------------------------------------
# Current time in seconds since the epoch, used to compute file age.
NOW=$(date +%s)
# Redirect errors in case find has no permission to enter a sub-directory.
# The pattern is quoted so that find, not the shell, expands it.
# Note: the loop splits on whitespace, so file names with spaces are not supported.
for file in $(find "$PATH_FILES" -name "$FILE_PATTERN" 2> /dev/null); do
    inode=$(ls -i "$file" | awk '{print $1}')
    txt="$file $inode"
    actual_size=$(ls -l "$file" | awk '{print $5}')
    # $PATH_SINCE_DB is left unquoted on purpose so that the glob expands.
    # Match the inode against the whole first field, so that a shorter
    # inode number cannot match inside a longer one.
    scanned_sizes=$(awk -v inode="$inode" '$1 == inode {print $4}' $PATH_SINCE_DB 2> /dev/null)
    if [ -n "$scanned_sizes" ]; then
        txt="$txt found in sincedb file actual_size:$actual_size"
        for scanned_size in $scanned_sizes; do
            txtv="scanned_size:$scanned_size"
            if [ "$actual_size" -eq "$scanned_size" ]; then
                # Last modification time: epoch seconds for the comparison,
                # YYYYmmddHHMM for display.
                mtime=$(perl -le 'print((stat shift)[9])' "$file")
                date_last_modification=$(perl -MPOSIX=strftime -le 'print strftime("%Y%m%d%H%M", localtime(shift))' "$mtime")
                txtv="$txtv [fully scanned since $date_last_modification]"
                # Compare ages in seconds. Subtracting two YYYYmmddHHMM
                # stamps gives wrong results across month boundaries.
                delta=$((NOW - mtime))
                max_age=$((DAY_TO_KEEP_FILE_SCANNED * 86400))
                if [ "$delta" -gt "$max_age" ]; then
                    txtv="$txtv to delete"
                else
                    txtv="$txtv to NOT delete"
                fi
            fi
            echo "$txt $txtv"
        done
    else
        echo "$txt not found in sincedb file"
    fi
done
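
The inode lookup at the heart of the script can be isolated in a small function. The following is a minimal sketch, not part of the original gist: the function name `sincedb_offset` and the sample values are hypothetical, and it assumes each sincedb line has the form "inode major minor offset", which is what the script relies on when it reads the fourth field. Comparing the whole first field avoids one inode number matching inside a longer one.

```shell
#!/bin/bash
# Hypothetical helper: print the offset(s) recorded for an inode.
# Assumption: each sincedb line is "inode major minor offset".
# Usage: sincedb_offset INODE SINCEDB_FILE...
sincedb_offset() {
    local inode=$1
    shift
    # Compare against the whole first field, not a substring.
    awk -v inode="$inode" '$1 == inode { print $4 }' "$@"
}

# Demo with a throwaway sincedb file; the values are illustrative only.
demo_db=$(mktemp)
printf '%s\n' '55517923 1 4 13629291' '5551792 1 4 42' > "$demo_db"
sincedb_offset 5551792 "$demo_db"   # prints 42, not the offset of 55517923
rm -f "$demo_db"
```

A plain `grep 5551792` on the same file would return both lines, because the shorter number is a substring of the longer one.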
@JazzyJes

Very helpful, thank you. I've ported your shell script into this Java spring integration filter: https://gist.github.com/jeschergui/7d43cd8f2b4aafb6f0fa
Cheers

@syunusic

syunusic commented Dec 7, 2015

Excellent! Very useful. Thanks!

@praveenmak

Can scanned_size be greater than actual_size?
In other words, this will never be satisfied:
if [ $actual_size -eq $scanned_size ];
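
On the question above: the example in the script header (actual_size:13629291 scanned_size:16141036) suggests the recorded offset can exceed the current size. A likely cause, stated here as an assumption, is that the file was truncated or replaced after the offset was written. In that case the `-eq` test is false and the file is reported without a "fully scanned" note. A minimal sketch that separates the three cases, with a hypothetical function name `scan_state` that is not in the gist:

```shell
#!/bin/bash
# Hypothetical helper: classify a file from its current size and the
# offset recorded in sincedb.
# Usage: scan_state ACTUAL_SIZE SCANNED_SIZE
scan_state() {
    local actual_size=$1 scanned_size=$2
    if [ "$scanned_size" -lt "$actual_size" ]; then
        echo "partially scanned"
    elif [ "$scanned_size" -eq "$actual_size" ]; then
        echo "fully scanned"
    else
        # Assumption: the file shrank, or the inode was reused,
        # after the offset was recorded.
        echo "offset beyond end of file"
    fi
}

# Values taken from the example output in the script header.
scan_state 13629291 16141036   # prints: offset beyond end of file
```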
