@ace-subido
Last active April 13, 2022 00:41
Script to list/delete old files in an HDFS Directory
#!/bin/bash
usage="Usage: ./list-old-hdfs-files.sh [path] [days]"
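# Both arguments are required; otherwise print the usage message and exit.
# "$usage" is quoted so the shell does not treat [path] and [days] as glob patterns.
if [ -z "$1" ] || [ -z "$2" ]
then
  echo "$usage"
  exit 1
fi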
now=$(date +%s);
# Loop through the lines of the directory listing
sudo -u hdfs hdfs dfs -ls "$1" | while read -r f; do
  # Get File Date and File Name (columns 6 and 8 of the listing)
  file_date=$(echo "$f" | awk '{print $6}')
  file_name=$(echo "$f" | awk '{print $8}')
  # Skip the "Found N items" header line, which has no date column
  [ -n "$file_date" ] || continue
  # Calculate Days Difference (date -d requires GNU date)
  difference=$(( (now - $(date -d "$file_date" +%s)) / (24 * 60 * 60) ))
  if [ "$difference" -gt "$2" ]; then
    # Insert delete logic here
    echo "This file $file_name is dated $file_date."
  fi
done
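
The "Insert delete logic here" placeholder could be filled in along the following lines. This is only a sketch, not the gist author's method: the function names `old_paths` and `delete_old` and the `DRY_RUN` variable are hypothetical, and it assumes GNU `date` and the usual eight-column `hdfs dfs -ls` listing (permissions, replication, owner, group, size, date, time, path). Splitting the line with `read` instead of `awk` keeps paths that contain spaces intact.

```shell
#!/bin/bash
# Sketch only: old_paths, delete_old and DRY_RUN are hypothetical names.

# old_paths DAYS
# Reads `hdfs dfs -ls` output on stdin and prints paths older than DAYS days.
old_paths() {
  local days=$1 now perms repl owner group size file_date file_time path file_ts
  now=$(date +%s)
  # The last variable (path) receives the rest of the line, spaces included.
  while read -r perms repl owner group size file_date file_time path; do
    # Skip the "Found N items" header and any line without a path column.
    [ -n "$path" ] || continue
    # Skip lines whose date column cannot be parsed.
    file_ts=$(date -d "$file_date" +%s 2>/dev/null) || continue
    if [ $(( (now - file_ts) / 86400 )) -gt "$days" ]; then
      printf '%s\n' "$path"
    fi
  done
}

# delete_old HDFS_DIR DAYS
# Prints what would be deleted; only deletes when DRY_RUN=0 is set.
delete_old() {
  local dir=$1 days=$2 path
  sudo -u hdfs hdfs dfs -ls "$dir" | old_paths "$days" | while IFS= read -r path; do
    if [ "${DRY_RUN:-1}" = "0" ]; then
      # </dev/null keeps the command from consuming the loop's input.
      sudo -u hdfs hdfs dfs -rm -r "$path" </dev/null
    else
      echo "Would delete: $path"
    fi
  done
}
```

With these functions defined, `delete_old /some/dir 30` would list candidates older than 30 days, and `DRY_RUN=0 delete_old /some/dir 30` would remove them. Defaulting to a dry run is a deliberate choice here, since a wrong path or day count is otherwise hard to undo.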

Kamalcp commented Dec 4, 2017

Hi Team,

usage="Usage: ./list-old-hdfs-files.sh [path] [days]"

What is the meaning of this line?
