HDFS cron to delete old files in directory
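# Assumed retention period in days; the original script never set this variable
days_to_keep=30
now=$(date +%s)
# Loop through the directory listing (date -d below requires GNU date)
hdfs dfs -ls /some_hdfs_directory | while read -r f; do
  # Get file date (column 6) and file name (column 8)
  file_date=$(echo "$f" | awk '{print $6}')
  file_name=$(echo "$f" | awk '{print $8}')
  # Skip the "Found N items" header line, which has no path column
  [ -z "$file_name" ] && continue
  echo "$file_date"
  echo "$file_name"
  # Calculate the age of the file in whole days
  difference=$(( (now - $(date -d "$file_date" +%s)) / (24 * 60 * 60) ))
  if [ "$difference" -gt "$days_to_keep" ]; then
    echo "Deleting $file_name: it is older than $days_to_keep days and is dated $file_date."
    hdfs dfs -rm "$file_name"
  fi
done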
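
Before scheduling a script like this in cron, it may help to check the age calculation without touching HDFS. The following is a minimal sketch, not part of the original gist: `age_in_days` is a hypothetical helper, the sample listing line and its path are made up, and GNU `date` is assumed for the `-d` option.

```shell
# Hypothetical helper: print the age in whole days of one `hdfs dfs -ls` line.
# $1 = a listing line, $2 = reference time in epoch seconds (defaults to now).
age_in_days() {
  file_date=$(echo "$1" | awk '{print $6}')
  # Header lines such as "Found 3 items" have no sixth column
  [ -z "$file_date" ] && return 1
  ref=${2:-$(date +%s)}
  echo $(( (ref - $(date -d "$file_date" +%s)) / 86400 ))
}

# Made-up sample line in the usual `hdfs dfs -ls` column layout
sample='-rw-r--r--   3 hdfs supergroup   1024 2020-01-01 12:30 /some_hdfs_directory/part-0000'
age_in_days "$sample" "$(date -d 2020-01-11 +%s)"
```

Passing the reference time as an argument keeps the check repeatable. Because only the date column is used, the time of day in the listing is ignored and ages are counted from midnight.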

mishmam3 commented Mar 26, 2020

Thank you, this is very helpful 🙏
