@mlzxy
Last active March 23, 2022 08:21
Yet another `rm` alias

SAFE-RM

It behaves identically to `rm`, except that it won't delete files whose paths contain `important` or `.keep` (customizable through `SAFE_RM_IMPORTANCE_MARK`). The idea is to mark important files beforehand so that they always survive future disk cleanups.
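
To make the matching rule concrete, here is a minimal sketch of the check. `is_important` is a hypothetical helper name that is not part of the script; the sketch assumes the marks are joined into one extended regular expression and searched for anywhere in the path, which mirrors how the script below treats them.

```shell
# Hypothetical helper (not part of the gist): succeeds when a path carries a mark.
is_important() {
    mark=${SAFE_RM_IMPORTANCE_MARK:-".keep important"}
    # ".keep important" becomes ".keep|important"
    mark_regex=$(printf '%s' "$mark" | tr ' ' '|')
    printf '%s\n' "$1" | grep -Eq -- "$mark_regex"
}

# Report what would happen to a few paths from the demo tree.
for path in ./note.keep ./checkpoint/note/_record/important_file ./main.cc; do
    if is_important "$path"; then
        echo "skip   $path"
    else
        echo "delete $path"
    fi
done
```

Because the whole path is searched, a mark in a folder name protects every file beneath that folder.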

What prompted me to create this was accidentally removing some baseline models that I had spent many GPU hours training and that all my current training jobs depend on. To prevent future disasters, I came up with this solution. I hope you find it useful too.

Demo

> tree .  # before deletion
├── checkpoint
│   ├── baseline
│   │   ├── log.txt
│   │   └── model.pt
│   ├── finetuned
│   │   ├── log.txt
│   │   └── model.pt
│   ├── log.txt
│   └── note
│       ├── _record
│       │   ├── __init__.py
│       │   └── important_file
│       ├── log1.txt
│       └── log2.txt
├── main.cc
├── model.pt
└── note.keep

> rm -rf ./* # deletion
skip ./checkpoint/note/_record/important_file
skip ./checkpoint/baseline/model.pt
skip ./checkpoint/baseline/log.txt
skip ./note.keep

> tree . # after deletion, important files are preserved
├── checkpoint
│   ├── baseline
│   │   ├── log.txt
│   │   └── model.pt
│   └── note
│       └── _record
│           └── important_file
└── note.keep
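
To try the demo without risking real data, the tree above can be recreated in a scratch folder. This is a minimal sketch, assuming `mktemp -d` is available; the variable name `demo_dir` is only illustrative, and all files are created empty.

```shell
demo_dir=$(mktemp -d)   # scratch folder, nothing outside it is touched
mkdir -p "$demo_dir/checkpoint/baseline" \
         "$demo_dir/checkpoint/finetuned" \
         "$demo_dir/checkpoint/note/_record"
for f in \
    checkpoint/baseline/log.txt checkpoint/baseline/model.pt \
    checkpoint/finetuned/log.txt checkpoint/finetuned/model.pt \
    checkpoint/log.txt \
    checkpoint/note/_record/__init__.py checkpoint/note/_record/important_file \
    checkpoint/note/log1.txt checkpoint/note/log2.txt \
    main.cc model.pt note.keep
do
    : > "$demo_dir/$f"   # create an empty file
done
echo "demo tree created in $demo_dir"
```

After `cd "$demo_dir"`, running `rm -rf ./*` with the alias loaded and `SAFE_RM_IMPORTANCE_MARK=".keep baseline important"` exported should leave only the four files reported as skipped.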
export SAFE_RM_IMPORTANCE_MARK=".keep baseline important" # marks are separated by spaces

safe-rm() {
    local mark=${SAFE_RM_IMPORTANCE_MARK:-".keep important"}
    # marks are joined into one regex, so `.` in a mark matches any character
    local mark_regex=${mark// /|}
    local -a flags=()   # cli options like `-rf`
    local recursive=""  # set when `-r`, `-R`, or `--recursive` is given
    local reach=""      # set once at least one file is provided
    local a p
    for a in "$@"; do
        if [[ $a == -* ]]; then # store cli options
            flags+=("$a")
            if [[ $a == --recursive || ( $a != --* && $a == *[rR]* ) ]]; then
                recursive="1"
            fi
        else
            reach="1" # mark that at least one file is provided
            if [[ $a =~ $mark_regex ]]; then
                # this file / folder has been marked as important, skip deletion
                echo "skip $a"
            elif [[ -n $recursive && -d $a ]]; then
                # $a is a folder and `-r` is present: go over all files inside $a with `find`
                # and delete every file except those with an importance mark
                while IFS= read -r -d '' p; do
                    [[ $p =~ $mark_regex ]] || /bin/rm "$p"
                done < <(find "$a" -type f -print0)
                # then delete all empty folders inside $a (but not $a itself)
                find "$a" -mindepth 1 -type d -empty -print -delete
                # check if $a still contains files
                if [[ -n $(find "$a" -type f) ]]; then
                    # yes, then report the surviving files and skip deleting $a
                    find "$a" -type f -exec printf 'skip %s\n' {} +
                else
                    # no, $a holds no files, then delete $a
                    /bin/rm "${flags[@]}" "$a"
                fi
            else
                # a file without importance mark, delete it
                /bin/rm "${flags[@]}" "$a"
            fi
        fi
    done
    # if no files are provided, print help messages just like `rm`
    if [[ -z $reach ]]; then
        /bin/rm "${flags[@]}"
    fi
}

ils() { # list all important files under $1 (defaults to the current folder)
    local mark=${SAFE_RM_IMPORTANCE_MARK:-".keep important"}
    find "${1:-.}" -type f | grep -E -- "${mark// /|}"
}

alias rm="safe-rm"
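
The marks are only honored by the `safe-rm` function; `/bin/rm`, which the function itself calls, deletes a marked file like any other. Here is a minimal sketch of deliberately removing a marked file, using a scratch folder and the illustrative file name `old.keep` (both are assumptions made for the example).

```shell
scratch=$(mktemp -d)
: > "$scratch/old.keep"         # a file that safe-rm would skip
/bin/rm "$scratch/old.keep"     # the real rm does not look at marks
[ -e "$scratch/old.keep" ] || echo "old.keep is gone"
rmdir "$scratch"                # clean up the now-empty scratch folder
```

Calling the binary by its full path keeps the bypass explicit, so a marked file is never removed by habit.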