
@smoser
Last active October 6, 2023 13:47
backup or truncate deleted files

Find open files that have been deleted.

I ran into a problem where deleted files were taking up a considerable amount of space, and ultimately leading to filesystem full problems.

Stack Overflow provided me with a way to list the open file handles on deleted files.

The key answer there covers lsof -a +L1, which restricts the output to open files whose link count is below 1, that is, files that no longer have any name in the filesystem.
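To make the situation concrete, here is a minimal sketch (Linux only, assuming /proc is mounted) that creates an open-but-deleted file and reads its contents back through /proc. The temp file, the choice of fd 9, and the variable names are illustrative assumptions, not part of the gist.

```shell
#!/bin/bash
# Sketch: create an open-but-deleted file, then read it back via /proc.
tmpf=$(mktemp)
echo "still here" > "$tmpf"
exec 9< "$tmpf"     # keep the file open on fd 9
rm -f "$tmpf"       # remove its only name; the link count drops to 0

# The data stays reachable through the open descriptor. 'cat' inherits
# fd 9, so /proc/self/fd/9 inside cat refers to the deleted file.
recovered=$(cat /proc/self/fd/9)
echo "recovered: $recovered"

# If lsof is installed, this shell now shows up in its +L1 listing.
if command -v lsof >/dev/null 2>&1; then
    lsof -a +L1 -p "$BASHPID" || true
fi

exec 9<&-           # closing the last descriptor finally frees the space
```

The space used by such a file is only released once every process holding it open closes its descriptor, which is why the disk can fill up with nothing visible in a directory listing.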

fixup

The tool 'fixup' provided here allows you to easily 'show', 'truncate', or 'backup' the files.
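As a rough illustration of what the 'truncate' operation relies on, the sketch below truncates a deleted file through its /proc/<pid>/fd/<fd> path, which is the same kind of path the script builds. It assumes Linux with /proc and GNU coreutils (stat, truncate); the temp file, fd 8, and the variable names are made up for the demonstration.

```shell
#!/bin/bash
# Sketch: free the space of an open-but-deleted file by truncating it
# through /proc, without touching the process that holds it open.
tmpf=$(mktemp)
head -c 4096 /dev/zero > "$tmpf"   # 4 KiB of data
exec 8< "$tmpf"                    # keep the file open on fd 8
rm -f "$tmpf"                      # delete the only name

pid=$BASHPID
before=$(stat -L -c %s /proc/self/fd/8)
truncate --size=0 "/proc/$pid/fd/8"
after=$(stat -L -c %s /proc/self/fd/8)
echo "size before=$before after=$after"

exec 8<&-
```

Truncating discards the data, so 'backup' is the safer first step when the contents might still matter.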

Usage:

fixup-sizes [options] operation [args]

    operate on open-but-deleted files

    options:
      -m | --match R     operate only on files that match R (default '.')
      -s | --min-size S  operate only on files larger than S (default 1048576)
      -n | --dry-run     do not actually copy or truncate, only report what would be done

    operations:
       truncate
       show
       backup output-dir

    Example:
 
    * copy open filehandles on /data/log to /data2/deleted-files

      fixup-sizes --match=^/data/log backup /data2/deleted-files

    * truncate files matching /data/log that are larger than 10MiB
      (dry run; drop --dry-run to apply)

      fixup-sizes --dry-run --match=^/data/log --min-size=10485760 truncate
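The script's getnum function also accepts K, M, and G suffixes (with an optional trailing B) for --min-size, so --min-size=10M should be equivalent to the byte count above. The following is a sketch of that parsing, modeled on getnum; the name to_bytes and the explicit base-10 handling are assumptions added for illustration.

```shell
#!/bin/bash
# Sketch of size-suffix parsing modeled on the script's getnum().
# 'to_bytes' is a hypothetical name, not a function in the gist.
to_bytes() {
    local input="$1" n="" unit=1
    n=${input%B}                   # optional trailing 'B': 10MB == 10M
    case "$n" in
        *K) n=${n%K}; unit=1024;;
        *M) n=${n%M}; unit=$((1024*1024));;
        *G) n=${n%G}; unit=$((1024*1024*1024));;
    esac
    # reject anything that is not a plain non-negative integer
    case "$n" in
        ""|*[!0-9]*) echo "input '$input' not understood as a number" >&2
                     return 1;;
    esac
    # 10# forces base 10 so a value like 08 is not read as octal
    echo "$((10#$n * unit))"
}

to_bytes 10M
to_bytes 1048576
to_bytes 2GB
```

Units are powers of 1024, matching the 10MiB wording in the usage text.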

Example output:

$ ./fixup --match pipewire --min-size=1 backup out.d
execute: mkdir out.d
execute: lsof -a +L1
execute: cp /proc/2264/fd/24 out.d/memfd:pipewire-memfd
execute: cp /proc/2264/fd/27 out.d/memfd:pipewire-memfd.2264
execute: cp /proc/2264/fd/30 out.d/memfd:pipewire-memfd.2264.1
execute: cp /proc/2264/fd/32 out.d/memfd:pipewire-memfd.2264.2
execute: cp /proc/2264/fd/34 out.d/memfd:pipewire-memfd.2264.3
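The destination names in this output follow the collision scheme in the script's backup function: the plain name first, then '.<pid>', then '.<pid>.<n>'. Here is a small sketch of that scheme in isolation; pick_dest is a hypothetical helper, and the empty files stand in for the real cp.

```shell
#!/bin/bash
# Sketch of the backup naming scheme: name, name.<pid>, name.<pid>.<n>.
# 'pick_dest' is a hypothetical helper modeled on the gist's backup().
pick_dest() {
    local dest="$1" pid="$2" n=1 base=""
    if [ -e "$dest" ]; then
        dest="${dest}.${pid}"
        if [ -e "$dest" ]; then
            base="$dest"
            while dest="${base}.${n}" && [ -e "$dest" ]; do
                n=$((n+1))
            done
        fi
    fi
    echo "$dest"
}

outd=$(mktemp -d)
names=()
for i in 1 2 3 4; do
    d=$(pick_dest "$outd/memfd:pipewire-memfd" 2264)
    : > "$d"                  # stand-in for the real 'cp'
    names+=("${d##*/}")
done
printf '%s\n' "${names[@]}"
rm -rf "$outd"
```

Because several descriptors can point at files that once shared a path, the suffixes keep each copy from overwriting an earlier one.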
#!/bin/bash
# shellcheck disable=SC2015,SC2039,SC2166,SC2162,SC3043

Usage() {
    cat <<EOF
${0##*/} [options] operation [args]

    operate on open-but-deleted files

    options:
      -m | --match R     operate only on files that match R (default '.')
      -s | --min-size S  operate only on files larger than S (default $((1024*1024)))
      -n | --dry-run     do not actually copy or truncate, only report what would be done

    operations:
       truncate
       show
       backup output-dir

    Example:

    * copy open filehandles on /data/log to /data2/deleted-files

      ${0##*/} --match=^/data/log backup /data2/deleted-files

    * truncate files matching /data/log that are larger than 10MiB
      (dry run; drop --dry-run to apply)

      ${0##*/} --dry-run --match=^/data/log --min-size=$((10*1024*1024)) truncate
EOF
}

bad_Usage() {
    Usage >&2
    stderr "$@"
    return 1
}

cleanup() {
    [ -z "${TEMP_D}" -o ! -d "${TEMP_D}" ] || rm -Rf "${TEMP_D}"
}

# dryexe(cmd, args)
# "dryrun execute" - execute the command if not in dryrun.
dryexe() {
    if [ "$DRY" = "true" ]; then
        stderr "would execute:" "$@"
        return 0
    fi
    stderr "execute:" "$@"
    "$@"
}

backup() {
    local dest="" n=1 destbase=""
    [ -n "$OUTD" ] || fail "OUTD not set"
    dest="${OUTD}/${name#/}"
    # there may be multiple files open that *had* the same path
    # so there might be existing files from this run.
    # if so, first try appending '.<pid>' and then '.<pid>.<n>'
    if [ -e "$dest" ]; then
        dest="${dest}.${pid}"
        if [ -f "$dest" ]; then
            destbase="${dest}"
            while dest="$destbase.$n" && [ -f "$dest" ]; do
                n=$((n+1))
            done
        fi
    fi
    [ -f "$fdpath" ] || {
        stderr "fdpath $fdpath did not exist: pid=$pid name=$name"
        return 1
    }
    set -- cp "$fdpath" "$dest"
    if [ "$DRY" = "true" ]; then
        stderr "would execute:" "$@"
        return 0
    fi
    stderr "execute:" "$@"
    [ -d "${dest%/*}" ] || mkdir -p "${dest%/*}" || {
        stderr "failed to create dir for $dest"
        return 1
    }
    "$@"
}

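truncate() {
    [ -n "$fdpath" ] || {
        stderr "empty fdpath var: $nline"
        return 1
    }
    print "#" 1>&2
    # 'command' bypasses this shell function of the same name, so the
    # external truncate(1) runs instead of recursing into truncate().
    dryexe command truncate --size=0 "$fdpath"
}

print() {
    printf "%s%7d %10d %-12s %s\n" "$1" "$pid" "$size" \
        "${fdpath#/proc/}" "$name"
}

# isnum(s): true if s is a non-empty string of digits only.
isnum() {
    [ -n "$1" ] && [ "${1#*[!0-9]}" = "$1" ]
}
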
process() {
    # shellcheck disable=SC2034
    local line nline cmd pid user fd ftype dev size nlink node name
    local fdpath="" fdo=""
    local op="$1"
    while read line; do
        # drop empty lines entirely
        [ -z "$line" ] && continue
        # drop the first (header) line
        [ "${line#COMMAND}" = "${line}" ] || continue
        # drop 'deleted' / attempt to allow spaces in filenames
        nline=${line% (deleted)}
        [ "$nline" = "$line" ] && {
            stderr "record missing '(deleted)': $line"
            continue
        }
        # shellcheck disable=SC2086
        set -- ${nline}
        [ $# -eq 10 ] || {
            stderr "found $# fields in line. expected 10: ${nline}"
            continue
        }
        # shellcheck disable=SC2034
        {
            cmd="$1"; pid="$2"; user="$3"; fdo="$4"; ftype="$5"
            dev="$6"; size="$7"; nlink="$8"; node="$9"; name="${10}"
        }
        # fdo has a mode ('u', 'w', 'r' and maybe other fields)
        # right-trim non-digit characters
        fd=${fdo}
        while [ "${fd%[^0-9]}" != "${fd}" ]; do
            fd=${fd%[^0-9]}
        done
        fdpath="/proc/$pid/fd/$fd"
        [ "$op" = "show" ] && {
            echo "cmd=$cmd pid=$pid fd=$fd size=$size name=$name"
            continue
        }
        # the name must be an absolute path (start with '/')
        [ "${name#/}" != "$name" ] || {
            stderr "name field '$name' not absolute? ${nline}"
            continue
        }
        echo "$name" | grep -q "$MATCH" || {
            # stderr "skipping $name: did not match"
            continue
        }
        isnum "$size" || {
            stderr "skipping (size '$size' not a number) pid=$pid name=$name"
            continue
        }
        [ "$size" -gt "$MINSIZE" ] || {
            stderr "skipping (size $size <= $MINSIZE) pid=$pid name=$name"
            continue
        }
        "$op" || fail "op $op failed on pid=$pid name=$name: ${nline}"
    done
}

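# getnum(input): print input as a byte count.
# accepts K, M, G suffixes (powers of 1024) and an optional trailing 'B'.
getnum() {
    local input="$1" n="" unit=1
    n=${input%B}
    case "$n" in
        *K) n=${n%K}
            unit=1024;;
        *M) n=${n%M}
            unit=$((1024*1024));;
        *G) n=${n%G}
            unit=$((1024*1024*1024));;
    esac
    isnum "$n" || {
        stderr "input '$input' not understood as a number"
        # return non-zero explicitly; a bare 'return' would pass on the
        # success status of the stderr call above.
        return 1
    }
    echo "$((n*unit))"
}
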
ensure_d() {
    local dir="$1"
    [ -d "$dir" ] && return 0
    [ -e "$dir" ] && {
        stderr "$dir exists but is not a dir"
        return 1
    }
    [ -L "$dir" ] && {
        stderr "$dir is a dangling symlink"
        return 1
    }
    dryexe mkdir "$dir"
}

stderr() { echo "$@" 1>&2; }

fail() { [ $# -eq 0 ] || stderr "ERROR:" "$@"; exit 1; }

main() {
    local sopts="hi:m:ns:"
    local lopts="help,input:,match:,min-size:,dry-run"
    local name="${0##*/}" out=""
    out=$(getopt --name "$name" \
        --options "$sopts" --long "$lopts" -- "$@") &&
        eval set -- "$out" ||
        { bad_Usage; return; }

    local cur="" next="" dry=false lsofout="" lsoferr="" match="."
    local minsize="$((1024*1024))"

    while [ $# -ne 0 ]; do
        cur="$1"; next="$2";
        case "$cur" in
            -h|--help) Usage ; exit 0;;
            -i|--input) lsofout="$next"; shift;;
            -s|--min-size) minsize=$next; shift;;
            -m|--match) match=$next; shift;;
            -n|--dry-run) dry=true;;
            --) shift; break;;
        esac
        shift;
    done

    [ $# -ge 1 ] ||
        { bad_Usage "must give args"; return; }

    #[ "$(id -u)" = "0" ] || { stderr "Must be root."; return 1; }
    local op="$1"
    shift

    out=$(getnum "${minsize}") ||
        fail "could not parse --min-size=$minsize"
    minsize="$out"

    MINSIZE="$minsize"
    MATCH="$match"
    DRY="$dry"

    case "$op" in
        truncate|show|print)
            [ $# -eq 0 ] ||
                { bad_Usage "$op got $# args"; return 1; }
            ;;
        backup)
            [ $# -eq 1 ] || fail "backup needs 1 arg [output-dir] got $#"
            OUTD=${1%/}
            [ -n "$OUTD" ] || fail "output-dir cannot be empty string"
            ensure_d "$OUTD" || fail "could not create $OUTD"
            ;;
        *) fail "unknown operation $op";;
    esac

    TEMP_D=$(mktemp -d "${TMPDIR:-/tmp}/${0##*/}.XXXXXX") ||
        fail "failed to make tempdir"
    trap cleanup EXIT

    if [ -z "$lsofout" ]; then
        lsofout=${TEMP_D}/lsof.out
        lsoferr=${TEMP_D}/lsof.err
        stderr "execute:" "lsof -a +L1"
        lsof -a +L1 > "${lsofout}" 2>"${lsoferr}" || {
            cat "$lsoferr" 1>&2
            fail "lsof failed"
        }
    else
        [ -f "$lsofout" ] || fail "input file '$lsofout' is not a file"
    fi

    process "$op" < "$lsofout" || return
}

main "$@"
@joylatten

is it ok that "-i" is missing from the Usage?

@smoser
Author

smoser commented Oct 5, 2023

is it ok that "-i" is missing from the Usage?

I did that intentionally, because I don't think it's really a good idea. But I thought it would be useful for testing, which is why it is there.

I'd use it with '--dry-run', but if you had a stale file then pids could be gone, or you'd in other ways be operating on bad input data.

That is still a problem even without '-i', as some time passes between "lsof" and "process".
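One way to narrow that window, sketched here as an assumption rather than something the script does, is to re-check each descriptor just before acting on it. On Linux, readlink on /proc/<pid>/fd/<fd> reports the old path followed by " (deleted)" for an unlinked file. The helper name still_deleted and the demo around it are hypothetical.

```shell
#!/bin/bash
# Sketch: confirm a pid/fd pair still refers to the same deleted file
# before acting on it. 'still_deleted' is a hypothetical helper.
still_deleted() {
    local pid="$1" fd="$2" name="$3" target=""
    target=$(readlink "/proc/$pid/fd/$fd" 2>/dev/null) || return 1
    [ "$target" = "$name (deleted)" ]
}

# demo: open a temp file on fd 7, delete it, check, close, check again
tmpf=$(readlink -f "$(mktemp)")   # resolve symlinks in the temp path
exec 7< "$tmpf"
rm -f "$tmpf"

if still_deleted "$BASHPID" 7 "$tmpf"; then first=ok; else first=stale; fi
exec 7<&-
if still_deleted "$BASHPID" 7 "$tmpf"; then second=ok; else second=stale; fi
echo "while open: $first, after close: $second"
```

This does not remove the race entirely, since the descriptor can still change between the check and the copy or truncate, but it would catch stale input files and reused pids in most cases.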

@joylatten

ahhh ok. Also, I think we trim fd twice... line #124 and #152.

@smoser
Author

smoser commented Oct 6, 2023

ahhh ok. Also, I think we trim fd twice... line #124 and #152.

fixed. thanks.
