@smoser
Last active October 6, 2023 13:47
backup or truncate deleted files

Find open files that have been deleted.

I ran into a problem where deleted files that were still held open were taking up a considerable amount of space, ultimately filling the filesystem.

Stack Overflow provided me with a way to get a list of open filehandles on deleted files here.

The key response there covers lsof -a +L1, which limits output to open files with a link count below 1, that is, files that no longer have any name in the filesystem.
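
To see the condition for yourself, here is a minimal sketch (Linux only, since it relies on /proc and GNU stat). The function name and the use of fd 9 are arbitrary choices for illustration. It opens a temp file, deletes it, and reads it back through /proc/<pid>/fd/<n>, which is the same kind of path the tool below copies from.

```shell
#!/bin/bash
# Sketch: a deleted file stays readable (and keeps its disk space)
# for as long as some process holds it open. Linux-only: uses /proc.
# demo_deleted_open and fd 9 are arbitrary choices for illustration.
demo_deleted_open() {
    local tmp links content
    tmp=$(mktemp) || return 1
    echo "still here" > "$tmp"
    exec 9< "$tmp"      # hold the file open on fd 9
    rm -f "$tmp"        # remove its only name; the inode lives on
    # stat and cat inherit fd 9, so /proc/self/fd/9 inside each child
    # refers to the same open (deleted) file.
    links=$(stat -L -c %h /proc/self/fd/9)
    content=$(cat /proc/self/fd/9)
    exec 9<&-           # closing the last descriptor frees the space
    echo "links=$links content=$content"
}

demo_deleted_open
```

The link count reported for the deleted file should be 0, which is the property that +L1 selects on.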

fixup

The tool 'fixup' provided here allows you to easily 'show', 'truncate', or 'backup' the files.

Usage:

fixup-sizes [options] operation [args]

    operate on open-but-deleted files

    options:
      -m | --match R     operate only on files that match R (default '.')
      -s | --min-size S  operate only on files larger than S (default 1048576)
      -n | --dry-run     do not actually copy, only report what would be done

    operations:
       truncate
       show
       backup output-dir

    Example:
 
    * copy open filehandles on /data/log to /data2/deleted-files

      fixup-sizes --match=^/data/log backup /data2/deleted-files

    * report which files matching /data/log and larger than 10MiB would be truncated

      fixup-sizes --dry-run --match=^/data/log --min-size=10485760 truncate
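
The 10485760 above is 10MiB written out in bytes. The script's getnum helper also accepts K, M and G suffixes (powers of 1024, with an optional trailing B). A minimal sketch of that conversion follows; to_bytes is a hypothetical name, and forcing base 10 is my addition so that a leading zero is not read as octal.

```shell
#!/bin/bash
# Sketch: convert a size such as 10M or 512K to bytes.
# to_bytes is a hypothetical name; suffixes are powers of 1024.
to_bytes() {
    local n="${1%B}" unit=1
    case "$n" in
        *K) n=${n%K}; unit=1024;;
        *M) n=${n%M}; unit=$((1024*1024));;
        *G) n=${n%G}; unit=$((1024*1024*1024));;
    esac
    # reject empty input or anything containing a non-digit
    case "$n" in
        ""|*[!0-9]*) echo "not a number: $1" >&2; return 1;;
    esac
    # 10# forces base 10 so that 010 means ten, not octal eight
    echo "$((10#$n * unit))"
}

to_bytes 10M
to_bytes 1048576
```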

Example output:

$ vrun ./fixup --match pipewire --min-size=1 backup out.d
$ ./fixup --match pipewire --min-size=1 backup out.d
execute: mkdir out.d
execute: lsof -a +L1
execute: cp /proc/2264/fd/24 out.d/memfd:pipewire-memfd
execute: cp /proc/2264/fd/27 out.d/memfd:pipewire-memfd.2264
execute: cp /proc/2264/fd/30 out.d/memfd:pipewire-memfd.2264.1
execute: cp /proc/2264/fd/32 out.d/memfd:pipewire-memfd.2264.2
execute: cp /proc/2264/fd/34 out.d/memfd:pipewire-memfd.2264.3
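
The .2264 and .2264.N suffixes above come from collision handling in backup: several open files may have had the same path, so a copy gets '.<pid>' and then '.<pid>.<n>' appended until the name is free. A minimal sketch of that naming rule, with unique_dest as a hypothetical helper (the script does this inline and differs in small details):

```shell
#!/bin/bash
# Sketch: pick a destination path that does not clobber an earlier copy.
# unique_dest is a hypothetical helper, not part of the script.
unique_dest() {
    local dest="$1" pid="$2" n=1 base
    # first copy keeps the plain name
    [ -e "$dest" ] || { echo "$dest"; return 0; }
    # second copy gets '.<pid>', later ones '.<pid>.<n>'
    base="$dest.$pid"
    dest="$base"
    while [ -e "$dest" ]; do
        dest="$base.$n"
        n=$((n+1))
    done
    echo "$dest"
}
```
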
#!/bin/bash
# shellcheck disable=SC2015,SC2039,SC2166,SC2162,SC3043

Usage() {
    cat <<EOF
${0##*/} [options] operation [args]

    operate on open-but-deleted files

    options:
      -m | --match R     operate only on files that match R (default '.')
      -s | --min-size S  operate only on files larger than S (default $((1024*1024)))
      -n | --dry-run     do not actually copy, only report what would be done

    operations:
       truncate
       show
       backup output-dir

    Example:

    * copy open filehandles on /data/log to /data2/deleted-files

      ${0##*/} --match=^/data/log backup /data2/deleted-files

    * report which files matching /data/log and larger than 10MiB would be truncated

      ${0##*/} --dry-run --match=^/data/log --min-size=$((10*1024*1024)) truncate
EOF
}

bad_Usage() {
    Usage >&2
    stderr "$@"
    return 1
}

cleanup() {
    [ -z "${TEMP_D}" -o ! -d "${TEMP_D}" ] || rm -Rf "${TEMP_D}"
}

# dryexe(cmd, args)
# "dryrun execute" - execute the command if not in dryrun.
dryexe() {
    if [ "$DRY" = "true" ]; then
        stderr "would execute:" "$@"
        return 0
    fi
    stderr "execute:" "$@"
    "$@"
}

backup() {
    local dest="" n=1 destbase=""
    [ -n "$OUTD" ] || fail "OUTD not set"
    dest="${OUTD}/${name#/}"
    # there may be multiple files open that *had* the same path
    # so there might be existing files from this run.
    # if so, first try appending '.<pid>' and then '.<pid>.<n>'
    if [ -e "$dest" ]; then
        dest="${dest}.${pid}"
        if [ -f "$dest" ]; then
            destbase="${dest}"
            while dest="$destbase.$n" && [ -f "$dest" ]; do
                n=$((n+1))
            done
        fi
    fi
    [ -f "$fdpath" ] || {
        stderr "fdpath $fdpath did not exist: pid=$pid name=$name"
        return 1
    }
    set -- cp "$fdpath" "$dest"
    if [ "$DRY" = "true" ]; then
        stderr "would execute:" "$@"
        return 0
    fi
    stderr "execute:" "$@"
    [ -d "${dest%/*}" ] || mkdir -p "${dest%/*}" || {
        stderr "failed to create dir for $dest"
        return 1
    }
    "$@"
}

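truncate() {
    [ -n "$fdpath" ] || {
        stderr "empty fdpath var: $nline"
        return 1
    }
    print "#" 1>&2
    # 'command' bypasses this shell function, which would otherwise
    # shadow truncate(1) and call itself recursively.
    dryexe command truncate --size=0 "$fdpath"
}

print() {
    printf "%s%7d %10d %-12s %s\n" "$1" "$pid" "$size" \
        "${fdpath#/proc/}" "$name"
}

# isnum(s): true if s is non-empty and contains only digits
isnum() {
    [ -n "$1" ] && [ "${1#*[!0-9]}" = "$1" ]
}
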
process() {
    # shellcheck disable=SC2034
    local line nline cmd pid user fd ftype dev size nlink node name
    local fdpath="" fdo=""
    local op="$1"
    while read -r line; do
        # drop empty lines entirely
        [ -z "$line" ] && continue
        # drop the first (header) line
        [ "${line#COMMAND}" = "${line}" ] || continue
        # drop 'deleted' / attempt to allow spaces in filenames
        nline=${line% (deleted)}
        [ "$nline" = "$line" ] && {
            stderr "record missing '(deleted)': $line"
            continue
        }
        # shellcheck disable=SC2086
        set -- ${nline}
        [ $# -eq 10 ] || {
            stderr "found $# fields in line. expected 10: ${nline}"
            continue
        }
        # shellcheck disable=SC2034
        {
            cmd="$1"; pid="$2"; user="$3"; fdo="$4"; ftype="$5"
            dev="$6"; size="$7"; nlink="$8"; node="$9"; name="${10}"
        }
        # fdo has a mode ('u', 'w', 'r' and maybe other fields)
        # right-trim non-digit characters
        fd=${fdo}
        while [ "${fd%[^0-9]}" != "${fd}" ]; do
            fd=${fd%[^0-9]}
        done
        fdpath="/proc/$pid/fd/$fd"
        [ "$op" = "show" ] && {
            echo "cmd=$cmd pid=$pid fd=$fd size=$size name=$name"
            continue
        }
        # the name must be an absolute path (start with '/')
        [ "${name#/}" != "$name" ] || {
            stderr "name field '$name' not absolute? ${nline}"
            continue
        }
        echo "$name" | grep -q "$MATCH" || {
            # stderr "skipping $name: did not match"
            continue
        }
        # SIZE/OFF may hold an offset (0x...) rather than a size; skip those
        isnum "$size" || {
            stderr "skipping (size '$size' not a number) pid=$pid name=$name"
            continue
        }
        [ "$size" -gt "$MINSIZE" ] || {
            stderr "skipping (size $size <= $MINSIZE) pid=$pid name=$name"
            continue
        }
        "$op" || fail "op $op failed on pid=$pid name=$name: ${nline}"
    done
}

# getnum(input): print input in bytes; accepts K, M, G suffixes (optional B)
getnum() {
    local input="$1" n="" unit=1
    n=${input%B}
    case "$n" in
        *K) n=${n%K}
            unit=1024;;
        *M) n=${n%M}
            unit=$((1024*1024));;
        *G) n=${n%G}
            unit=$((1024*1024*1024));;
    esac
    isnum "$n" || {
        stderr "input '$input' not understood as a number";
        # return non-zero explicitly; a bare 'return' would pass on
        # the (successful) exit status of stderr.
        return 1
    }
    echo "$((n*unit))"
}

ensure_d() {
    local dir="$1"
    [ -d "$dir" ] && return 0
    [ -e "$dir" ] && {
        stderr "$dir exists but is not a dir"
        return 1
    }
    [ -L "$dir" ] && {
        stderr "$dir is a dangling symlink"
        return 1
    }
    dryexe mkdir "$dir"
}

stderr() { echo "$@" 1>&2; }
fail() { [ $# -eq 0 ] || stderr "ERROR:" "$@"; exit 1; }

main() {
    local sopts="hi:m:ns:"
    local lopts="help,input:,match:,min-size:,dry-run"
    local name="${0##*/}" out=""
    out=$(getopt --name "$name" \
        --options "$sopts" --long "$lopts" -- "$@") &&
        eval set -- "$out" ||
        { bad_Usage; return; }

    local cur="" next="" dry=false lsofout="" lsoferr="" match="."
    local minsize="$((1024*1024))"

    while [ $# -ne 0 ]; do
        cur="$1"; next="$2";
        case "$cur" in
            -h|--help) Usage ; exit 0;;
            -i|--input) lsofout="$next"; shift;;
            -s|--min-size) minsize=$next; shift;;
            -m|--match) match=$next; shift;;
            -n|--dry-run) dry=true;;
            --) shift; break;;
        esac
        shift;
    done

    [ $# -ge 1 ] ||
        { bad_Usage "must give args"; return; }
    #[ "$(id -u)" = "0" ] || { stderr "Must be root."; return 1; }

    local op="$1"
    shift

    out=$(getnum "${minsize}") ||
        fail "could not parse --min-size=$minsize"
    minsize="$out"

    MINSIZE="$minsize"
    MATCH="$match"
    DRY="$dry"

    case "$op" in
        truncate|show|print)
            [ $# -eq 0 ] ||
                { bad_Usage "$op got $# args"; return 1; }
            ;;
        backup)
            [ $# -eq 1 ] || fail "backup needs 1 arg [output-dir] got $#"
            OUTD=${1%/}
            [ -n "$OUTD" ] || fail "output-dir cannot be empty string"
            ensure_d "$OUTD" || fail "could not create $OUTD"
            ;;
        *) fail "unknown operation $op";;
    esac

    TEMP_D=$(mktemp -d "${TMPDIR:-/tmp}/${0##*/}.XXXXXX") ||
        fail "failed to make tempdir"
    trap cleanup EXIT

    if [ -z "$lsofout" ]; then
        lsofout=${TEMP_D}/lsof.out
        lsoferr=${TEMP_D}/lsof.err
        stderr "execute:" "lsof -a +L1"
        lsof -a +L1 > "${lsofout}" 2>"${lsoferr}" || {
            cat "$lsoferr" 1>&2
            fail "lsof failed"
        }
    else
        [ -f "$lsofout" ] || fail "input file '$lsofout' is not a file"
    fi

    process "$op" < "$lsofout" || return
}

main "$@"
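
The field splitting in process can also be sketched with read, which puts everything after the ninth field into the last variable and so keeps spaces in the file name. This is an alternative sketch, not the script's method. parse_lsof_line is a hypothetical name, and the sample lines in the usage are made up to match the column layout of lsof -a +L1 (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME). It still misparses records where the SIZE/OFF field is empty.

```shell
#!/bin/bash
# Sketch: split one 'lsof -a +L1' record into fields.
# parse_lsof_line is a hypothetical helper, not part of the script.
parse_lsof_line() {
    local line="${1% (deleted)}"
    local cmd pid user fdo ftype dev size nlink node name
    # only handle records that carried the ' (deleted)' marker
    [ "$line" != "$1" ] || return 1
    # the last variable (name) receives the rest of the line, spaces included
    read -r cmd pid user fdo ftype dev size nlink node name <<EOF
$line
EOF
    [ -n "$name" ] || return 1
    # strip the mode/lock characters that follow the fd number
    echo "pid=$pid fd=${fdo%%[!0-9]*} size=$size name=$name"
}

# made-up sample records
parse_lsof_line 'pipewire 2264 user 24u REG 0,1 2312 0 2058 /memfd:pipewire-memfd (deleted)'
parse_lsof_line 'cat 100 root 3r REG 8,1 2097152 0 131 /data/log/my file.log (deleted)'
```
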
@raharper

raharper commented Oct 5, 2023

Line 117: the field is size/offset, so there is some subtlety to the value (if present).

Key points from man 8 lsof on SIZE, SIZE/OFF, or OFFSET:

  • field may be empty
  • file size displayed in decimal
  • file offset displayed in hex (0x...)
  • caller can invoke lsof -s to only display size (or empty if no size, like a socket file)
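
A minimal sketch of telling those cases apart before comparing against a minimum size. classify_size_field is a hypothetical name, not part of the script. The 0t case is from my reading of lsof(8), which also documents a 0t prefix for decimal offsets; treat that as an assumption to check against your lsof version.

```shell
#!/bin/bash
# Sketch: classify the value found in lsof's SIZE/OFF column.
# classify_size_field is a hypothetical helper, not part of the script.
classify_size_field() {
    case "$1" in
        "")        echo "empty";;    # no size available (e.g. a socket)
        0x*|0t*)   echo "offset";;   # file offset, not a size
        *[!0-9]*)  echo "unknown";;  # anything else that is not all digits
        *)         echo "size";;     # plain decimal: a file size in bytes
    esac
}

classify_size_field 1048576
classify_size_field 0x1f4a
```

Only values classified as "size" are safe to pass to a numeric test such as [ "$size" -gt "$MINSIZE" ].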

@smoser
Author

smoser commented Oct 5, 2023

> field may be empty

that is obnoxious. not sure how to deal with that. thoughts?

it will currently get skipped (i think) because that line won't have 10 fields.
I'll fix it to just skip anything with a non-number value there.

@joylatten

is it ok that "-i" is missing from the Usage?

@smoser
Author

smoser commented Oct 5, 2023

> is it ok that "-i" is missing from the Usage?

i did that intentionally, probably because it's not really a good idea, i think. but i thought that'd be useful for testing, which is why it is there.

i'd use it with '--dry-run', but if you had a stale file then pids could be gone or you're in other ways operating on bad input data.

that is still a problem, as you have some time between "lsof" and "process".
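
One way to narrow that window (not close it) is to re-check the descriptor right before acting on it. A minimal sketch, assuming Linux /proc, where readlink on a descriptor for an unlinked file reports the old path followed by " (deleted)". still_deleted is a hypothetical helper, not something in the script.

```shell
#!/bin/bash
# Sketch: succeed only if /proc/PID/fd/FD currently refers to a deleted file.
# still_deleted is a hypothetical helper, not part of the script.
still_deleted() {
    local target
    # fails if the pid or the fd is gone
    target=$(readlink "/proc/$1/fd/$2" 2>/dev/null) || return 1
    case "$target" in
        *" (deleted)") return 0;;
    esac
    return 1
}
```

A stricter check could also compare the link target against the name recorded by lsof, since a pid and fd number can be reused for a different file.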

@joylatten

ahhh ok. Also, I think we trim fd twice... line #124 and #152.

@smoser
Author

smoser commented Oct 6, 2023

> ahhh ok. Also, I think we trim fd twice... line #124 and #152.

fixed. thanks.
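
For reference, the trim can also be done with a single parameter expansion instead of a loop. A sketch, with fd_number as a hypothetical name. It keeps the leading digits of the FD column and fails for entries such as cwd or txt that have no descriptor number.

```shell
#!/bin/bash
# Sketch: extract the numeric descriptor from lsof's FD column (e.g. 24u, 3rW).
# fd_number is a hypothetical helper, not part of the script.
fd_number() {
    local fdo="$1" fd
    # drop everything from the first non-digit onward
    fd="${fdo%%[!0-9]*}"
    [ -n "$fd" ] || return 1
    echo "$fd"
}

fd_number 24u
fd_number 3rW
```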
