Two scripts that count the files under a directory by file extension. The first script (filestats.sh) lists the counts in lexicographic order of extension. The second script lists the same counts in descending order of frequency.
Usage:
To obtain file statistics for the directory path-to-starting-directory:
./filestats.sh [path-to-starting-directory]
Square brackets indicate that the directory parameter is optional. If no directory path is specified, filestats.sh starts from the current working directory (the output of pwd), which is the scripts' own directory only when they are run from there.
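The optional-argument behaviour described above can be sketched with shell parameter expansion. This is a minimal illustration rather than the scripts' exact code, and start_dir is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the starting directory that would be used.
# Falls back to the current working directory when no argument is given.
start_dir() {
    printf '%s\n' "${1:-$(pwd)}"
}

start_dir /var/log   # prints /var/log
start_dir            # prints the current working directory
```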
NOTE:
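- Scripts create temporary files in the current working directory, so they require write access to it. When the scripts are run from the directory they were copied to, this requirement is already met, because write permission was needed to copy them there.
- Scripts ignore every file without an extension, as well as files whose extension contains a space. Extensions are compared case-insensitively.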
TO-DO:
- Modify the scripts to compile stats without the need to create intermediate files
- Modify the scripts to include files without an extension
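Both TO-DO items could be approached with a single pipeline that counts extensions using sort and uniq -c, so no intermediate files are needed. The following is a rough sketch rather than a drop-in replacement: the function name ext_stats and the "(none)" label for files without an extension are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: count files per extension in one pipeline, with no temp files.
# ext_stats and the "(none)" label are hypothetical, not part of the scripts.
ext_stats() {
    local dir="${1:-.}"   # default to the current directory
    find "$dir" -type f |
        sed -e 's|.*/||' \
            -e '/\./!s/.*/(none)/' \
            -e 's/.*\.//' |
        tr '[:upper:]' '[:lower:]' |   # fold case so TXT and txt count together
        sort |
        uniq -c |                      # prefix each extension with its count
        sort -rn                       # most frequent first
}
# The three sed expressions, in order: drop the directory part of the path,
# label names that contain no dot, and keep only the text after the last dot.
```

Calling ext_stats path-to-starting-directory prints one line per extension with its count in front, most frequent first. With no argument it starts from the current directory.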
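#!/bin/bash
# filestats.sh: list every file extension under a directory, each followed by its file count.
tmpfile="filestats_junk.txt"   # temporary file, created in the current working directory

# Start from the directory given as the first argument, or the current directory.
workdir="${1:-$(pwd)}"

# Collect the distinct extensions, lower-cased and sorted.
# -name "*.*" tests the file name only, so dots in directory names are ignored.
# Extensions that contain a space are skipped.
find "$workdir" -type f -name "*.*" | sed 's/.*\.//' | grep -v " " | tr "[:upper:]" "[:lower:]" | sort -u > "$tmpfile"
filetypelist=$(cat "$tmpfile")
rm "$tmpfile"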

# Print each extension, then the number of files that have it (case-insensitive match).
for i in $filetypelist
do
    echo "$i"
    find "$workdir" -type f -iname "*.$i" | wc -l
done
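#!/bin/bash
# Lists file counts per extension in descending order of frequency.
# Temporary files are created in the current working directory.
tmpfile="filestats_sorted_junk.txt"
oddfile="filestats_sorted_odd.txt"
evenfile="filestats_sorted_even.txt"

# Warning: filestats.sh must be in the same directory as this script.
# Any arguments (the optional starting directory) are passed through to it.
bash "$(dirname "$0")/filestats.sh" "$@" > "$tmpfile"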
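# filestats.sh prints an extension on each odd line and its count on the next line.
sed -n 'p;n' "$tmpfile" > "$oddfile"        # odd lines: extensions
sed -n '1d;p;n' "$tmpfile" > "$evenfile"    # even lines: counts
# Put each count in front of its extension and sort by count, largest first.
paste "$evenfile" "$oddfile" | sort -rn
rm "$tmpfile" "$evenfile" "$oddfile"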
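
The odd/even split above can also be expressed without temporary files, because paste - - joins each pair of consecutive input lines into one tab-separated row. The following is a minimal sketch, assuming input in the format filestats.sh prints (an extension line followed by its count line); pair_and_sort is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: turn alternating "extension / count" lines read from stdin
# into "count<TAB>extension" rows sorted by descending count.
# pair_and_sort is a hypothetical helper, not part of the original scripts.
pair_and_sort() {
    paste - - |                                        # line 1 + line 2, line 3 + line 4, ...
        awk -F '\t' '{ printf "%d\t%s\n", $2, $1 }' |  # count first; %d drops padding from wc -l
        sort -rn                                       # largest count first
}
```

It would be used as a filter, for example ./filestats.sh path-to-starting-directory | pair_and_sort.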