@datacustodian
Last active June 24, 2017 14:42
File Stats

File Stats bash Scripts

Two scripts that count the files under a directory and its subdirectories, grouped by file extension. Extensions are compared case-insensitively. The first script lists the counts in lexicographic order of extension. The second script lists them in descending order of count.

Usage:

To obtain file statistics of the directory path-to-starting-directory:

./filestats.sh [path-to-starting-directory]

Square brackets indicate that the directory parameter is optional. If no directory path is specified, filestats.sh starts from the current working directory (the output of pwd). That is the directory where the scripts are located only when they are run from there.
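The optional-argument fallback can be sketched with a parameter-expansion default. This is a minimal illustration, not the code the scripts use: resolve_workdir is a hypothetical helper name, and the only assumption carried over from filestats.sh is that the fallback is whatever pwd reports.

```bash
# Hypothetical helper (not part of the original scripts): print the
# starting directory, falling back to the current working directory.
resolve_workdir() {
  # ${1:-...} expands to $1 unless $1 is unset or empty,
  # which mirrors the [ -n "$1" ] test in filestats.sh.
  printf '%s\n' "${1:-$(pwd)}"
}
```

Inside a script it could be called as workdir="$(resolve_workdir "$1")".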

NOTE:

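  • Scripts create temporary files in the current working directory, so they require write access to it. When the scripts are run from the directory where they are located, this requirement should be acceptable, because write permission was needed to copy the scripts to that directory.
  • Scripts ignore every file without an extension, and every file whose extension contains a space.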
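Which files are skipped can be made concrete with a small helper. Besides files without an extension, filestats.sh also drops extensions that contain a space (the grep -v " " step). The sketch below is an illustration under assumptions: file_extension is a hypothetical name, and it looks only at the base name of the path.

```bash
# Hypothetical helper: print the lowercased extension of a path,
# or return 1 when the file would be skipped.
file_extension() {
  local name ext
  name="${1##*/}"              # strip directory components
  case "$name" in
    *.*) ;;                    # base name contains a dot
    *) return 1 ;;             # no extension
  esac
  ext="${name##*.}"            # text after the last dot
  [ -n "$ext" ] || return 1    # trailing dot, nothing after it
  case "$ext" in
    *" "*) return 1 ;;         # extension contains a space
  esac
  # A leading-dot name such as .bashrc yields "bashrc".
  printf '%s\n' "$ext" | tr '[:upper:]' '[:lower:]'
}
```

For example, file_extension /a/b/Photo.JPG prints jpg, while file_extension /a/b/Makefile prints nothing and returns a non-zero status.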

TO-DO:

  • Modify the scripts to compile stats without the need to create intermediate files
  • Modify the scripts to include files without extension
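One possible approach to both TO-DO items is a single pipeline that counts in awk instead of writing intermediate files. This is a sketch under assumptions, not a drop-in replacement: count_by_extension and the "(none)" label for files without an extension are hypothetical, extensions containing spaces are counted rather than dropped, and file names containing newlines are not handled.

```bash
# Hypothetical function: count files under a directory by lowercased
# extension, using no temporary files. Files without an extension are
# grouped under the made-up label "(none)".
count_by_extension() {
  find "${1:-.}" -type f |
    sed 's|.*/||' |                      # keep the base name only
    awk '{
      n = split($0, parts, /\./)
      # No dot, or nothing after the last dot: treat as no extension.
      ext = (n > 1 && parts[n] != "") ? tolower(parts[n]) : "(none)"
      count[ext]++
    }
    END { for (e in count) printf "%d\t%s\n", count[e], e }' |
    # Highest count first; ties broken by extension in byte order.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

Each output line is a count, a tab, and an extension, which is the same layout filestats_sorted.sh prints. Replacing the final sort keys with -k2,2 alone would give the lexicographic ordering of filestats.sh instead.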

filestats.sh

#!/bin/bash

tmpfile="filestats_junk.txt"

workdir="`pwd`"

if [ -n "$1" ]
then
  workdir="$1"
fi

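# List the distinct, lowercased extensions found under $workdir.
# Directory components are stripped first so that a dot in a directory
# name is not mistaken for the start of an extension.
echo `find "$workdir" -type f | sed 's|.*/||' | grep "\." | sed 's/.*\.//' | grep -v " " | tr "[:upper:]" "[:lower:]" | sort -u` > "$tmpfile"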

filetypelist=`cat "$tmpfile"`

rm "$tmpfile" 

# Print each extension followed by the number of files that carry it.
for i in $filetypelist
do
  echo "$i"
  find "$workdir" -type f -iname "*.$i" | wc -l
done

filestats_sorted.sh

#!/bin/bash

tmpfile="filestats_sorted_junk.txt"
oddfile="filestats_sorted_odd.txt"
evenfile="filestats_sorted_even.txt"

# Warning: filestats.sh must be in the same directory as this script.
# The optional starting directory is forwarded to it.

bash "$(dirname "$0")/filestats.sh" "$@" > "$tmpfile"

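# filestats.sh prints an extension on each odd line and its count on
# the following even line; split the two into separate files.
sed -n 'p;n' "$tmpfile" > "$oddfile"
sed -n '1d;p;n' "$tmpfile" > "$evenfile"

# Join them as "count<TAB>extension" and sort by count, highest first.
paste "$evenfile" "$oddfile" | sort -r -g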

rm "$tmpfile" "$evenfile" "$oddfile"