@renoirb
Last active August 2, 2017 03:15
Getting data about files on many Hard Drives
#!/usr/bin/env python
# Install as /usr/bin/filetype (the Bash script below calls it by that path).
# Uses only the standard library (sys, os, hashlib, mimetypes); no pip module is needed.
# Prints one CSV line per file: "path","mime type","size in bytes","hash"
from __future__ import print_function

import hashlib
import os
import sys
from mimetypes import MimeTypes


class FileHasher(object):
    '''
    Hash file contents
    '''

    def __init__(self, path, alg='sha1'):
        self.path = path
        if alg == 'sha1':
            self.hasher = hashlib.sha1()
        elif alg == 'md5':
            self.hasher = hashlib.md5()
        else:
            raise ValueError('Unsupported Hashing algorithm')
        # Read in 1 MiB chunks so large media files are never fully in memory.
        with open(path, 'rb') as handle:
            while True:
                data = handle.read(0x100000)
                if not data:
                    break
                self.hasher.update(data)

    def __str__(self):
        return '{0}'.format(self.hasher.hexdigest())


mime = MimeTypes()
# Accept any number of paths, so the script also works when called with several files.
for fileName in sys.argv[1:]:
    ## e.g.
    #print(stat)
    #posix.stat_result(st_mode=33279, st_ino=12602, st_dev=16657, st_nlink=1, st_uid=1024, st_gid=100, st_size=4109788, st_atime=1440119440, st_mtime=1287784330, st_ctime=1400614047)
    stat = os.stat(fileName)
    fileSize = stat.st_size
    fileHash = FileHasher(fileName, 'sha1')
    # guess_type() returns (type, encoding) and looks at the file name only.
    fileType = mime.guess_type(fileName)
    print('"{}","{}","{}","{}"'.format(fileName, fileType[0], fileSize, fileHash))
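
The script above builds its CSV line by hand, so a file name containing a double quote would produce a line that CSV readers cannot parse. A minimal sketch of one way around that, assuming Python 3 and the standard `csv` module; `format_row` is a hypothetical helper name, not part of the original gist:

```python
import csv
import io


def format_row(path, mime_type, size, digest):
    """Return one CSV line (no trailing newline) with every field quoted."""
    buffer = io.StringIO()
    # QUOTE_ALL quotes every field; embedded double quotes are doubled.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="")
    writer.writerow([path, mime_type, size, digest])
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative values only; the digest here is a placeholder string.
    print(format_row('My "best" song, live.mp3', "audio/mpeg", 4109788, "abc123"))
```

The column order matches the original output (path, MIME type, size, hash), so existing log consumers should not need to change.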
#!/bin/bash
set -e
TIMESTAMP=$(date +%Y%m%d%H%M%S)
LOGFILEPATH="/volumeUSB3/Somewhere"
# Volume roots to scan (relative to /).
declare -a DIRECTORIES=(
  "volumeUSB1"
  "volumeUSB2"
  "volume1"
)
# File extensions, matched case-insensitively.
declare -a EXT_MUSIC=(
  "mp3"
  "wma"
  "flac"
  "ogg"
  "m4a"
  "oga"
)
declare -a EXT_VIDEOS=(
  "mp4"
  "mkv"
  "flv"
  "avi"
  "wmv"
  "rm"
  "asf"
  "m4v"
  "mpg"
  "mpeg"
  "m2v"
)
# join_by SEP ITEM...: print the ITEMs joined by SEP, without a trailing newline.
function join_by {
  local d=$1
  shift
  echo -n "$1"
  shift
  printf "%s" "${@/#/$d}"
}
#find /volumeUSB2 -type f -iregex ".*\.\($(join_by \\\| ${EXT_MUSIC[@]})\)" -print
#find /volumeUSB2 -type f -iregex '.*\.\(mp3\|wma\|m4a\)$'
for d in "${DIRECTORIES[@]}"; do
  # Run filetype once per file (\;) so each invocation receives exactly one path.
  FILE="${LOGFILEPATH}/music_${d}.${TIMESTAMP}.txt"
  echo "At ${d} looking for ${EXT_MUSIC[*]}, logging to ${FILE}"
  find "/${d}" -type f -iregex ".*\.\($(join_by '\|' "${EXT_MUSIC[@]}")\)" -exec /usr/bin/filetype "{}" \; >> "${FILE}"
  FILE="${LOGFILEPATH}/video_${d}.${TIMESTAMP}.txt"
  echo "At ${d} looking for ${EXT_VIDEOS[*]}, logging to ${FILE}"
  find "/${d}" -type f -iregex ".*\.\($(join_by '\|' "${EXT_VIDEOS[@]}")\)" -exec /usr/bin/filetype "{}" \; >> "${FILE}"
done
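
For comparison, the extension matching done by `find -iregex` above can be sketched in Python. This is only an illustration, assuming Python 3; `build_extension_pattern` and `find_media` are hypothetical names, and the demo runs on a temporary directory rather than on the real volumes:

```python
import os
import re
import tempfile


def build_extension_pattern(extensions):
    """Compile a case-insensitive regex matching paths that end in .<ext>."""
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    return re.compile(r".*\.(?:" + alternatives + r")", re.IGNORECASE | re.DOTALL)


def find_media(root, extensions):
    """Yield paths of regular files under root whose extension is listed."""
    pattern = build_extension_pattern(extensions)
    for directory, _subdirs, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(directory, name)
            # fullmatch mirrors find's behaviour of matching the whole path.
            if os.path.isfile(path) and pattern.fullmatch(path):
                yield path


if __name__ == "__main__":
    # Demo on a throwaway directory with three empty files.
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.MP3", "b.flac", "notes.txt"):
            open(os.path.join(root, name), "w").close()
        for path in find_media(root, ["mp3", "flac"]):
            print(os.path.basename(path))
```

Joining the extensions with `|` plays the same role as `join_by` with `\|` in the Bash script; `re.escape` guards against extensions that contain regex metacharacters.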