@ehedaya
Last active March 23, 2019 14:37
Simple script to go through all the files in a folder and move any file that is identical to one already seen into a "duplicates" folder. Two files count as identical when they have the same MD5 hash. I used this to save space in a folder that had multiple copies of the same video clips I shot on my iPhone.
# Make a directory to hold any duplicates we find; -p avoids a "directory already exists" error when running a second time.
mkdir -p duplicates
# Create an empty log file to hold hashes so we know which files we have seen before
log="/tmp/md5copylog-$(date +%s).log"
touch "$log"
# For any file (can change this to *.MOV to do just .MOV files, for example)
for f in *.*
do
  # Skip anything that is not a regular file (e.g. a directory whose name contains a dot)
  [ -f "$f" ] || continue
  # Generate an md5 hash (md5 -q is the macOS/BSD command; it prints only the hash)
  m=$(md5 -q "$f")
  # Check if it's in the log file (-x matches the whole line, -F treats the hash as a fixed string)
  if grep -qxF "$m" "$log"
  then
    # If so, move it to the duplicates folder. This only happens for files whose hash we've already logged
    echo "[ DUPE ] $m - $f - Moving to duplicates folder"
    mv "$f" duplicates/
  else
    # Haven't seen this one yet, leave it where it is
    echo "[ UNIQUE ] $m - $f"
    echo "$m" >> "$log"
  fi
done
[ UNIQUE ] f08a8a3d623ceb2031d315aae854591e - 2015-06-26 19.30.55.mov
[ UNIQUE ] 122657da10c46d57e6829a4c4230cc2b - 2015-06-26 20.54.58.mov
[ UNIQUE ] b249f3386c9887d01e8f1164e612b194 - 2015-06-27 11.39.05.mov
[ UNIQUE ] 0e7b900b6be21b8ace4f8ee706535110 - 2015-06-27 11.52.48.mov
[ UNIQUE ] 26c98050c6a18dba617cc4178d35c862 - 2015-06-27 11.57.43.mov
[ DUPE ] 26c98050c6a18dba617cc4178d35c862 - 2015-06-27 11.58.25.mov - Moving to duplicates folder
[ UNIQUE ] 7af2cdd1dde71f7f1098e5f418ce3d15 - 2015-06-27 12.13.46.mov
[ UNIQUE ] 0f4ab9bdb6f10e431485a1a3dfb76ded - 2015-06-27 13.28.14.mov
[ DUPE ] 0f4ab9bdb6f10e431485a1a3dfb76ded - 2015-06-27 13.28.49.mov - Moving to duplicates folder
[ UNIQUE ] be6e01416f822aa74fe824d4b3c6295d - 2015-06-27 13.34.38.mov
[ UNIQUE ] 68bb17a8af073bf3954d6c4c4859de5b - 2015-06-27 13.36.26.mov
[ DUPE ] 68bb17a8af073bf3954d6c4c4859de5b - 2015-06-27 13.36.51.mov - Moving to duplicates folder
[ UNIQUE ] 0e393e7569b1c4e7aeb301eac9e16fbd - 2015-06-27 13.40.42.mov
[ UNIQUE ] 822f40611af9cd6e5e1826ac213b1ce9 - 2015-06-27 18.11.33.mov
[ UNIQUE ] f3ca625aedeb0b666a5eb0d8ddfad169 - 2015-06-28 14.45.53.mov
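The script above relies on `md5 -q`, which is the macOS/BSD command. As a rough sketch only, the same idea could be wrapped in functions that fall back to `md5sum` where that tool is installed (typically Linux). The names `file_md5` and `move_dupes` are hypothetical helpers invented for this example and are not part of the original gist; unlike the original loop, this version looks at every regular file in the given directory, not only names containing a dot, and prints nothing.

```shell
# Hypothetical helper: print only the MD5 hash of the file given as $1.
file_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    # GNU coreutils: output is "<hash>  -", so keep the first field
    md5sum < "$1" | cut -d ' ' -f 1
  else
    # macOS / BSD, as used in the script above
    md5 -q "$1"
  fi
}

# Hypothetical helper: move duplicate files in directory $1 into $1/duplicates.
move_dupes() {
  dir=$1
  # Temporary log of hashes seen so far
  log=$(mktemp) || return 1
  mkdir -p "$dir/duplicates"
  for f in "$dir"/*; do
    # Skip directories (including "duplicates" itself) and other non-regular files
    [ -f "$f" ] || continue
    m=$(file_md5 "$f")
    if grep -qxF "$m" "$log"; then
      # Hash already logged: this file duplicates an earlier one
      mv "$f" "$dir/duplicates/"
    else
      # First time we see this hash: remember it and leave the file in place
      echo "$m" >> "$log"
    fi
  done
  rm -f "$log"
}
```

With these definitions loaded, `move_dupes .` would process the current folder. Files are visited in glob (alphabetical) order, so the copy that sorts first stays in place and later copies are moved.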