@alexec
Created January 31, 2016 18:06
De-duplicate a directory of files
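#!/bin/bash
# Move files whose content duplicates an already-seen file into ./dups.
# The first-seen path for each SHA-1 digest is recorded under ~/.dedup,
# relative to the directory the script is run from.
set -eu

DB=~/.dedup
DUPS=dups
mkdir -p "$DB" "$DUPS"

# Skip the dups directory itself so already-moved files are not rescanned.
find . -type f ! -path "./$DUPS/*" | while IFS= read -r F; do
    printf "%-80s\r" "$F"
    # Keep only the digest; openssl prefixes it with a label such as "(stdin)= ".
    HASH=$(openssl sha1 < "$F" | awk '{print $NF}')
    HASH_FILE="$DB/$HASH"
    EXPECTED=""
    if [ -e "$HASH_FILE" ]; then
        EXPECTED=$(cat "$HASH_FILE")
    fi
    if [ -n "$EXPECTED" ] && [ -e "$EXPECTED" ]; then
        if [ "$EXPECTED" != "$F" ]; then
            echo "$F duplicates $EXPECTED"
            mv -f "$F" "$DUPS/"
        fi
    else
        # First sighting, or the recorded original is gone: record this file.
        echo "$F" > "$HASH_FILE"
    fi
done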
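
The script relies on `openssl sha1` for hashing, and the label in front of the digest varies between OpenSSL versions. Below is a minimal sketch of a helper that prints only the bare digest and falls back across common tools. The function name `file_sha1` is hypothetical and not part of the original gist. It assumes that at least one of `sha1sum`, `shasum`, or `openssl` is on the `PATH`.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the bare SHA-1
# hex digest of the file given as $1, using whichever tool is installed.
file_sha1() {
    if command -v sha1sum >/dev/null 2>&1; then
        # GNU coreutils: output is "<hex>  -" when reading stdin.
        sha1sum < "$1" | awk '{print $1}'
    elif command -v shasum >/dev/null 2>&1; then
        # Perl shasum, shipped with macOS: same output layout as sha1sum.
        shasum -a 1 < "$1" | awk '{print $1}'
    else
        # openssl prints a label, then the digest as the last field.
        openssl sha1 < "$1" | awk '{print $NF}'
    fi
}
```

Inside the loop it could stand in for the hashing line, for example `HASH=$(file_sha1 "$F")`, so that the database file names are plain 40-character digests regardless of which tool produced them.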