@ross-spencer
Last active December 20, 2016 23:14
Recipe to output duplicate files in a given directory...
#!/usr/bin/env bash
# 1. create hashes only
# 2. sort so duplicate lines follow each other
# 3. output duplicated checksums (one of each)
# 4. rerun hash tool in matching mode to output the paths and checksums of the duplicates
# Explain Shells...
# ES1: http://explainshell.com/explain?cmd=sha1deep+-q+-r+-s+%22%24DIR%22+%7C+sort+%7C+uniq+-d+%3E+%22%24temp_file%22
# ES2: http://explainshell.com/explain?cmd=sha1deep+-r+-s+-M+%22%24temp_file%22+%22%24DIR%22
set -o errexit
set -o pipefail
set -o nounset
# fail with a usage message instead of a bare "unbound variable" error
DIR="${1:?usage: $0 DIRECTORY}"
temp_file=$(mktemp "/tmp/dupes.sh.tmp.XXXXXX")
# remove the temporary file on exit, including when errexit aborts the script
trap 'rm -f -- "$temp_file"' EXIT
sha1deep -q -r -s "$DIR" | sort | uniq -d > "$temp_file" #ES1
sha1deep -r -s -M "$temp_file" "$DIR" #ES2
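
For machines without sha1deep, the same idea (hash everything, keep the checksums that occur more than once, then print the matching paths) can be sketched with standard tools. This is only an illustration, not the recipe above: it assumes GNU `sha1sum`, `find`, `awk` and `sort` are available, and `find_dupes` is a hypothetical function name. File names containing backslashes or newlines are escaped by `sha1sum` and may not be grouped correctly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): list duplicate
# files under a directory as "<sha1>  <path>" lines, grouped by checksum.
# Assumes GNU sha1sum, find, awk and sort.
find_dupes() {
    local dir="$1"
    local sums
    sums=$(mktemp)
    # hash every regular file below $dir, one "<sha1>  <path>" line each
    find "$dir" -type f -exec sha1sum {} + > "$sums"
    # first pass counts each checksum; second pass prints the lines whose
    # checksum was seen more than once; sort groups equal checksums together
    awk 'NR == FNR { count[$1]++; next } count[$1] > 1' "$sums" "$sums" | sort
    rm -f -- "$sums"
}
```

Once the function is defined in a shell, `find_dupes /some/directory` prints one line per duplicated file and nothing when every file is unique.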