@mblarsen
Created May 11, 2015 04:04
Removing duplicate files
#
# Removes duplicate files below the current directory, based on checksum and size (cksum)
# Run each command manually and check that the result is what you expect
#
# Based on: http://www.chriswrites.com/how-to-find-and-delete-duplicate-files-in-mac-os-x/#Terminal
# And: http://stackoverflow.com/a/1450288/204610
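# Find duplicates and write to duplicates-report.txt
# (each line has the form "<checksum> <size> <path>"; the sed step anchors every
# pattern so that "123 45" cannot also match a line starting with "9123 456")
find . -type f -exec cksum {} \; | sort | tee /tmp/f.tmp | cut -f 1,2 -d ' ' | uniq -d | sed 's/^/^/; s/$/ /' | grep -hf /dev/stdin /tmp/f.tmp > duplicates-report.txt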
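# List every file after the first in each checksum/size group in duplicates-only.txt
# (Mac: `brew install gawk`)
sort -s -k1,2 duplicates-report.txt | gawk '{ key = $1 " " $2; if (key == old) { sub(/^[^ ]+ [^ ]+ /, ""); print }; old = key }' > duplicates-only.txt
# Review duplicates-only.txt, then delete the files it lists
# (NUL-separated so that paths containing spaces survive xargs)
tr '\n' '\0' < duplicates-only.txt | xargs -0 rm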
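#
# Optional check to run before deleting anything (a sketch, not part of the original gist).
# cksum is a 32-bit CRC, so two different files can in principle share a checksum and size.
# The helper below, with the hypothetical name confirmed_duplicates, re-reads a report in the
# "<checksum> <size> <path>" format and prints only the paths that are byte-for-byte
# identical (per cmp) to the first file in their group. It assumes paths contain no newlines.

```shell
confirmed_duplicates() (
  # $1: report file with lines of the form "<checksum> <size> <path>"
  key=''    # checksum and size of the current group
  keep=''   # first path seen in the current group (the copy that is kept)
  sort -s -k1,2 "$1" | while read -r sum size path; do
    if [ "$sum $size" != "$key" ]; then
      # New group: remember its first file and keep it
      key="$sum $size"
      keep="$path"
    elif cmp -s "$keep" "$path"; then
      # Same checksum and size, and identical content: a real duplicate
      printf '%s\n' "$path"
    fi
  done
)
```

# Usage: confirmed_duplicates duplicates-report.txt > duplicates-only.txt
# Files that share a checksum and size but differ in content are left out of the list.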