@milothiesen
Forked from costa/find-duplicate-files.rb
Last active June 7, 2016 14:22
#!/usr/bin/ruby
# Find duplicate files under a directory by comparing MD5 digests of their contents.
# Usage: ruby find-duplicate-files.rb LIBRARY_PATH
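require 'digest/md5'
library_path = ARGV[0] or abort "Usage: #{$0} LIBRARY_PATH"
hash = {} # MD5 hex digest => list of files with that content
Dir.glob(File.join(library_path, "**", "*"), File::FNM_DOTMATCH).each do |filename|
  next unless File.file?(filename) # skip directories and other non-regular entries
  # puts 'Checking ' + filename
  # Digest::MD5.file reads the file in chunks instead of loading it all into memory
  key = Digest::MD5.file(filename).hexdigest
  if hash.has_key? key
    # puts "same file #{filename}"
    hash[key].push filename
  else
    hash[key] = [filename]
  end
end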
# Report every group of two or more files that share a digest
hash.each_value do |filename_array|
  if filename_array.length > 1
    puts "=== Identical Files ==="
    filename_array.each { |filename| puts ' ' + filename }
  end
end