@astarasikov
Last active August 29, 2015 14:18
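#!/usr/bin/env ruby
# Find duplicate files under a directory by comparing MD5 digests.
require 'digest'
require 'set'

if ARGV.empty?
  # "$0" is not interpolated inside a string; #{$0} is.
  puts "Usage: #{$0} dir"
  exit 1
end
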
# Record path under the MD5 digest of the file's contents.
def append_file(hashes, path)
  md5 = Digest::MD5.file(path).hexdigest
  hashes[md5] ||= Set.new
  hashes[md5].add(path)
end

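# Recursively walk top and hash every regular file of at most 50 MiB.
def scan_dir(hashes, top)
  Dir.foreach(top) do |item|
    next if item == '.' || item == '..'
    file_path = File.join(top, item)
    $stderr.puts "AT #{file_path}"
    if File.directory?(file_path)
      scan_dir(hashes, file_path)
    elsif File.file?(file_path)
      # Check the size only for regular files, so a broken symlink
      # is skipped instead of raising Errno::ENOENT in File.size.
      next if File.size(file_path) > (50 << 20)
      append_file(hashes, file_path)
    end
  end
end
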
# Print each group of paths that share a digest, one blank line per group.
def print_duplicates(hashes)
  hashes.each_value do |paths|
    next if paths.length <= 1
    puts ""
    paths.each do |path|
      puts path
    end
  end
end

paths_by_hash = {}
scan_dir(paths_by_hash, ARGV[0])
print_duplicates(paths_by_hash)
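
The script above groups paths by the MD5 digest of their contents and reports every group with more than one member. A minimal, self-contained sketch of that grouping step follows, run against throwaway files in a temporary directory. The helper names `group_by_digest` and `duplicate_groups` are hypothetical and do not appear in the script; the sketch also takes an explicit list of paths rather than walking a directory.

```ruby
require 'digest'
require 'set'
require 'tmpdir'

# Hypothetical helper (not part of the script above): map each MD5 digest
# to the set of paths whose contents produce it.
def group_by_digest(paths)
  groups = Hash.new { |hash, key| hash[key] = Set.new }
  paths.each do |path|
    groups[Digest::MD5.file(path).hexdigest] << path
  end
  groups
end

# Hypothetical helper: keep only the groups holding more than one path,
# returned as sorted arrays.
def duplicate_groups(paths)
  group_by_digest(paths).values
                        .select { |set| set.length > 1 }
                        .map { |set| set.to_a.sort }
end

# Usage: three throwaway files, two of them with identical contents.
Dir.mktmpdir do |dir|
  a = File.join(dir, 'a.txt')
  b = File.join(dir, 'b.txt')
  c = File.join(dir, 'c.txt')
  File.write(a, 'same')
  File.write(b, 'same')
  File.write(c, 'different')

  duplicate_groups([a, b, c]).each do |group|
    puts group.map { |path| File.basename(path) }.join(' ')
  end
end
# prints "a.txt b.txt"
```

Two files with the same MD5 digest are very likely, but not guaranteed, to be identical; a byte-for-byte comparison of each reported group would confirm it.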