@jmitchener
Created September 22, 2010 08:28
#!/usr/bin/env ruby
BEGIN {$VERBOSE = true} # enable warnings (-w)
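# Maps MD5 digests to file names, loaded from a listing that holds one
# "<md5> <file name>" entry per line.
class MusicHash < Hash
  def initialize(file)
    super()
    parse_file file
  end

  private

  def parse_file(file)
    File.open(file) do |f|
      f.each_line do |l|
        # Skip blank or malformed lines; calling captures on a nil match
        # would raise NoMethodError.
        match = l.match(/^(\w+)\s+(.*)$/)
        next unless match

        md5, name = match.captures
        self[md5] = name
      end
    end
  end
end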

unsorted = MusicHash.new('unsorted_md5s')
sorted = MusicHash.new('sorted_md5s')

duplicate_count = 0
unsorted.each do |md5, file|
  if sorted.key? md5
    puts "#{file} is a duplicate"
    duplicate_count += 1
  end
end

puts "Found #{duplicate_count} duplicates out of #{unsorted.size} files."