@midwire
Last active October 31, 2023 07:10
[Ruby script to find duplicate movies] Edit as necessary for your own needs. #ruby #utils
#!/usr/bin/env ruby
# Find duplicate movies with possible different extensions
require 'fileutils'
require 'colored'
include FileUtils

module FindDuplicateMovies
  # Directories to scan recursively
  DIRECTORIES = [
    '/Volumes/Media1/Movies'
  ]

  # Extensions treated as movie files
  MOVIE_FILETYPES = [
    '.mp4',
    '.mov',
    '.m4v',
    '.avi',
    '.mkv',
    '.mpg'
  ]

  @file_count = 0
  @file_array = []

  # Return the values that occur more than once in the given array
  def duplicate_count(array)
    array.each_with_object(Hash.new(0)) do |value, hash|
      hash[value] += 1
    end.each_with_object([]) do |(value, count), result|
      result << value if count > 1
    end
  end
  module_function :duplicate_count

  DIRECTORIES.each do |directory|
    Dir.glob(File.join(directory, '**', '*')).each do |current_path|
      # Skip directories first so that only real files are counted
      next if File.directory?(current_path)
      @file_count += 1
      ext = File.extname(current_path)
      next unless MOVIE_FILETYPES.include?(ext)
      # Strip the extension as a literal suffix rather than an unescaped regex
      @file_array << current_path.chomp(ext)
    end
  end

  # Report once, after every directory has been scanned
  dupes = duplicate_count(@file_array)
  if dupes.empty?
    puts('You have no duplicates.'.green)
  else
    puts('You have duplicates:'.yellow)
    dupes.each do |basepath|
      puts(basepath.yellow)
    end
  end

  puts ">>> Analyzed #{@file_count} files".red
end

midwire commented Jan 24, 2020

Docs

Here are some limited docs for those who need them.

Installation

  • Copy the code above to a file (preferably somewhere on your PATH)
  • Install Ruby if your operating system doesn't already include it. There are many tutorials online covering each platform.
  • Install the two required gems (fileutils already ships with Ruby's standard library, so in most setups only colored actually needs installing):
gem install fileutils
gem install colored
  • Modify the DIRECTORIES array to reflect where your movie libraries are located. If you are on Windows it will be something like C:/Path/To/Movies, etc.
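
As a rough illustration, an edited DIRECTORIES array covering more than one library might look like the sketch below. Both paths are placeholders, and the missing_directories helper is a hypothetical addition (it is not part of the script) for catching a mistyped path before a scan.

```ruby
# Placeholder paths; replace them with your own library locations.
DIRECTORIES = [
  '/Volumes/Media1/Movies',
  'C:/Path/To/Movies'
]

# Hypothetical helper (not in the original script): return the configured
# directories that do not exist, so a typo does not silently scan nothing.
def missing_directories(directories)
  directories.reject { |directory| Dir.exist?(directory) }
end

missing_directories(DIRECTORIES).each do |directory|
  warn "Directory not found: #{directory}"
end
```

Forward slashes work in Ruby paths on Windows as well, which is why the Windows entry is written that way.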

Usage

ruby /path/to/find_duplicate_movies.rb

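The script collects the full path of every movie file, minus its extension, into an array and counts the occurrences. Any path that occurs more than once qualifies as a duplicate, so two files are matched only when they sit in the same directory and share a name, differing only in extension.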
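
To make the counting step concrete, here is a minimal sketch that applies the same idea to a hard-coded list of paths instead of a real directory scan. The sample paths and the helper name duplicate_basepaths are made up for illustration, and it assumes Ruby 2.7 or newer for Enumerable#tally; the script itself counts with a Hash.new(0) instead.

```ruby
# Hypothetical helper: return every extension-less path that occurs more than once.
def duplicate_basepaths(paths)
  paths
    .map { |path| path.delete_suffix(File.extname(path)) } # drop the extension
    .tally                                                 # basepath => occurrences
    .select { |_basepath, count| count > 1 }
    .keys
end

# Made-up sample data, not taken from a real library.
sample_paths = [
  '/movies/Alien.mkv',
  '/movies/Alien.mp4',
  '/movies/Brazil.avi',
  '/movies/extras/Brazil.mkv'
]

puts duplicate_basepaths(sample_paths)
```

Only /movies/Alien is printed. The two Brazil files live in different directories, so their extension-less paths differ and they are not counted as duplicates.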

Feedback

This script works perfectly for my needs, but if I can modify it to work better for you please let me know in the comments below.
