@zengargoyle
Created January 10, 2011 01:27
Delete files in the second directory that duplicate files found in the first directory
Usage: perl nukedups.pl $currentfolder $oldfolder
Run once with just the print and check output for sanity. If all looks good uncomment the 'unlink' line and re-run.
This code builds a list of the MD5 sums of the files in the first directory, then deletes the files in the second directory if their MD5 sum matches one of the sums seen in the first directory.
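The matching described above depends only on file contents, not on file names. As a minimal sketch of that idea, the following builds a digest index from one file and checks two candidates against it. The helpers `file_md5` and `write_file` and the file names are hypothetical and exist only for this illustration; they are not part of the script.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempdir);

# Hypothetical helper (not in the script): raw MD5 digest of one file's contents.
sub file_md5 {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->digest;
    close $fh;
    return $digest;
}

# Hypothetical helper: write a small file and return its path.
sub write_file {
    my ($path, $contents) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $contents;
    close $fh;
    return $path;
}

my $dir     = tempdir(CLEANUP => 1);
my $current = write_file("$dir/current.txt",      "same bytes\n");
my $copy    = write_file("$dir/renamed-copy.txt", "same bytes\n");
my $other   = write_file("$dir/other.txt",        "different bytes\n");

# Index the digests seen in the "current" set, as the script does with %found.
my %found = (file_md5($current) => 1);

for my $candidate ($copy, $other) {
    my $verdict = exists $found{ file_md5($candidate) } ? "duplicate" : "unique";
    print "$candidate: $verdict\n";
}
```

The renamed copy is reported as a duplicate even though its name differs, while the file with different contents is reported as unique.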
Follow with:
find $oldfolder -depth -type d -exec rmdir {} \;
to delete empty directories. Directories that still contain files will produce an error message but won't be deleted.
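If you would rather stay in Perl for the cleanup step, the same depth-first removal can be sketched with `File::Find::finddepth`. This is an assumed equivalent, not part of the original gist: `remove_empty_dirs` is a hypothetical name, and unlike the `find` command it stays silent when a directory is not empty instead of printing an error.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: remove empty directories under $root, deepest first.
# Returns the list of directories that were actually removed.
sub remove_empty_dirs {
    my ($root) = @_;
    my @removed;
    finddepth(
        {
            no_chdir => 1,    # do not chdir into directories we may remove
            wanted   => sub {
                return unless -d $_;
                # rmdir returns false and leaves non-empty directories alone
                push @removed, $_ if rmdir $_;
            },
        },
        $root
    );
    return @removed;
}

# Demonstration on a throwaway tree: one empty directory, one with a file.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/empty" or die "mkdir failed: $!";
mkdir "$root/kept"  or die "mkdir failed: $!";
open my $fh, '>', "$root/kept/file.txt" or die "open failed: $!";
close $fh;

remove_empty_dirs($root);
printf "empty: %s\n", (-d "$root/empty" ? "still there" : "removed");
printf "kept: %s\n",  (-d "$root/kept"  ? "still there" : "removed");
```

Because `finddepth` visits a directory only after its contents, a parent that becomes empty once its children are removed is cleaned up in the same pass.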
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Find;
my %found;
# collect MD5 hashes of files in first directory given
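File::Find::find(
    sub {
        return unless -f $_;
        # three-argument open in read mode; skip files that cannot be read
        open my $f, '<', $_ or do {
            warn "Cannot open $File::Find::name ($!)\n";
            return;
        };
        binmode $f;    # hash the raw bytes
        my $ctx = Digest::MD5->new;
        $ctx->addfile($f);
        close $f;
        # only the key matters; it is tested with 'exists' later
        $found{ $ctx->digest } = 1;
    },
    $ARGV[0]
);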
# find files in second directory given, if their MD5 hash
# was seen in the first directory, do something about it.
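File::Find::find(
    sub {
        return unless -f $_;
        # three-argument open in read mode; skip files that cannot be read
        open my $f, '<', $_ or do {
            warn "Cannot open $File::Find::name ($!)\n";
            return;
        };
        binmode $f;    # hash the raw bytes
        my $ctx = Digest::MD5->new;
        $ctx->addfile($f);
        close $f;
        my $d = $ctx->digest;
        # dry run: list the duplicates that would be deleted
        print "$File::Find::name\n" if exists $found{ $d };
        # uncomment the next line to actually delete them
        #unlink $_ if exists $found{ $d };
    },
    $ARGV[1]
);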