@godefroi
Created October 23, 2019 17:14
# a list of paths (relative to the current directory) to retrieve files from
$paths = @(
    'databak\DCIM'
    'DCIM'
)
# a dictionary to store the hashes
$hashes = new-object 'System.Collections.Generic.Dictionary`2[string, System.Collections.Generic.List`1[System.IO.FileInfo]]'
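# compute the hashes of all the files, grouping the files by hash
get-childitem -path $paths -recurse -file | foreach {
    # pass the full path as a literal path, so the lookup does not depend on how
    # the file object converts to a string, and [ or ] in a name is not a wildcard
    $hash = get-filehash -literalpath $_.FullName -algorithm 'MD5'
    if( -not $hashes.ContainsKey($hash.Hash) ) {
        $hashes.add($hash.Hash, (new-object 'System.Collections.Generic.List`1[System.IO.FileInfo]'))
    }
    $hashes[$hash.Hash].add($_)
}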
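# $hashes now holds a dictionary where keys are hashes and values are lists of
# files having that hash.
# count the hashes shared by more than one file, i.e. the number of groups of
# duplicate files (reported in the Count property of the output)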
$hashes.getenumerator() | where { $_.value.count -gt 1 } | measure-object
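# A possible follow-up, sketched here and not part of the original script:
# list the files in each group of duplicates so they can be reviewed by hand.
# The function name get-duplicategroup and the Hash/Count/Files property names
# are made up for this example; it assumes $hashes was filled in as above.
function get-duplicategroup($table) {
    foreach( $key in $table.keys ) {
        $files = $table[$key]
        # only hashes shared by two or more files are duplicates
        if( $files.count -gt 1 ) {
            [pscustomobject]@{
                Hash  = $key
                Count = $files.count
                Files = $files.fullname
            }
        }
    }
}
# show each group of duplicates along with the full paths of its files
get-duplicategroup $hashes | format-list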