@heaths
Created August 15, 2012 10:30
Select-Unique
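function Select-Unique
{
    [CmdletBinding()]
    param
    (
        # Properties whose combined values identify a unique object.
        [Parameter(Mandatory=$true, Position=0)]
        [string[]] $Property,

        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        $InputObject,

        # Return a hashtable of key -> first matching object at the end
        # instead of writing objects to the pipeline as they arrive.
        [Parameter()]
        [switch] $AsHashtable,

        # With -AsHashtable, record only the keys; values are left $null.
        [Parameter()]
        [switch] $NoElement
    )

    begin
    {
        # Keys already seen; with -AsHashtable this is also the result.
        $Keys = @{}
    }

    process
    {
        $InputObject | foreach-object {
            $o = $_

            # Build a single string key from the requested property values.
            $k = $Property | foreach-object -begin {
                $s = ''
            } -process {
                # Delimit multiple properties like group-object does.
                if ( $s.Length -gt 0 )
                {
                    $s += ', '
                }

                $s += $o.$_ -as [string]
            } -end {
                $s
            }

            # Only the first object for each key is kept.
            if ( -not $Keys.ContainsKey($k) )
            {
                $Keys.Add($k, $null)

                if ( -not $AsHashtable )
                {
                    # Stream the object immediately.
                    $o
                }
                elseif ( -not $NoElement )
                {
                    $Keys[$k] = $o
                }
            }
        }
    }

    end
    {
        if ( $AsHashtable )
        {
            $Keys
        }
    }
}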

heaths commented Aug 15, 2012

Select-Unique

PowerShell's built-in select-object cmdlet has a -unique parameter that introduces significant performance issues and loss of information. My select-unique cmdlet works around those issues.

Performance

The performance issue arises because select-object -unique enumerates and collects all objects before selecting unique objects based on the properties you specify, so nothing is output until the input ends. select-unique outputs objects while enumerating and keeps track of which unique combinations of property values it has already seen.
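
The difference can be observed by counting how many objects an upstream command has to produce before a downstream select-object -first 1 receives anything. The following is a minimal sketch, not the gist's code: Get-Sample, its file names, and the counter variables are hypothetical, and a where-object filter backed by a hashtable stands in for the same first-seen-wins idea. It assumes PowerShell 3.0 or later, where select-object -first stops upstream commands once it has enough objects.

```powershell
# Hypothetical sample source: counts how many objects it actually produced.
$script:produced = 0
function Get-Sample
{
    foreach ($ext in '.tmp', '.bak', '.tmp', '.log', '.bak')
    {
        $script:produced++
        New-Object PSObject -Property @{
            Name      = ('file{0}{1}' -f $script:produced, $ext)
            Extension = $ext
        }
    }
}

# Buffered: select-object -unique holds its output until the input ends,
# so the whole source is enumerated before -first 1 sees an object.
$script:produced = 0
$null = Get-Sample |
    Select-Object -Unique -Property Extension |
    Select-Object -First 1
$bufferedCount = $script:produced

# Streaming stand-in: pass an object through the first time its key is seen.
$script:produced = 0
$seen = @{}
$first = Get-Sample |
    Where-Object {
        if (-not $seen.ContainsKey($_.Extension))
        {
            $seen.Add($_.Extension, $null)
            $true
        }
    } |
    Select-Object -First 1
$streamedCount = $script:produced

"Buffered pipeline produced $bufferedCount objects; streaming produced $streamedCount."
```

Under those assumptions the buffered pipeline has to enumerate all five sample objects, while the streaming one stops after the first. The gap grows with the size of the input, which is where the cost shows up on a large directory tree.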

Information loss

Because select-object outputs only the properties you specify, you can't select unique objects and still output all of their properties. select-unique is more like a filter: it passes the original objects through unchanged.
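
A small sketch makes the loss visible by listing the property names that survive select-object -unique. The sample objects and variable names here are made up for illustration, and [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Hypothetical sample objects with three properties each.
$items = @(
    [pscustomobject]@{ Name = 'foo.tmp'; Extension = '.tmp'; Length = 4284 }
    [pscustomobject]@{ Name = 'bar.bak'; Extension = '.bak'; Length = 8392 }
    [pscustomobject]@{ Name = 'baz.tmp'; Extension = '.tmp'; Length = 120 }
)

# select-object -unique projects each object down to the named property.
$projected = @($items | Select-Object -Unique -Property Extension)

# Collect the property names left on the first result.
$remaining = @($projected[0].PSObject.Properties | ForEach-Object { $_.Name })

"Unique results: $($projected.Count); properties kept: $($remaining -join ', ')"
```

Name and Length are no longer available on the results, so anything downstream that needs them has to look the original objects up again.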

Comparison

Suppose you want to see one file for each unique file extension in a directory tree. select-object -unique would output only the extensions:

> get-childitem -recurse | select-object -unique Extension

Extension
---------
.tmp
.bak
...

select-unique will, however, output all properties (while maintaining the default type view) and will output objects as they are processed through the pipeline, allowing downstream cmdlets to process those objects as they come.

> get-childitem -recurse | select-unique Extension

    Directory: C:\


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         8/14/2012  11:08 AM       4284 foo.tmp
-a---         8/14/2012  11:12 AM       8392 bar.bak
