PowerShell Performance -- Adding elements to arrays is **very** slow

PowerShell Performance: Adding to arrays is slow

[Image: PowerShell array performance results]

A common pattern in many languages is appending elements to an array or list inside a loop, for example in Python:

elements = []
for x in range(10000):
    elements.append(x)

The equivalent in PowerShell is very costly:

$elements = @()
0..9999 | ForEach-Object {
    $elements += $_
}
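
For comparison, here is a minimal sketch of two common ways to build the same collection without `+=`: capturing pipeline output, and adding to a generic list. The variable names `$fromPipeline` and `$fromList` are illustrative and do not come from the original script.

```powershell
# Option 1: let PowerShell collect the pipeline output into an array for you.
$fromPipeline = 0..9999 | ForEach-Object { $_ }

# Option 2: add to a growable generic list instead of a fixed-size array.
$fromList = [System.Collections.Generic.List[int]]::new()
foreach ($x in 0..9999) {
    $fromList.Add($x)
}

$fromPipeline.Count   # 10000
$fromList.Count       # 10000
```

Both forms end up holding the same 10,000 values; neither rebuilds the collection on every iteration.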

Why is it so bad?

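Array literals in PowerShell create a fixed-size [object[]] (a System.Array), which cannot grow in place.

This means every "append" with += allocates an entirely new array, one element larger, and then copies the full contents into it. Repeating that allocation and copy for each element is expensive, and the total work grows roughly quadratically with the number of elements.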
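
The behaviour described above can be observed directly. This is a small sketch, with illustrative variable names, that keeps a second reference to an array and checks whether `+=` modified that array or replaced it:

```powershell
$a = @(1, 2, 3)
$a.IsFixedSize        # True: the array cannot grow in place

$before = $a          # second reference to the same array object
$a += 4               # builds a new 4-element array and rebinds $a

[object]::ReferenceEquals($a, $before)   # False: $a now points at a different array
$before.Count         # 3: the original array is unchanged
$a.Count              # 4
```

Because `$before` still has three elements, the `+=` did not extend the existing array; it produced a copy.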

Arrays in other languages like Python or JavaScript are more closely related to [System.Collections.Generic.List[T]] in PowerShell or std::vector<T> in C++.

They allocate backing storage larger than the current size needed, so most appends simply write into spare capacity. Instead of roughly 17,000 allocations (one per add in the benchmark script below), you end up with far fewer.
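
To see how few reallocations a growable list needs, the sketch below counts how often `Capacity` changes while adding 17,000 items to a `List[int]`. The counter variables are illustrative, and the exact growth policy is a .NET implementation detail, so treat the numbers it prints as an observation rather than a guarantee.

```powershell
$list = [System.Collections.Generic.List[int]]::new()
$resizes = 0
$lastCapacity = $list.Capacity

foreach ($x in 1..17000) {
    $list.Add($x)
    if ($list.Capacity -ne $lastCapacity) {
        # The backing array was replaced with a larger one.
        $resizes++
        $lastCapacity = $list.Capacity
    }
}

"{0:n0} adds, {1} resizes, final capacity {2:n0}" -f $list.Count, $resizes, $list.Capacity
```

On runtimes where the capacity grows geometrically, the resize count stays small even as the number of adds climbs into the thousands, which is the difference from `+=` on an array.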

See more

For details on when to use ArrayList versus List<T> and when to use Hashtable versus Dictionary<TKey, TValue> see:

using namespace System.Collections.Generic

function Format-Results {
    <#
    .Description
        Nicely format results of multiple Measure-Commands

        output:
            Id TotalSec   TotalMs Test
            -- --------   ------- ----
             2   10.541 10541.370 standard array
             1    0.303   302.691 pipeline array
             0    0.292   292.105 List[string]
    #>
    param(
        # IDictionary accepts both @{} and [ordered]@{} (the caller passes an ordered dictionary)
        [Parameter(Mandatory, HelpMessage = "Dictionary of Measure-Command results")]
        [System.Collections.IDictionary]$results
    )

    $id = 0
    $results.Keys | ForEach-Object {
        $key = $_
        [pscustomobject][ordered]@{
            Test       = $key
            TotalMs    = '{0,10:f3}' -f $results[$key].TotalMilliseconds
            TotalMsRaw = $results[$key].TotalMilliseconds   # unformatted, used for sorting
            TotalSec   = '{0,8:f3}' -f $results[$key].TotalSeconds
            Id         = ($id++)
        }
    } | Sort-Object TotalMsRaw -Descending |
        Format-Table Id, TotalSec, TotalMs, Test
}

$results = [ordered]@{}
$ls_all = Get-ChildItem ~ -Depth 4

# Slow: every += allocates a new array and copies the old contents
$results['standard array'] = Measure-Command {
    $FilesImages = @()
    $ls_all | ForEach-Object {
        if ($_.Extension -match 'png|jpg') {
            $FilesImages += $_
        }
    }
}

# Fast: let the pipeline collect the output into an array
$results['pipeline array'] = Measure-Command {
    $FilesImages = $ls_all | ForEach-Object {
        if ($_.Extension -match 'png|jpg') {
            $_
        }
    }
}

# at this scale ( ~17k adds ) it's almost equal to implicit pipes,
# even with: [list[string]]::new($FilesImages.Count)
$results['List[string]'] = Measure-Command {
    $ImageNames = [List[string]]::new()
    $ls_all | ForEach-Object {
        if ($_.Extension -match 'png|jpg') {
            $ImageNames.Add($_)
        }
    } | Out-Null
}

"Of {0:n0} files, {1:n0} matched" -f $ls_all.Count, $FilesImages.Count | Write-Host -ForegroundColor Green
Format-Results -results $results

# Compare the raw millisecond values rather than dividing the TimeSpan objects directly
"Pipeline was {0:f2} times faster than standard array" -f ( $results['standard array'].TotalMilliseconds / $results['pipeline array'].TotalMilliseconds ) | Write-Host -ForegroundColor Red