How would you count the "Lines of Code" in HowManyLines.ps1 below? Go ahead and look, I'll wait here. 😉 Do you count lines that have nothing but braces on them? Do you count opening and closing braces? Do you count the "else" line?
If we use the Language Parser to count lines, we can take every statement's begin and end line and then count how many unique lines have code on them. But surprisingly, the else keyword doesn't show up as a node of its own, because it's just part of the IfStatementAst.
...
$Ast = [System.Management.Automation.Language.Parser]::ParseFile((Convert-Path "HowManyLines.ps1"), [ref]$Null, [ref]$Null)
$Ast.FindAll({$Args[0] -ne $Ast}, $true) | % { $_.Extent.StartLineNumber, $_.Extent.EndLineNumber } | Sort -Unique
That is, the count will be 12, and the unique line numbers counted are:
2
3
5
6
8
9
11
12
14
15
16
17
Side note: the condition $Args[0] -ne $Ast in the script above is there to avoid outputting the root node (the "file" itself), which would result in always counting the first line, even if it's empty or a comment.
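To see the missing else for yourself without opening a file, here is a minimal sketch that runs the same counting approach against a small inline sample. The sample script and the variable names ($Code, $Lines, $SourceLines) are made up for illustration and are not the contents of HowManyLines.ps1.

```powershell
# Hypothetical inline sample (not the article's HowManyLines.ps1), written in Allman style.
$Code = @'
if ($true)
{
    "yes"
}
else
{
    "no"
}
'@

$Ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$null, [ref]$null)

# Every line on which some AST node (other than the root) starts or ends.
$Lines = $Ast.FindAll({ $Args[0] -ne $Ast }, $true) |
    ForEach-Object { $_.Extent.StartLineNumber, $_.Extent.EndLineNumber } |
    Sort-Object -Unique

# Print the text of every line that no node starts or ends on.
$SourceLines = $Code -split "`r?`n"
1..$SourceLines.Count |
    Where-Object { $_ -notin $Lines } |
    ForEach-Object { "Not counted: line $_ -> $($SourceLines[$_ - 1].Trim())" }
```

On this sample the only line reported should be line 5, the bare else: the braces and strings around it each begin or end a node, but the keyword itself belongs to the enclosing IfStatementAst, which starts on line 1 and ends on line 8.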
Frankly, I'm not too worried about the else not being counted on its own (I wouldn't write it that way myself anyway), but I am concerned about the fact that when I paste that into VS Code (and it gets automatically reformatted), I'm going to get a different number of lines back.
Let's take a look at ManyMoreLines.ps1 (once again, it's been written with Allman style, just to increase the line count).
If we simply count the lines of code (skipping only comments and blank lines), there are 39 lines of code. Our counter script from before will turn up only 37, because it turns out that it also doesn't count the default keyword on line 45. Here's the counter script again, cleaned up to output the count and a list of the lines that weren't counted, so we can look and see why:
$Path = Convert-Path "ManyMoreLines.ps1"
$Ast = [System.Management.Automation.Language.Parser]::ParseFile($Path, [ref]$Null, [ref]$Null)
"Lines of Code: " + ($Ast.FindAll({$Args[0] -ne $Ast}, $true).Extent | % { $_.StartLineNumber, $_.EndLineNumber } | Sort -Unique -ov Counted).Count
"Skipped: " + (1..$($Counted[-1])).Where{ $_ -notin $Counted}
Lines of Code: 37
Skipped: 1 4 7 10 13 17 23 28 33 38 41 44 45 47
The number 37 somewhat over-represents the amount of code that's actually in this script, and the author is getting the blame for a lot of extra lines just because they chose to write in Allman style.
Remember: when I reformat the file as One True Brace Style (OTBS), the only thing I'm removing is the extra newlines around braces. If lines of code is supposed to be a metric for how complicated the code is, we don't want two different answers depending on your code formatting choices...
All we would have to do is call PSScriptAnalyzer to reformat the file, and then we could count as usual. Of course, we couldn't output the list of lines anymore, because we changed them before counting, but this script does return 29, which is the same number the earlier script returns for my reformatted OTBSLines.ps1 script.
$Path = Convert-Path "ManyMoreLines.ps1"
$Reformatted = Invoke-Formatter (Get-Content $Path -Raw) -Settings CodeFormattingOTBS
$Ast = [System.Management.Automation.Language.Parser]::ParseInput($Reformatted, $Path, [ref]$Null, [ref]$Null)
"Lines of Code: " + ($Ast.FindAll({$Args[0] -ne $Ast}, $true).Extent | % { $_.StartLineNumber, $_.EndLineNumber } | Sort -Unique -ov Counted).Count
Remember that we're using lines of code as one measurement of complexity, so having more lines is bad.
There are a few arguments against counting closing braces:
- They are there for style reasons. We could put them on the end of the previous line, but we want them on a new line. We want to make sure authors don't feel penalized for putting them there.
- They are a language artifact. In many languages, we don't use braces for indenting, and even in C# we can frequently leave braces off. To make comparisons across languages easier, and to avoid incentivizing C# developers to leave them off, we should not count them.
- They are a symptom of cyclomatic complexity. We are measuring cyclomatic complexity separately. We don't need to include it here, because when it stands on its own, it has a much bigger impact.
This is really a personal preference, but I don't want to count lines that consist of nothing but a closing brace. Luckily, it's easy to leave them out of the count in the script above. In fact, it dramatically cleans up our code:
$Path = Convert-Path "ManyMoreLines.ps1"
$Reformatted = Invoke-Formatter (Get-Content $Path -Raw) -Settings CodeFormattingOTBS
$Ast = [System.Management.Automation.Language.Parser]::ParseInput($Reformatted, $Path, [ref]$Null, [ref]$Null)
"Lines of Code: " + ($Ast.FindAll({$Args[0] -ne $Ast}, $true).Extent.StartLineNumber | Sort -Unique).Count
This will give us 20 for the ManyMoreLines script, which is basically the number that I want.
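As a rough illustration of what dropping EndLineNumber does, the sketch below computes both counts side by side on a tiny inline sample that is already in OTBS. The sample and the names $WithBraces and $WithoutBraces are hypothetical, chosen only for this example.

```powershell
# Hypothetical inline sample, already formatted as OTBS.
$Code = @'
if ($true) {
    "yes"
} else {
    "no"
}
'@

$Ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$null, [ref]$null)
$Extents = $Ast.FindAll({ $Args[0] -ne $Ast }, $true).Extent

# Lines where a node starts or ends (closing-brace lines are included).
$WithBraces = @($Extents | ForEach-Object { $_.StartLineNumber, $_.EndLineNumber } | Sort-Object -Unique).Count

# Lines where a node starts (a line holding only a closing brace never starts a node).
$WithoutBraces = @($Extents.StartLineNumber | Sort-Object -Unique).Count

"With closing braces:    $WithBraces"
"Without closing braces: $WithoutBraces"
```

For this sample the two counts should come out as 5 and 4. Note that the line containing } else { is still counted in both cases, because the else block starts on it; only the final line, which holds nothing but a closing brace, drops out.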
I feel obliged to point out that it's possible for an author to wrap each of the case statements to a single line, compacting that code to look like this:
switch ($number) {
    1 { "One" }
    2 { "Two" }
    3 { "Three" }
    4 { "Four" }
    5 { "Five" } # Never actually happens
    default { "Unknown! $_" } # Definitely never happens
}
This does not actually change the script, but because of a caveat in the default OTBS rules that ship with PSScriptAnalyzer, those cases won't be unfolded onto three lines, so your line count would, in fact, go down to 14 (from 20). I'm inclined to think of this as benign. It only works when the code in question fits on one line, and I would encourage you to do it only when it makes the code more readable. In this example, I feel that the code is, in fact, measurably easier to read and follow written that way, so I'm not going to quibble about the reduction in the line count. However, there is an easy fix.
If we wanted to ensure that the case and the action are always counted as two lines (and to guarantee that we always get the exact same line count for the code), we would just need to set IgnoreOneLineBlock = $false for the PSPlaceOpenBrace rule in the CodeFormattingOTBS.psd1 configuration. You could even do that by pasting the whole contents of that PSD1 file into the call to Invoke-Formatter.
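A minimal sketch of what passing the settings inline might look like follows. It is a trimmed-down assumption, not the full preset: only the two brace rules are included, their keys are written from my recollection of the CodeFormattingOTBS.psd1 file that ships with PSScriptAnalyzer, and the only value changed is IgnoreOneLineBlock on PSPlaceOpenBrace. Check the keys against the PSD1 in your installed module version before relying on it.

```powershell
# Requires the PSScriptAnalyzer module (Install-Module PSScriptAnalyzer).
$Path = Convert-Path "ManyMoreLines.ps1"

# Assumption: a reduced settings hashtable modelled on the brace rules in CodeFormattingOTBS.psd1.
$Settings = @{
    IncludeRules = @('PSPlaceOpenBrace', 'PSPlaceCloseBrace')
    Rules        = @{
        PSPlaceOpenBrace  = @{
            Enable             = $true
            OnSameLine         = $true
            NewLineAfter       = $true
            IgnoreOneLineBlock = $false   # the one change: unfold one-line blocks too
        }
        PSPlaceCloseBrace = @{
            Enable             = $true
            NewLineAfter       = $false
            IgnoreOneLineBlock = $true
            NoEmptyLineBefore  = $false
        }
    }
}

# -Settings accepts a hashtable as well as a preset name or a path to a PSD1 file.
$Reformatted = Invoke-Formatter -ScriptDefinition (Get-Content $Path -Raw) -Settings $Settings
```

Keeping the settings in a hashtable next to the counting code means the result doesn't depend on which PSScriptAnalyzer presets happen to be installed, which is the point of guaranteeing a repeatable count.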
Since it's simple to refold everything and yields a much more consistent count, I'm going to do that. I'll include an option to count the closing braces (it may be useful to use the difference between the counts with and without them as a stand-in for complexity if you're not using something else). You can see my final Measure-Script file with inline reformatting rules below, and you can install it from the PowerShell Gallery with:
Install-Script Measure-Script
P.S. Yes, I know about PSCodeHealth
Did you know there's a module out there for this already? Actually, some frustration with the line counts that module produces is what spurred me to write this script.