I'd like to focus on the following:
- Choosing Method of Mutation
- Equivalent Mutants
- Test Selection
- Infinite Loop Detection
- Unit Test Framework Support
- Visual Studio Integration
- .NET / .NET Core Support
- Performance
- Online Submissions
So there are two methods of mutation:
- ByteCode / IL Mutation
- Source Code / AST Mutation via 'Mutant Schemata'
With IL mutation we take a copy of the raw assembly and seed one small mutation in that assembly. Each new mutation precipitates a new assembly. We then run the tests.
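As a rough illustration of that copy-and-seed cycle, here is a minimal sketch using Mono.Cecil. It assumes the Mono.Cecil NuGet package is referenced and that the output directory already exists. The `IlMutator` class, the single add-to-subtract operator and the `mutant_N.dll` naming are hypothetical choices for illustration, not a proposed design.

```csharp
using System.Collections.Generic;
using System.IO;
using Mono.Cecil;
using Mono.Cecil.Cil;

public static class IlMutator
{
    // Hypothetical single operator: arithmetic 'add' becomes 'sub'.
    public static bool TryMutate(Instruction instruction)
    {
        if (instruction.OpCode.Code != Code.Add) return false;
        instruction.OpCode = OpCodes.Sub;
        return true;
    }

    // Writes one mutated copy of the assembly per mutation point and
    // returns the paths of the copies. The original file is never modified.
    public static List<string> WriteMutants(string assemblyPath, string outputDir)
    {
        var written = new List<string>();
        for (int target = 0; ; target++)
        {
            // Re-read the pristine assembly so each copy carries exactly one mutation.
            using (var assembly = AssemblyDefinition.ReadAssembly(assemblyPath))
            {
                if (!MutateNth(assembly, target)) break; // no more mutation points
                string path = Path.Combine(outputDir, "mutant_" + target + ".dll");
                assembly.Write(path);
                written.Add(path);
            }
        }
        return written;
    }

    // Mutates the n-th 'add' instruction (zero-based); returns false if there is none.
    private static bool MutateNth(AssemblyDefinition assembly, int n)
    {
        int seen = 0;
        foreach (var type in assembly.MainModule.GetTypes())
            foreach (var method in type.Methods)
            {
                if (!method.HasBody) continue;
                foreach (var instruction in method.Body.Instructions)
                {
                    if (instruction.OpCode.Code != Code.Add) continue;
                    if (seen++ == n) return TryMutate(instruction);
                }
            }
        return false;
    }
}
```

Each file returned by `WriteMutants` would then be swapped in for the original assembly before running the tests. Note that this sketch mutates every `add` it finds, including any in compiler-generated code.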
One drawback is that only IL-specific mutations are possible: C# code with no direct IL mapping cannot be mutated. This can be deemed a disadvantage as it limits the number of different types of mutation that can be performed.
However, there is a standard suite of mutation operators which have been researched, understood and thoroughly used. They are good especially for generating a low yield of 'Equivalent Mutants' (see below).
Research has been undertaken into mutants that are specific to C# code. One example is removing 'this.' from the front of member variables. This specific mutant has a high probability of generating an equivalent mutant.
A mutation testing framework that uses IL mutation can be used by any .NET language. Whilst C# covers most of the ecosystem, there is still plenty of use for VB, although a recent statement by Mads Torgersen suggests that Microsoft are starting to distance themselves from it.
That said, there's plenty of C++.NET out there, and F# seems to be gaining a lot more traction. It's in the F# space in particular that I think we're seeing a lot of innovation. Its heavy use in fintech and actuarial modelling almost prescribes the need for robust unit tests.
Consider also the philosophical point of what this tool is trying to promote: high quality code on the .NET platform. And, of course, be prepared for the endless requests from other language users for this feature.
Henry commented on the potential for junk mutants. I get a sense that this might be an even bigger problem with .NET, though I'm not suitably placed in my knowledge of Java to provide an authoritative synopsis of this. A few areas that spring to mind in this respect are:
- Generics - Java uses 'Type Erasure' to realise this. C# uses 'Reification'. Might this be a source of more junk mutants?
- 'dynamic' - Under the hood, dynamic types get converted into IL that uses reflection. Having compared the IL for dynamic types with that for static ones, I've observed at times 30 extra lines of IL to achieve the same result.
- async/await - I believe under the hood this is realised using a 'continuation-passing style' pattern, which the compiler implements as a generated state machine. This must generate a lot of extra IL.
- 'yield' - Under the hood, I think this precipitates an iterator-style GoF pattern.
With source code / AST mutation we mutate the abstract syntax tree. It could simply serve as an alternative approach to IL mutation, but it also opens the door to the use of mutant schemata, where every mutant is compiled into a single build and switched on one at a time at run time.
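To make the schemata idea concrete, here is a hand-written sketch of what a rewritten method might look like after mutation. `MutantSwitch`, `Calculator` and the mutant ids are all hypothetical names invented for the example; a real tool would generate this code rather than have anyone write it.

```csharp
public static class MutantSwitch
{
    // Id of the mutant to activate; 0 leaves the original behaviour in place.
    public static int ActiveMutant { get; set; }
}

public static class Calculator
{
    // Original body: return a + b;
    // In the schemata version every mutant is compiled in, guarded by the switch.
    public static int Add(int a, int b)
    {
        switch (MutantSwitch.ActiveMutant)
        {
            case 1: return a - b;   // mutant 1: + replaced by -
            case 2: return a * b;   // mutant 2: + replaced by *
            default: return a + b;  // original code
        }
    }
}
```

The attraction is that the code is compiled once, and the runner only has to change `ActiveMutant` between test runs. The cost is the extra conditional seeded at every mutation point.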
As has already been mooted, Roslyn would appear to be tailor made for this.
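For a feel of the Roslyn side, here is a minimal sketch of a syntax rewriter, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. `AddToSubtractRewriter` and `AstMutator` are hypothetical names. The sketch rewrites every binary `+` at once purely to show the mechanics, whereas a real tool would mutate one site at a time or emit a schemata switch.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Rewrites every binary '+' expression into '-'.
public sealed class AddToSubtractRewriter : CSharpSyntaxRewriter
{
    public override SyntaxNode VisitBinaryExpression(BinaryExpressionSyntax node)
    {
        // Visit children first so nested expressions are rewritten too.
        var visited = (BinaryExpressionSyntax)base.VisitBinaryExpression(node);
        if (!visited.IsKind(SyntaxKind.AddExpression)) return visited;
        return SyntaxFactory.BinaryExpression(
            SyntaxKind.SubtractExpression, visited.Left, visited.Right);
    }
}

public static class AstMutator
{
    // Parses the source, applies the rewriter and returns the mutated source text.
    public static string Mutate(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();
        SyntaxNode mutated = new AddToSubtractRewriter().Visit(root);
        // NormalizeWhitespace tidies the spacing around the replaced operator.
        return mutated.NormalizeWhitespace().ToFullString();
    }
}
```

Because this works purely on syntax, it would also rewrite string concatenation and produce code that does not compile. Avoiding that would need the semantic model, which this sketch does not touch.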
In contrasting the two methods, I came up with the following. I'm happy to be contradicted or to have different contrasts highlighted.
| | IL | AST |
|---|---|---|
| Language support / reach | All - language agnostic | Limited - targeted language(s) only |
| Relative performance | Slower (maybe) | Faster (maybe) |
| Different skillsets | CIL, reflection emit / mono cecil | Roslyn & AST |
| Specific challenges | Grokking IL and libraries (reflection emit / mono cecil) | Seeding of conditionals in mutant schemata |
| Data space | Larger (maybe) | Smaller (maybe) |
| Junk mutants | Maybe | No |
Common skillsets are:
- C# .NET
- Unit test frameworks - at a minimum NUnit, XUnit, MSTest
- Assembly manipulation libraries (reflection emit, mono cecil)
- Visual Studio plugin libraries
From the discussions, folk seem to have settled upon the AST mutation method without (in my opinion) giving enough consideration to the IL mutation method.
If due diligence determines it's the best method, then fine, but I'd like it not to be the preferred method for any of the following reasons:
- Fear of IL
- Shiny new technology in Roslyn.
- Blinkered commitment to C#
Equivalent mutants are mutants which have identical functionality to the original code, and therefore can never generate a failing unit test.
```csharp
if ( i >= 1 ) {
    return "foo";
}
//...
int i = 2;
if ( i > 1 ) {
    return "foo";
}
```
There is no reliable way of detecting equivalent mutants automatically; in the general case the problem is undecidable.
As Henry has already noted, infinite loop detection is also needed. I have undertaken no research in this area, but would be surprised if this were not already a 'solved problem'.
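One common way to cope with mutants that never terminate is simply to impose a timeout derived from how long the unmutated tests take, and treat anything that overruns as a timed-out mutant. The sketch below assumes the tests run in a child process and that exit code 0 means every test passed. `TimeoutRunner`, the factor of 3 and the one second margin are hypothetical choices, not researched values.

```csharp
using System;
using System.Diagnostics;

public enum MutantRunResult { Survived, Killed, TimedOut }

public static class TimeoutRunner
{
    // Time budget derived from the unmutated run; factor and margin are arbitrary assumptions.
    public static TimeSpan Budget(TimeSpan baseline, double factor = 3.0, int marginMs = 1000)
    {
        return TimeSpan.FromMilliseconds(baseline.TotalMilliseconds * factor + marginMs);
    }

    // Runs a test command in a child process and classifies the outcome.
    public static MutantRunResult Run(string fileName, string arguments, TimeSpan budget)
    {
        var info = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
        using (var process = Process.Start(info))
        {
            if (!process.WaitForExit((int)budget.TotalMilliseconds))
            {
                process.Kill();        // presumed infinite loop: stop the mutant
                process.WaitForExit(); // wait for the process to be torn down
                return MutantRunResult.TimedOut;
            }
            // Assumption: a non-zero exit code means at least one test failed.
            return process.ExitCode == 0 ? MutantRunResult.Survived : MutantRunResult.Killed;
        }
    }
}
```

Running in a separate process matters here, because a runaway test can then be killed outright without taking the mutation tool down with it.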
As Henry also noted, test selection is critical, and he suggested using code coverage to achieve it.
Modern versions of Visual Studio have IntelliSense to determine which functions map to which test. I wonder if this has an API that can be called.
However, my gut feel is that this is something that is going to have to be hand-rolled as well, as it dovetails with the code mutation a little. We already know the point at which we are mutating. So a good approach would be to identify all the mutation points and then seed code coverage recording at those points.
Note that this implies the modifications we have to perform on the code are potentially threefold:
- Function mutation for mutation test.
- Insertion of code coverage recording
- Insertion of schemata mutation switch (if AST/schemata method is adopted)
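As a sketch of the coverage-recording part, the call seeded at each mutation point could be as small as the `Hit` method below. Everything here is hypothetical: the `CoverageRecorder` name, the integer mutation point ids, and the assumption that the test runner sets `CurrentTest` before each test. It is also not thread-safe, so it assumes tests run one at a time.

```csharp
using System.Collections.Generic;

public static class CoverageRecorder
{
    // Mutation point id -> names of the tests that reached it.
    private static readonly Dictionary<int, HashSet<string>> Hits =
        new Dictionary<int, HashSet<string>>();

    // Set by the (hypothetical) test runner before each test starts.
    public static string CurrentTest { get; set; }

    // The call that would be seeded at each mutation point.
    public static void Hit(int mutationPointId)
    {
        if (CurrentTest == null) return; // not running under a test
        if (!Hits.TryGetValue(mutationPointId, out var tests))
        {
            tests = new HashSet<string>();
            Hits[mutationPointId] = tests;
        }
        tests.Add(CurrentTest);
    }

    // Tests worth running against a mutant seeded at this point.
    public static List<string> TestsCovering(int mutationPointId)
    {
        return Hits.TryGetValue(mutationPointId, out var tests)
            ? new List<string>(tests)
            : new List<string>();
    }
}
```

After one instrumented run of the whole suite, `TestsCovering(id)` gives the subset of tests to run for each mutant, and a mutation point with no covering tests can be reported without running anything.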
When I first looked at doing this, it became apparent that there would be a need to directly call unit tests via a publicly exposed API. There are many frameworks out there, but the following I think encompass a good share of what's used:
- NUnit
- XUnit
- MSTest
Bear in mind that in developing a mutation testing system we have to explicitly develop to support a unit test framework.
Say what you like about MSTest, but it's still well used, and I think a necessity to support. And there lies the problem - last time I checked it didn't have a public API. No public API, no way to access test results easily programmatically.
Why's this important? Well, when we run a cycle of mutation tests we want to do the following programmatically:
- Select and run the tests that cover the mutant
- Read the results to check we have a failing test
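Those two steps can be sketched as a small loop. The `runTest` delegate below is a stand-in for whatever a given unit test framework's API offers, which is exactly the part that differs per framework; `MutationCycle` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MutationCycle
{
    // runTest takes (mutantId, testName) and returns true when the test passes.
    public static bool IsKilled(int mutantId, IEnumerable<string> coveringTests,
                                Func<int, string, bool> runTest)
    {
        foreach (var test in coveringTests)
        {
            if (!runTest(mutantId, test)) return true; // a failing test kills the mutant
        }
        return false; // every covering test passed: the mutant survived
    }

    // Fraction of mutants killed; mutants with no covering tests count as survivors.
    public static double Score(IDictionary<int, IList<string>> testsByMutant,
                               Func<int, string, bool> runTest)
    {
        if (testsByMutant.Count == 0) return 0.0;
        int killed = 0;
        foreach (var entry in testsByMutant)
        {
            if (IsKilled(entry.Key, entry.Value, runTest)) killed++;
        }
        return (double)killed / testsByMutant.Count;
    }
}
```

Stopping at the first failing test keeps the cycle short, since one failure is enough to show the mutant was detected.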
It's not an insurmountable problem, but it will complicate things. Two years ago I blogged about this for this very reason:
www.jameswiseman.com/blog/2015/10/13/microsoft-pleeease-expose-your-mstest-api/
It might have changed, of course; I haven't checked.
Whilst early implementations need not consider it, we should be mindful of the ability to integrate into Visual Studio. It'll probably have to be exposed as a bespoke 'TestRunner', and should have a public API that can be utilised seamlessly by anyone wanting to develop a plugin.
.NET Core support is somewhat a moot point for the AST method and, as it turns out, for IL mutation also. .NET Core has the same IL as non-Core .NET.
https://stackoverflow.com/questions/34906969/does-net-core-generate-the-same-il-as-standard-net
Mutation testing is by nature an expensive business from a resource perspective, so we'll have to use every trick in the book to mitigate this when writing the code that implements the mutation system. This includes things like:
- Minimal use of reflection. This means that code architecture approaches like DI should probably be avoided, and any use of 'dynamic' types limited/omitted.
- Minimal use of dynamic - this is converted to much reflection under the hood.
- Minimal use of unmanaged resources.
- Careful use of 'syntactic sugar' features. E.g. auto-properties, async/await, generics, etc.
Online submission is something for further down the line, but it might be nice to have. Consider a cloud-based paid service where people can submit their code and have it analysed for free. For those with large code bases, leveraging the scalability of the cloud might be quite nice.
Contrasting IL and AST:
I imagined:
Just to get a feel for overheads.