
Last active November 18, 2018 10:56

Rehabilitating Mutant Immortals.

One often hears the claim that mutation analysis is unreliable because of the presence of immortal (equivalent) mutants. The problem is that the true mutation score is the ratio between the number of mutants killed and the actual number of mortal mutants. Since there is no general algorithm to determine all immortal mutants from a set of mutants, it is impossible to determine the actual number of mortal mutants from an arbitrary set of mutants. Hence, the mutation score is unreliable as a measure of test suite quality, which is the primary purpose of mutation testing.
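To make the gap concrete, here is a minimal sketch of the two ratios. The function name `mutation_score` and the numbers are hypothetical and chosen only for illustration; in practice the number of immortal mutants is exactly the quantity one cannot compute in general.

```python
def mutation_score(killed: int, total: int, immortal: int = 0) -> float:
    """Ratio of killed mutants to mortal mutants (total minus immortal)."""
    mortal = total - immortal
    if mortal <= 0:
        raise ValueError("no mortal mutants to score against")
    return killed / mortal


# Hypothetical run: 100 mutants, 80 killed, and 10 of the 20 survivors
# happen to be immortal (something we usually cannot know).
reported = mutation_score(killed=80, total=100)               # immortals ignored
true_score = mutation_score(killed=80, total=100, immortal=10)
```

With these made-up numbers the reported score understates the true score, and the size of the error depends on an unknown.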

Further, practitioners often use live mutants to decide how to enhance their test cases: killing a live mutant makes the test suite cover more of the specification. Immortal mutants, by definition, cannot help here because they cannot be killed. Hence, human effort spent on analyzing these mutants is often considered a waste.

Useful Immortal Patterns

Our thesis is that one needs to consider the issue from a more holistic perspective. Mutation analysis should not be considered a way to improve tests alone; rather, it should be regarded as an opportunity to improve both the test cases and the code that they test. We present several patterns where code that produced equivalent mutants was improved by refactoring or rewriting that eliminated those mutants, which resulted in better code. Our experience suggests that these patterns are common enough that human analysis of live mutants -- whether they are immortal or not -- can be worth the investment for a practitioner. We study several patterns below.

Deletion of unreachable code

Unreachable code is one of the simplest sources of equivalent mutants. Such code can often be identified statically and eliminated; here we are concerned with unreachable code that cannot be identified statically. The problem with unreachable code is that it imposes a cognitive load on a new programmer. The programmer still has to understand how the code behaves, because any set of statements represents a vulnerability surface if an attacker is able to make them execute. Hence, removing unreachable code can often simplify the codebase and make it easier to understand and maintain.
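As a small illustration, consider the sketch below. The function `bucket` and its mutant are hypothetical examples, not taken from a real code base. The final `return` can never execute because the value is clamped first, which a purely syntactic check would not flag, so any mutation of that line is immortal.

```python
def bucket(score: int) -> str:
    clamped = max(0, min(score, 100))  # clamped is always in [0, 100]
    if clamped < 50:
        return "low"
    if clamped <= 100:
        return "high"
    return "invalid"  # never reached


def bucket_mutant(score: int) -> str:
    # Same as bucket, with the unreachable return value mutated.
    clamped = max(0, min(score, 100))
    if clamped < 50:
        return "low"
    if clamped <= 100:
        return "high"
    return "unknown"  # mutated line, still never reached


def bucket_refactored(score: int) -> str:
    # Dropping the dead branch removes the immortal mutant with it.
    clamped = max(0, min(score, 100))
    return "low" if clamped < 50 else "high"
```

No test can tell `bucket` from `bucket_mutant`, and the refactored version leaves one less statement for a reader to reason about.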

Deletion of code that does nothing

Another pattern that one often finds is code that performs some computation whose results are discarded at the end. While the computation may have been useful in the past (or may be useful in the future), performing it consumes computational resources. Further, a programmer still needs to understand such code, so it contributes to the cognitive load. Hence, removing it can make the code faster and easier to maintain.
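A minimal sketch of this pattern, using a hypothetical `mean` function over plain numbers: the list of squares is built and then thrown away, so the statement-deletion mutant behaves identically and is immortal.

```python
def mean(values):
    squares = [v * v for v in values]  # computed, but never used
    return sum(values) / len(values)


def mean_mutant(values):
    # Statement-deletion mutant: the discarded computation is gone.
    return sum(values) / len(values)
```

Here the immortal mutant is the better program: adopting it removes both the wasted work and the mutant.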

Optimization branches

A common pattern in performance-oriented programs is a special-case optimization for a frequently occurring condition. For example, multiplication by two can be replaced by a shift. This optimization can take place only if the multiplier is 2. Hence, a program that checks for multiplication by 2 and optimizes it will have an immortal mutant that disables the optimization. While the two programs are equivalent by the usual definitions, we argue that the performance difference should be noticeable and should be verified by a test case. Otherwise, one is relying on knowledge of conditions that is not codified anywhere and may be subject to change (e.g., Y2K). Hence, we argue that optimization immortals are not true immortals, and inspecting them can be worth the effort spent.
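The shift example can be sketched as follows. The names `times` and `times_mutant` are hypothetical, and Python is used only for illustration; we make no claim that the shift is actually faster in Python.

```python
def times(x: int, k: int) -> int:
    if k == 2:
        return x << 1  # special case: multiply by two with a shift
    return x * k


def times_mutant(x: int, k: int) -> int:
    # Mutant: the optimization branch is disabled.
    if False:
        return x << 1
    return x * k
```

The two functions return the same value for every pair of integers, so a purely functional test cannot kill the mutant. Only a test that checks the performance expectation behind the branch could tell them apart.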

Functional idioms

One often finds that a way to eliminate immortal mutants is to rewrite an imperative-style function in a functional idiom. This often reduces the quantity of code involved, reducing both mortal and immortal mutants. The more concise code requires using functions that are harder to replace exactly (because each function does more), and hence results in fewer immortals. Here again, the equivalent mutants are an opportunity to improve the code.
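As an illustration, consider a hypothetical `largest` function written as an explicit loop over a non-empty list of integers. Two common mutations of the loop are equivalent to the original, and both disappear when the loop is replaced by the built-in `max`.

```python
def largest(xs):
    best = xs[0]
    for i in range(0, len(xs)):
        if xs[i] > best:
            best = xs[i]
    return best


def largest_mutant_start(xs):
    # Mutant: loop starts at 1; comparing xs[0] with itself changed nothing.
    best = xs[0]
    for i in range(1, len(xs)):
        if xs[i] > best:
            best = xs[i]
    return best


def largest_mutant_ge(xs):
    # Mutant: > replaced by >=; ties only replace best with an equal value.
    best = xs[0]
    for i in range(0, len(xs)):
        if xs[i] >= best:
            best = xs[i]
    return best


def largest_functional(xs):
    # Functional rewrite: no loop bounds or comparison operator left to mutate.
    return max(xs)
```

For integer inputs, all four functions agree on every non-empty list, so the two mutants are immortal in the imperative version and simply do not exist in the functional one.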

Why a smaller number of mutants is better

The number of mutants reflects the complexity of a piece of code. A reduction in the number of mutants represents a reduction in complexity, because there are fewer places where the code can go wrong (vague -- needs rethinking).

Code Duplication

A common reason for a large number of immortal mutants is duplicated code in the code base. If the original code contained immortal mutants, duplication of it will multiply the number of immortals. Since duplicated code is often indicative of poor code, immortals can often serve as a powerful signal for code quality.

