Created
April 19, 2018 12:15
-
-
Save NorfairKing/86442e7e149a65a90c2e5951fbaa1386 to your computer and use it in GitHub Desktop.
Thoughts about Algebra for QA
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I've been thinking about where one would get started to find problems in code. | |
Ideally I'd come up with some mathematical properties to spot that would be | |
likely to indicate that bugs are more common. | |
Part 1: Possible implementations | |
One idea I've had stems from algebra. It starts with the concept of cardinality. | |
Out of all possible implementations of a solution to a certain problem, there | |
are only few of them that correctly solve the problem. | |
We already know about cardinality of types, so let me illustrate my point with | |
those. | |
If `A` and `B` are types of size `a` and `b`, then `Either A B` has $a + b$ | |
elements (sum type). `(A,B)` has $a * b$ elements (product type). Most | |
significantly: `A -> B` has $b ^ a$ elements. | |
This is, in my estimation, the root cause of bad software: there are | |
exponentially many ways to write the wrong code with respect to the size of the | |
types/problem you are trying to solve. | |
If we could somehow eliminate all the wrong implementations from having a | |
possibilty of existing, then we could increase the likelihood that the code we | |
write is correct. | |
So how do we constrain this number of possible implementations such that only | |
correct implementations remain? | |
The first step would be using correct (small) types. Haskellers love this | |
approach. If you don't allow `null` in the input type of a function, and | |
enforce that constraint with a compiler, then you automatically rule out | |
$(b - 1)$ times the number of remaining implementations. | |
($b ^ (a + 1) - b ^ a = (b-1) b ^ a) | |
To give you an example, the number of possible implementations of a function of | |
type `Ordering -> Ordering` is $27$ whereas the number of possible | |
implementations of a function of type `Maybe Ordering -> Ordering` is `81`. | |
If you don't allow `null` in the output type, and enforce that with a | |
compiler, then you automatically rule out $(b + 1) ^ a - b ^ a$ possible | |
implementations. | |
To go back to the previous example, that's $64$ possible implementations for a | |
function of type `Ordering -> Maybe Ordering`. | |
Remember: for large types this difference is much much bigger. | |
The next step would be to use parametric types. | |
The number of possible implementations of `Maybe Bool -> Maybe Bool` is $27$. | |
This is already a very large number considering that `Bool` is a very small | |
type. | |
The number of possible implementations of `Maybe a -> Maybe a` is only $2$! | |
You can either have `id :: Maybe a -> Maybe a` or `const Nothing :: Maybe a -> | |
Maybe a`. | |
This is because the implementation of a function of this type has to be able to | |
deal with the fact that it cannot inspect the value of the type that `a` will | |
be at a use-site. `Nothing` can never be mapped to anything other than | |
`Nothing`, because to create a `Just` value, you need a value of type `a`. | |
`Just a` can never inspect the value `a` to see what to do next. It must either | |
ignore the value and map to `Nothing`, or it must map to `Just a`. | |
If you look at the previous two paragraphs, you can also immediately see why | |
Python programs tend to be broken. There are about two total 'function's in | |
Python: The function that does nothing, and the identity function. | |
All the others will crash for some input. | |
When you make a function as parametric as possible, you could still be left with | |
an infinite number of possible implementations. | |
Take for example the reverse function. It has type `[a] -> [a]`. | |
Because `[a]` could inhabit a infinite number of values (say when `a ~ ()`), | |
there could be an infinite number of implementations of this function. | |
(You may think that there would be an uncountably infinite number of | |
implementations of this function, but the total number of possible programs is | |
already countable. Not that this really matters.) | |
So we need to find ways to constrain the number of possible implementations | |
further. This is where we turn to tests. | |
If we write a test to specify that `reverse []` must evaluate to `[]`, then we | |
can use this test to enforce that the implementation that we end up with will | |
satisfy this limitation. `NonEmpty a` is not any smaller than `[a]`, so this | |
test will not constrain the number of implementations. | |
HOWEVER, not every implementation is equally likely to be written. | |
A 5000 line implementation is less likely to be written than a 2 line | |
implementation. An implementation that deals with `[]` incorrectly is quite | |
likely, so the `reverse []` test rules out many wrong implementations that are | |
likely to be written. | |
The safety model of a unit test usually assumes that the programmer will not | |
make certain obscure mistakes. For example, when testing the `reverse` function | |
, unit tests would likely not test specifically for `Deadlock` exceptions to | |
be thrown. We assume that programmers may get the `reverse` function wrong, but | |
not that they will get it THAT wrong. | |
This is why unit tests are useful: to rule out many (wrong) implementations | |
that are not unlikely to be written. | |
We can go even further: | |
With property tests, we can constrain the possible implementations to the | |
implementations that satisfy certain properties. There is no hard rule for how | |
many (wrong) implementations you can rule with a property, but properties tend | |
to work very well for this purpose anyway, because they relate to what the | |
programmer means by `correct`. | |
Back to the `reverse` example: If we add a property test that specifies that | |
`reverse (reverse xs)` must equal `xs` for _any_ generated `xs`, we don't | |
constrain the number of possible implementations per-se. | |
However, we greatly constrain the number of possible likely implementations. | |
Indeed, the most likely implementation that we haven't ruled out is `id :: [a] | |
-> [a]`, but the next is the following: | |
``` | |
reverse :: [a] -> [a] | |
reverse xs | |
| odd (length xs) = xs | |
| otherwise = case xs of | |
[] -> [] | |
[_] -> xs -- Never called | |
(x:y:xs') -> y:x:reverse xs' | |
``` | |
This implementation is significantly larger than the correct implementation. | |
The only implementation we have left to rule out is `id :: [a] -> [a]`. | |
Then we have ruled out all wrong implementations that are more likely to be | |
written than the most likely correct implementation. | |
In the case of `reverse`, we can do this with either a single unit test or a | |
comprehensive property test. | |
If we add `reverse [1,4,9] == [9,4,1]`, it becomes _very_ hard to write an | |
implementation that passes these three tests but is still wrong. | |
We could also add the following property: | |
``` | |
reverse (x:xs) = reverse xs ++ [x] | |
``` | |
This property is very significant because it "pins down" the implementations | |
that satisfy it. By itself, it already constrains the possible implementations | |
to only the correct implementation. | |
It would be great if we could have such a property test for every function, but | |
writing this property test requires very deep understanding of what `reverse` | |
means so I would prefer not to assume that we can always do that. | |
This is why I mention this property test last. | |
The strategy of constraining possible implementations also leads to the | |
discovery of a natural stopping point for QA: when we cannot think of any | |
alternative implementations that pass all tests but are still wrong. | |
Part 2: Common mistakes. | |
In the previous section, I completely ignored `undefined :: a` or `unsafeCoerce | |
:: a -> b`. | |
However, these kinds of functions raise a valid concern: given that partial | |
functions exist, do we need to do something different? | |
First: by evaluating a function with _any_ input, we rule out the `undefined` | |
implementation. | |
By evaluating a function with _many random inputs_ (and evaluting the result), | |
we can (probabilistically) rule out many partial implementations. | |
Not every partial code path is equally likely to be taken or to exist. | |
However, we can use a bit of algebra again to predict likely occurrences of | |
partiality. | |
Let's look at a function of type `A -> B` again. Traditional modern maths has | |
defined this to mean that for every `a :: A`, there exists a `b :: B`, but in | |
computer science this is nonsense of course. something something halting | |
problem. But how do we find these `a :: A` values for which there is no such `b | |
:: B` but a crash instead? | |
*This is where I need your help, I have found only one obvious way, but I would | |
be very interested in finding more.* | |
Consider a function `f` of type `A -> B` such that we expect there to be an | |
function `g` of type `B -> A` such that `g (f x)` equals `x` for every `x :: A`. | |
For this to be possible at all, mathematics says that `f` and `g` must be | |
bijective. However, in computer science `f` only has to be injective. | |
Now, we know many examples of such `f` and `g` where `f` is injective and `g` is | |
partial: `parseJSON` and `toJSON` form a nice example. | |
My claim is that whenever you have a pair of functions `f :: A -> B` and | |
`g :: B -> A` where `f` is injective, it is certainly interesting to investigate | |
`g` further. | |
(You can easily see that `f` is injective if we expect a `g` to exist and `B` | |
is larger than (or equal to) `A`.) | |
This is because `g` *must now necessarily be partial*. | |
Deserialisation functions should solve this problem by evaluating to values of | |
type `Parser A` instead of `A` so that they can naturally handle failure. | |
However, we typically find that programmers make assumptions about what the | |
values of type `B` look like when these values are being parsed and these | |
assumptions cause typical mistakes. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment