When I first started looking at ML-style type signatures, I didn't realize the amount of hidden information is contained within them. But by simply asking a few simple questions and remembering that the functions are pure, you can surmise a lot about the internal workings of the function without ever looking at a single line of code.
Let's start off with a simple signature:
abstract : a -> a
abstract
takes something of type a
and returns something of type a
.
So what possible implementations could there be for this. Well, there can be no operation that can be safely done on ANY type since we don't know what type we have at any point. Our functions are pure, so no side effects are allowed. So all this function can do is return whatever was passed in unchanged.
All functions with this signature are equivalent.
Similarly:
abstract2 : a -> b
abstract2
takes an a
and returns a b
.
This is a trick question. This signature is invalid. To understand why, imagine that you're going to write this function. How would you do it?
All you can do is return a
unchanged which means the signature should be the same as abstract
or you can return a constant. In that case, b
should be the type of the constant.
The Type System will complain about this signature depending on your implementation.
Next consider:
abstract3 : number -> number
abstract3
takes a number
and returns a number
. Well, now we know that we can do certain operations on the input, e.g. addition, subtraction, etc. And since numbers are closed under these operations, we can return the result and still return something of type number
.
But we can only do binary operations on the input with a constant. We may add 1 or subtract 10 or multiply by 53.32 or divide by -17. But a lack of a second parameter means that we have to hardcode a constant for the second parameter of a binary operation.
Unary operations, e.g. negate, don't require a second parameter, but they too have their binary equivalent, e.g. a negation is just multiplication by -1. So we are equally restricted.
So abstract3
can only return the number unchanged or perform an operation that is closed, i.e. takes a number or numbers and returns a number. And the other parameters of the operation must be constants.
Let's look at something a little more common.
Take:
common : a -> (a -> b) -> b
common
takes an a
and a function and returns a b
. But not just any function. The function takes an a
and returns a b
.
The function that converts an a
to a b
is what was missing from abstract2
. abstract2
had no way of converting an a
to a b
. But that's not the case here.
At some point, common
must use the function to convert the a
to a b
. Before it converts a
, it can manipulate a
in the same ways that abstract
can. But after that, it must convert it to a b
.
After it converts to a b
, it can manipulate b
in the same ways that abstract
can. But after that, it cannot do any more and must return its results.
Next consider:
common2 : a -> b -> (a -> c) -> (b -> d) -> Thing c d
Using similar logical, we can deduce that common2
uses the (a -> c)
function to convert the a
to a c
and the (b -> d)
function to convert the b
to a d
.
It then wraps it up in a Thing
using the converted a
, now c
, as the first parameter and the converted b
, now d
, as the second.
The usual single type manipulation can occur before and after the conversions, but not much else.
Let's look at signatures with concrete types, e.g. String and Int.
concrete : String -> String
concrete
can do many things to this string. It can uppercase it. It can reverse it. It can return a constant string.
It can also take a portion of the string. For example, it can take the first 5 characters of the string and return them. But that portion must be a constant index and length.
What about:
concrete2 : String -> Int
Notice that concrete2
is very similar to abstract2
. Where abstract2
is invalid, concrete2
is not since the types are known and therefore the function implementor can do operations on those types.
There are only a few reasonable implementations for concrete2
that come to mind:
- count occurrences of all or certain characters or strings
- convert the string to its numeric equivalent
- calculate the string's hash
This pure function really cannot do much more with the String
such that it returns an Int
.
More possibilities with:
concrete3 : Int -> String -> String
There are only so many ways in which the Int
parameter can be used relative to the String
parameter:
- as an index into the string
- as a count of characters or substrings
- as a value to be converted to a string
If concrete3
uses its Int
parameter in these ways, then it's reasonable to assume that the function adds or subtracts from the String
parameter in some defined way.
That's about the best we can do without knowing more.
Simple Abstract Type Signatures can tell us plenty because they are very restrictive. And as they become more concrete, the number of possible implementations grows rapidly.
But in practice, we can surmise much of the implementation details from just the signature alone. This can be quite helpful when you're reading less than desirable documentation, which is almost always.