RocketPuppy/types_AND_tests.md

## types_AND_tests.md

      
    Raw
  

              types_AND_tests.md
            
          
    Uncle Bob, you recently wrote a post titled Types And Tests.
It generated quite a lot of buzz. It was also, in my opinion, a bit naive
regarding the current state of affairs in type systems. This made it difficult
for me to understand exactly what your points were and I would like to understand
those better. To help clarify those I'm going to provide counter examples to
your claims and propose other questions that both might help me better understand
what you mean. Hopefully someone else finds this useful as well. I'll be using
Ruby and Haskell for most code snippets as they are representative of the two
sides and I'm most familiar with them right now.


No, types are not tests. Type systems are not tests. Type checking is not
testing. Here’s why.

This quote wasn't really a comprehensive point, but I want to take it as an
opportunity to point out the difference between types and tests to my
understanding. After all it's important to make sure that we're arguing from the
same base, otherwise we'll just talk past each other.
I agree that types are not tests. They both are tools used to provide guarantees
and place constraints on how code can change. The primary difference is in how
they do this. A test takes a single, representative example, initializes the
system or performs an operation with it, and then checks that the final state
conforms to a set of assertions. Here's a trivial test:
describe 'add' do
  it 'returns a number when given two numbers' do
    expect((1 + 2).class).to be Number
  end
end
The test says that for the numbers 1 and 2, when I add them together I get a
number back. This is sometimes a reasonable test to write, especially in the
presence of languages like Javascript which will do implicit coercion from
numeric values to string values. Notice that this doesn't say anything about
what happens if I pass two different numbers to +, or if I pass a number and a
string.
Where a test speaks about specific, representative examples, a type speaks about
sets of values with certain properties. A type can speak about all numbers, or
all strings, whereas a test cannot because those sets of values are infinite.
A type can also speak about all Foo, where Foo is a domain object in our
domain. This immediately lets us make much broader constraints that make our code
safer to change. Here's an example type:
add :: Num a => a -> a -> a
Here this says that the add function must take two Numeric values (these can
be integers, floats, etc. as long as they are both the same type) and return a
Numeric value of the same type. That is, this will add floats, integers,
rationals, etc. together but won't add two values of different types. This is
different from the Ruby version, which will do some implicit casting. It  is
primarily just a design choice to not do any implicit casting of numeric values,
and not necessarily a shortcoming of the type system (see
this article on the Haskell
wiki).


Types do not specify behavior. Types are constraints placed, by the
programmer, upon the textual elements of the program. Those constraints reduce
the number of ways that different parts of the program text can refer to each
other.

This is the paragraph that I, and probably many others, found most
objectionable. In the second part I think you're getting at the idea that type
checkers are static and all they can operate on is the text of the program at
rest. This is true. However they do not operate on the raw textual elements of
the program at all, but on the syntactical elements. The phrasing as written
seems to relegate type checkers to nothing more than glorified spell checkers,
which is far from their true nature.
The first portion of this paragraph I also found objectionable. It says "Types
do not specify behavior." I've heard this word "behavior" before in the context
of "Objects have behavior", but I haven't heard a consistent definition of it
yet. I'm curious what your definition is. I want to give an example of what I
find is a reasonable test to write and its equivalent type.
describe 'Order#purchase' do
  it "raises an error when the order isn't confirmed" do
    order = Order.new(price: 3.00, status: :unconfirmed)

    expect { order.purchase }.to raise UnconfirmedOrder
  end

  it "changes the status of the order to purchased" do
    order = Order.new(price: 3.00, status: :confirmed)

    expect(order.purchased?).to be false

    order.purchase

    expect(order.purchased?).to be true
  end
end
This is a test that specifies the behavior of some method purchase on some
object Order. It specifies two of what I consider behaviors of the method:
that only confirmed orders can be purchased, and purchasing a confirmed order
changes its status to be purchased.
Here's the corresponding type definitions in Haskell.
data Order status = Order { price :: Integer }
data PurchasedStatus
data ConfirmedStatus

purchase :: Order ConfirmedStatus -> Order PurchasedStatus
I've defined three datatypes here. The first is what an Order looks like. It
just has a price which is an Integer. It also is parameterized on a status
type.  The second two are empty types. That means you can't construct a value of
those types, but you can use it in a type signature as a tag. These are what I
use to parameterize Order in the signature of purchase. Read in the language
of types, the signature of purchase says two things. First it says that I
cannot call purchase with anything but an Order that has been confirmed. This
is equivalent to the first test above. The second thing the signature says is
that I can treat the Order that is returned as if it has been purchased. This
is equivalent to the second test above. Note again that the tests are only
guaranteed to apply a constraint for those specific orders that were created in
the test, whereas the types apply the constraint for all possible order
values. This technique used in the Haskell example, if you're interested, is
called "phantom types".
I'll give a fair shake to the Ruby type system in that it can also achieve some
of the same guarantees as we do with the Haskell type system. See:
class ConfirmedOrder
  def purchase
    # do magic
    return PurchasedOrder.new
  end
end

class PurchasedOrder
end
In the same manner that I said purchase can only be called on a confirmed order
in the Haskell code, now I can only call purchase on a confirmed order in the
Ruby code.
Note that I did not specify in either tests or types what exactly happens when
an order is purchased. It might be that a credit card is charged or an email is
sent. Those are both things that we would test regardless of the type system.
My question to you, Uncle Bob, is if this is an example of behavior that you
would test? If it is, I think you'll be forced to agree that types can specify
behavior, and types can replace some tests. If it is not I would like to
know why, and what behavior you would then test.


You might therefore say that the type system is a kind of “test” that fails
for all inappropriate invocations of f. I might concede that point except for
one thing – the way f is called has nothing to do with the required behavior
of the system. Rather it is a test of an arbitrary constraint imposed by the
programmer. A constraint that was likely over specified from the point of view
of the system requirements.

I'd like to ask what an inappropriate invocation of f is. For example, if f
is Order#purchase as above then is this an inappropriate invocation?
unconfirmed_order = Order.unconfirmed.find(params[:id])
unconfirmed_order.purchase
I think this is inappropriate, because I already established above that
Order#purchase should not be called on unconfirmed orders. If it is not
inappropriate, why is that? Let's assume it is inappropriate. Doesn't that
invocation impede the required behavior of the system? As another example, is
this inappropriate:
credit_card = CreditCard.new(number: nil, expiration: nil, name: nil, cvv: nil)
credit_card.charge(5.00)
Assume that the number, expiration, name and nil parameters are
required to make a charge. This seems totally inappropriate to me, and also
contrary to the desired behavior of the system.
Finally I'd like to propose that the constraints specified by tests and the
constraints specified by types overlap. So what makes the constraint specified
by a type any more arbitrary and over specified than the constraint specified by
a test?


Now this is no small thing. Writing program text that is internally consistent
is pretty important. Inconsistencies and ambiguities in program text can lead
to misbehaviors of all kinds. So I don’t want to downplay this at all.


On the other hand, internal self-consistency does not mean the program exhibits
the correct behavior. Behavior and self-consistency are orthogonal concepts.
Well behaved programs can be, and have been, written in languages with high
ambiguity and low internal consistency. Badly behaved programs have been written
in languages that are deeply self-consistent and tolerate few ambiguities.

There's a contradiction in these two paragraphs. First you state that
"inconsistencies and ambiguities in program text can lead to misbehaviors of all
kinds." Then you state that "Behavior and self-consistency are orthogonal
concepts." You can't have both. If they were truly orthogonal then
inconsistencies could not lead to any behavioral changes, let alone incorrect
ones! I suspect that you are actually speaking about two different perspectives
of behavior here: that of behavior specified by the programmer; and that of
behavior specified by the user. I want to be clear that these are two different
sets of behavior.
The programmer's job then is to translate the behavior specified by the user
into a program that has the same behavior. That translation cannot be helped by
types or tests because it happens before that step. However when it comes time
to write the program the programmer must specify the behavior they want to the
computer. This is where types and tests can help. A self-consistent specification
of behavior is one that will not misbehave. That is to say, it will do exactly
as specified. It's up to the programmer to specify it correctly.  Finally as to
the last part of the second paragraph, a language can help, hinder, or ignore
the programmer in writing a self-consistent program. If the programmer has a
consistent mental model of the program then it will itself be consistent
regardless of the language.


Clearly, every language chooses the latter option. No language forces the
program text to be absolutely unambiguous and self consistent. Indeed, such a
language would likely be impossible to create. And even if it weren’t, it
would likely be impossible to use. Absolute precision and consistency has
never been, nor should it ever be, the goal.

I'm unsure what you're trying to say here. It seems to me that one of the most
desirable properties in a programming language is to be unambiguous and
self-consistent. A language that does not have those properties would be
difficult to use because it would be difficult to predict. You can see these
properties being violated in undefined, implementation defined, or confusing and
unpredictable behavior in languages. It's why Javascript and PHP are so often
the target of jokes and frustration.


The problem with increasing the level of precision and internal self
consistency, is that it implies an increase in the number of constraints. But
constraints need to be specified; and specification requires notation.
Therefore as the number of constraints grows, so does the complexity of the
notation. The syntax and the semantics of a language grows as a function of
the internal self-consistency and specificity of the language.

This paragraph made me think you were arguing against specific syntax for
specifying types, not against static typing itself. Which one is it?  There are
strong, statically typed languages with minimal syntax for types.


Do languages like this exist? No; but the more type-safe a language is, the
more internally consistent and specific it forces the programmers to be, the
more it approaches that ultimate TDP condition.

What do you mean by "internally consistent and specific" here? If you mean that
every type must be specified precisely as a primitive or user defined type then
I'd have to say this paragraph is incorrect. Polymorphism can be worked into a
strongly type-safe language and allows programmers to be generic to the point of
ridiculousness. Haskell is living proof of this.
There's a relevant digression here in that when you make a function or method
more generic you must also make its implementation more specific. More generic
means fewer assumptions and with fewer assumptions to rely on there are less
ways to implement a solution. This can be counter-intuitive, but an example
might help. Consider the following functions:
id :: a -> a

not :: Bool -> Bool

foo :: Int -> Int
The id function must be able to operate on any value, and return any value.
That is to say it cannot make any assumptions about its inputs or outputs.
Therefore it cannot do anything but return its input, unchanged. It has one
possible implementation:
id x = x
The not function takes a boolean, which I know must be one of two values. It
also returns a boolean. I can easily enumerate all the possible
implementations of not:
-- Implementation 1
not True = True
not False = False

-- Implementation 2
not True = False
not False = True

-- Implementation 3
not True = True
not False = True

-- Implementation 4
not True = False
not False = False
The foo function operates on integers which I know are infinitely many.
Therefore there are infinitely many possible implementations of foo.
To bring it back to the point at hand, a type-safe language does not mean a
specific language. Polymorphism is vitally important, and perhaps more useful in
a strong type-safe language. It allows for the observation of assumptions (or
the lack thereof) which allows for fewer possible implementations, less
ways for functions to go wrong, and thus less required test cases.


Every step down that path increases the difficulty of using and maintaining
the language. Every step down that path forces users of the language to get
their type models “right” up front; because changing them later is too
expensive. Every step down that path forces us back into the regime of Big
Design Up Front.

In what way are you saying languages get more expensive? If you mean that a
single change requires more subsequent changes to remain consistent then I'll
agree with you. However every place where those changes need to be made is
pointed out to you by the compiler. So the programmer has to spend very little
time finding those places. Self-consistency must be maintained in general, so
those changes would need to made whether or not the language forced you to make
them. I know I would rather free up that mental capacity for more important
work.

This was quite a long rebuttal, and I've made quite a few points along the way.
I'll summarize them here:

Types can specify behavior
Types can provide stronger guarantees than tests
Strong, statically-typed languages can exist with minimal syntax for types
Self-consistency is important for programmer specified behavior to be correct,
but it is not enough to determine if user-specified behavior is correct.
Types are not a replacement for all tests, and both can be used together to
create something more than the sum of their parts.
A static type-checker makes a cost trade-off, trading off mental energy to
maintain self-consistency for manual work to maintain self-consistency
A program text is desired to be unambiguous and consistent in order to allow
the programmer to predict its behavior.
Finally, types are not a replacement for tests (yet), but they can be used to
dramatically reduce the number of tests required to have confidence in a
system.