Skip to content

Instantly share code, notes, and snippets.

@RocketPuppy
Created January 24, 2017 03:26
Show Gist options
  • Save RocketPuppy/52e0f5b22eb62a4587b5054a31d0b3ad to your computer and use it in GitHub Desktop.
Save RocketPuppy/52e0f5b22eb62a4587b5054a31d0b3ad to your computer and use it in GitHub Desktop.

Uncle Bob, you recently wrote a post titled Types And Tests. It generated quite a lot of buzz. It was also, in my opinion, a bit naive regarding the current state of affairs in type systems. This made it difficult for me to understand exactly what your points were and I would like to understand those better. To help clarify those I'm going to provide counter examples to your claims and propose other questions that both might help me better understand what you mean. Hopefully someone else finds this useful as well. I'll be using Ruby and Haskell for most code snippets as they are representative of the two sides and I'm most familiar with them right now.


No, types are not tests. Type systems are not tests. Type checking is not testing. Here’s why.

This quote wasn't really a comprehensive point, but I want to take it as an opportunity to point out the difference between types and tests to my understanding. After all it's important to make sure that we're arguing from the same base, otherwise we'll just talk past each other.

I agree that types are not tests. They both are tools used to provide guarantees and place constraints on how code can change. The primary difference is in how they do this. A test takes a single, representative example, initializes the system or performs an operation with it, and then checks that the final state conforms to a set of assertions. Here's a trivial test:

describe 'add' do
  it 'returns a number when given two numbers' do
    expect((1 + 2).class).to be Number
  end
end

The test says that for the numbers 1 and 2, when I add them together I get a number back. This is sometimes a reasonable test to write, especially in the presence of languages like Javascript which will do implicit coercion from numeric values to string values. Notice that this doesn't say anything about what happens if I pass two different numbers to +, or if I pass a number and a string.

Where a test speaks about specific, representative examples, a type speaks about sets of values with certain properties. A type can speak about all numbers, or all strings, whereas a test cannot because those sets of values are infinite. A type can also speak about all Foo, where Foo is a domain object in our domain. This immediately lets us make much broader constraints that make our code safer to change. Here's an example type:

add :: Num a => a -> a -> a

Here this says that the add function must take two Numeric values (these can be integers, floats, etc. as long as they are both the same type) and return a Numeric value of the same type. That is, this will add floats, integers, rationals, etc. together but won't add two values of different types. This is different from the Ruby version, which will do some implicit casting. It is primarily just a design choice to not do any implicit casting of numeric values, and not necessarily a shortcoming of the type system (see this article on the Haskell wiki).


Types do not specify behavior. Types are constraints placed, by the programmer, upon the textual elements of the program. Those constraints reduce the number of ways that different parts of the program text can refer to each other.

This is the paragraph that I, and probably many others, found most objectionable. In the second part I think you're getting at the idea that type checkers are static and all they can operate on is the text of the program at rest. This is true. However they do not operate on the raw textual elements of the program at all, but on the syntactical elements. The phrasing as written seems to relegate type checkers to nothing more than glorified spell checkers, which is far from their true nature.

The first portion of this paragraph I also found objectionable. It says "Types do not specify behavior." I've heard this word "behavior" before in the context of "Objects have behavior", but I haven't heard a consistent definition of it yet. I'm curious what your definition is. I want to give an example of what I find is a reasonable test to write and its equivalent type.

describe 'Order#purchase' do
  it "raises an error when the order isn't confirmed" do
    order = Order.new(price: 3.00, status: :unconfirmed)

    expect { order.purchase }.to raise UnconfirmedOrder
  end

  it "changes the status of the order to purchased" do
    order = Order.new(price: 3.00, status: :confirmed)

    expect(order.purchased?).to be false

    order.purchase

    expect(order.purchased?).to be true
  end
end

This is a test that specifies the behavior of some method purchase on some object Order. It specifies two of what I consider behaviors of the method: that only confirmed orders can be purchased, and purchasing a confirmed order changes its status to be purchased.

Here's the corresponding type definitions in Haskell.

data Order status = Order { price :: Integer }
data PurchasedStatus
data ConfirmedStatus

purchase :: Order ConfirmedStatus -> Order PurchasedStatus

I've defined three datatypes here. The first is what an Order looks like. It just has a price which is an Integer. It also is parameterized on a status type. The second two are empty types. That means you can't construct a value of those types, but you can use it in a type signature as a tag. These are what I use to parameterize Order in the signature of purchase. Read in the language of types, the signature of purchase says two things. First it says that I cannot call purchase with anything but an Order that has been confirmed. This is equivalent to the first test above. The second thing the signature says is that I can treat the Order that is returned as if it has been purchased. This is equivalent to the second test above. Note again that the tests are only guaranteed to apply a constraint for those specific orders that were created in the test, whereas the types apply the constraint for all possible order values. This technique used in the Haskell example, if you're interested, is called "phantom types".

I'll give a fair shake to the Ruby type system in that it can also achieve some of the same guarantees as we do with the Haskell type system. See:

class ConfirmedOrder
  def purchase
    # do magic
    return PurchasedOrder.new
  end
end

class PurchasedOrder
end

In the same manner that I said purchase can only be called on a confirmed order in the Haskell code, now I can only call purchase on a confirmed order in the Ruby code.

Note that I did not specify in either tests or types what exactly happens when an order is purchased. It might be that a credit card is charged or an email is sent. Those are both things that we would test regardless of the type system.

My question to you, Uncle Bob, is if this is an example of behavior that you would test? If it is, I think you'll be forced to agree that types can specify behavior, and types can replace some tests. If it is not I would like to know why, and what behavior you would then test.


You might therefore say that the type system is a kind of “test” that fails for all inappropriate invocations of f. I might concede that point except for one thing – the way f is called has nothing to do with the required behavior of the system. Rather it is a test of an arbitrary constraint imposed by the programmer. A constraint that was likely over specified from the point of view of the system requirements.

I'd like to ask what an inappropriate invocation of f is. For example, if f is Order#purchase as above then is this an inappropriate invocation?

unconfirmed_order = Order.unconfirmed.find(params[:id])
unconfirmed_order.purchase

I think this is inappropriate, because I already established above that Order#purchase should not be called on unconfirmed orders. If it is not inappropriate, why is that? Let's assume it is inappropriate. Doesn't that invocation impede the required behavior of the system? As another example, is this inappropriate:

credit_card = CreditCard.new(number: nil, expiration: nil, name: nil, cvv: nil)
credit_card.charge(5.00)

Assume that the number, expiration, name and nil parameters are required to make a charge. This seems totally inappropriate to me, and also contrary to the desired behavior of the system.

Finally I'd like to propose that the constraints specified by tests and the constraints specified by types overlap. So what makes the constraint specified by a type any more arbitrary and over specified than the constraint specified by a test?


Now this is no small thing. Writing program text that is internally consistent is pretty important. Inconsistencies and ambiguities in program text can lead to misbehaviors of all kinds. So I don’t want to downplay this at all.

On the other hand, internal self-consistency does not mean the program exhibits the correct behavior. Behavior and self-consistency are orthogonal concepts. Well behaved programs can be, and have been, written in languages with high ambiguity and low internal consistency. Badly behaved programs have been written in languages that are deeply self-consistent and tolerate few ambiguities.

There's a contradiction in these two paragraphs. First you state that "inconsistencies and ambiguities in program text can lead to misbehaviors of all kinds." Then you state that "Behavior and self-consistency are orthogonal concepts." You can't have both. If they were truly orthogonal then inconsistencies could not lead to any behavioral changes, let alone incorrect ones! I suspect that you are actually speaking about two different perspectives of behavior here: that of behavior specified by the programmer; and that of behavior specified by the user. I want to be clear that these are two different sets of behavior.

The programmer's job then is to translate the behavior specified by the user into a program that has the same behavior. That translation cannot be helped by types or tests because it happens before that step. However when it comes time to write the program the programmer must specify the behavior they want to the computer. This is where types and tests can help. A self-consistent specification of behavior is one that will not misbehave. That is to say, it will do exactly as specified. It's up to the programmer to specify it correctly. Finally as to the last part of the second paragraph, a language can help, hinder, or ignore the programmer in writing a self-consistent program. If the programmer has a consistent mental model of the program then it will itself be consistent regardless of the language.


Clearly, every language chooses the latter option. No language forces the program text to be absolutely unambiguous and self consistent. Indeed, such a language would likely be impossible to create. And even if it weren’t, it would likely be impossible to use. Absolute precision and consistency has never been, nor should it ever be, the goal.

I'm unsure what you're trying to say here. It seems to me that one of the most desirable properties in a programming language is to be unambiguous and self-consistent. A language that does not have those properties would be difficult to use because it would be difficult to predict. You can see these properties being violated in undefined, implementation defined, or confusing and unpredictable behavior in languages. It's why Javascript and PHP are so often the target of jokes and frustration.


The problem with increasing the level of precision and internal self consistency, is that it implies an increase in the number of constraints. But constraints need to be specified; and specification requires notation. Therefore as the number of constraints grows, so does the complexity of the notation. The syntax and the semantics of a language grows as a function of the internal self-consistency and specificity of the language.

This paragraph made me think you were arguing against specific syntax for specifying types, not against static typing itself. Which one is it? There are strong, statically typed languages with minimal syntax for types.


Do languages like this exist? No; but the more type-safe a language is, the more internally consistent and specific it forces the programmers to be, the more it approaches that ultimate TDP condition.

What do you mean by "internally consistent and specific" here? If you mean that every type must be specified precisely as a primitive or user defined type then I'd have to say this paragraph is incorrect. Polymorphism can be worked into a strongly type-safe language and allows programmers to be generic to the point of ridiculousness. Haskell is living proof of this.

There's a relevant digression here in that when you make a function or method more generic you must also make its implementation more specific. More generic means fewer assumptions and with fewer assumptions to rely on there are less ways to implement a solution. This can be counter-intuitive, but an example might help. Consider the following functions:

id :: a -> a

not :: Bool -> Bool

foo :: Int -> Int

The id function must be able to operate on any value, and return any value. That is to say it cannot make any assumptions about its inputs or outputs. Therefore it cannot do anything but return its input, unchanged. It has one possible implementation:

id x = x

The not function takes a boolean, which I know must be one of two values. It also returns a boolean. I can easily enumerate all the possible implementations of not:

-- Implementation 1
not True = True
not False = False

-- Implementation 2
not True = False
not False = True

-- Implementation 3
not True = True
not False = True

-- Implementation 4
not True = False
not False = False

The foo function operates on integers which I know are infinitely many. Therefore there are infinitely many possible implementations of foo.

To bring it back to the point at hand, a type-safe language does not mean a specific language. Polymorphism is vitally important, and perhaps more useful in a strong type-safe language. It allows for the observation of assumptions (or the lack thereof) which allows for fewer possible implementations, less ways for functions to go wrong, and thus less required test cases.


Every step down that path increases the difficulty of using and maintaining the language. Every step down that path forces users of the language to get their type models “right” up front; because changing them later is too expensive. Every step down that path forces us back into the regime of Big Design Up Front.

In what way are you saying languages get more expensive? If you mean that a single change requires more subsequent changes to remain consistent then I'll agree with you. However every place where those changes need to be made is pointed out to you by the compiler. So the programmer has to spend very little time finding those places. Self-consistency must be maintained in general, so those changes would need to made whether or not the language forced you to make them. I know I would rather free up that mental capacity for more important work.


This was quite a long rebuttal, and I've made quite a few points along the way. I'll summarize them here:

  • Types can specify behavior
  • Types can provide stronger guarantees than tests
  • Strong, statically-typed languages can exist with minimal syntax for types
  • Self-consistency is important for programmer specified behavior to be correct, but it is not enough to determine if user-specified behavior is correct.
  • Types are not a replacement for all tests, and both can be used together to create something more than the sum of their parts.
  • A static type-checker makes a cost trade-off, trading off mental energy to maintain self-consistency for manual work to maintain self-consistency
  • A program text is desired to be unambiguous and consistent in order to allow the programmer to predict its behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment