csasbach/testing.md

## testing.md

      
    Testing

Why?


To test your code, your code must first be testable.  If you write testable code from the beginning, you will be writing well-designed code, or at least better-designed code than you would otherwise have written.  If you are adding tests to an existing code base, you will be led by the hand into precisely the refactoring that code base needs.  Testable code is, by its nature, loosely coupled code, and nearly all of the important design principles are geared towards keeping your code loosely coupled.  If you focused only on making your tests easy to write and paid no attention to design principles at all, you would still arrive at a pretty good design nearly by accident.
The more loosely coupled your code is, the easier it is to change.  The first virtue of good code is changeability; usability comes second.  Code that is usable today but cannot easily be changed may not be usable tomorrow, and will only be made usable again with significant effort.  Code that is easily changeable can always be changed into code that is usable with no more effort than that task legitimately requires.  To use the technical debt metaphor: tested code is an investment, untested code is a loan.  If you are always taking out loans and never making investments, you will end up working very hard and long hours to stay solvent.
Having fast and thorough unit test coverage helps you code faster because you can trust your baseline, make small changes, see what impact those changes have made (Have any tests broken?  Has code coverage decreased?), fix or write new tests until everything is green and coverage is as high as possible again, and repeat.  This is your personal feedback loop.  If you keep it fast and keep it thorough, you can move quickly and with confidence from commit to commit.
Having true unit tests (if a unit test has an external dependency then it's an integration test masquerading as a unit test) that run on ephemeral architecture (build agents) in your CI/CD pipeline protects you from the classic 'it worked on my machine' blunder.  If there is some special local environmental condition that must be true in order for your application to execute without run-time errors then it will be found on your build agent.  If you have infrastructure as code to build that build agent, then these dependencies are now self-documenting, rather than tribal knowledge that lives on a handful of developer machines.  This is the team's feedback loop.  If it is kept fast and thorough, then the team can move quickly and with confidence from PR merge to PR merge.
Having integration tests in each environment to which you deploy helps you detect flaws in your architectural design, configuration, package management, tooling, third party integrations, etc. (all of the things that live outside of your code that your application needs to function) as early and as quickly as possible.  Integration tests are inherently slow and unreliable.  Therefore, they must be kept few.  Test each integration with only one test.  Now is not the time for exercising permutations; do that in unit tests with mocks.  This is your release feedback loop.  If it is kept fast and thorough, then you will be able to deliver value to your customers quickly and with confidence from version to version.

Links

Rather than provide lengthy guidelines on testing here in this document, I urge you to consume the materials below, particularly 'The Practical Test Pyramid'.  These resources say all I could hope to, and more, on how to properly implement tests in the general sense.  More specific guidelines should be found in the language-specific documentation.
Magic Tricks of Testing (Sandi Metz)
The Practical Test Pyramid (Ham Vocke)
Test Sizes (Simon Stewart)
Code Coverage vs. Test Coverage (Shreya Bose)

Good Things for a Test to Have


Structure

Giving your tests structure makes them more readable and brings discipline to how you write them.  Break test method implementations into these three sections:

ARRANGE (Given)
This section should contain the things you are doing to set up the preconditions for the test and should not exercise the code under test or make any assertions.
ACT (When)
This section should ONLY exercise the code under test.  It should not perform any actions on other code or make any assertions.
ASSERT (Then)
This section should make only assertions.  No further actions should be performed on any code outside of the test itself.
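The three sections above can be sketched as follows. This is a minimal illustration in Python, not a prescribed implementation; the `Cart` class and the test name are hypothetical and exist only to give the test something to exercise.

```python
class Cart:
    """Hypothetical code under test: a tiny shopping cart."""

    def __init__(self, items):
        # items is a list of (name, price) tuples
        self.items = list(items)

    def total(self):
        return sum(price for _, price in self.items)


def test_total_sums_item_prices():
    # ARRANGE (Given): set up preconditions only, no calls to total().
    cart = Cart([("apple", 3), ("pear", 4)])

    # ACT (When): exercise the code under test and nothing else.
    result = cart.total()

    # ASSERT (Then): assertions only.
    assert result == 7
```

Labeling the sections with comments keeps the discipline visible: if a line does not clearly belong under one of the three labels, that is usually a hint the test is doing too much.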


Exceptions to the rule

Sometimes you will need to combine sections.  For example, ACT and ASSERT are combined when you execute code that is expected to throw an exception and assert on that exception in the same step.
Sometimes you are declaring functions in the ARRANGE that will be executed during the ACT or ASSERT.  Try labeling the sections of these functions accordingly.
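As a sketch of the combined ACT + ASSERT case, here is how it might look with Python's standard `unittest` module. The `parse_age` function is a hypothetical stand-in for the code under test.

```python
import unittest


def parse_age(text):
    """Hypothetical code under test: parses a non-negative age."""
    value = int(text)
    if value < 0:
        raise ValueError("age must not be negative")
    return value


class ParseAgeTest(unittest.TestCase):
    def test_negative_age_is_rejected(self):
        # ARRANGE
        bad_input = "-5"

        # ACT + ASSERT: the call and the expectation that it raises
        # cannot be separated, so the two sections are combined.
        with self.assertRaises(ValueError) as ctx:
            parse_age(bad_input)

        # ASSERT: further checks on the captured exception.
        self.assertIn("negative", str(ctx.exception))
```

The combined section is labeled as such, so a reader can still see where the code under test is exercised.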


A maximum duration

Validates an acceptable performance baseline for the code under test.  Your tests should fail if you have introduced a change that has made this code run slower.
Collectively, guarantees that your test suite will complete in an acceptable amount of time.  This ensures your feedback loops remain healthy.
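One way a maximum duration could be enforced is to time the ACT section and assert against a budget. This is only a sketch: `build_index` is a hypothetical function under test, and the `MAX_SECONDS` budget is an assumed value that you would tune to your own baseline and build agents.

```python
import time

MAX_SECONDS = 0.5  # assumed budget; tune to your own performance baseline


def build_index(words):
    """Hypothetical code under test: maps each word to its positions."""
    index = {}
    for position, word in enumerate(words):
        index.setdefault(word, []).append(position)
    return index


def test_build_index_stays_within_budget():
    # ARRANGE
    words = ["alpha", "beta", "gamma"] * 1000

    # ACT: time only the code under test, not the arrangement.
    started = time.perf_counter()
    build_index(words)
    elapsed = time.perf_counter() - started

    # ASSERT
    assert elapsed < MAX_SECONDS, f"took {elapsed:.3f}s, budget is {MAX_SECONDS}s"
```

Many test runners also offer per-test timeouts through configuration or plugins, which may be a better fit than hand-rolled timing.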


The right assertions/expectations

As few assertions as possible, preferably one

Easier to troubleshoot and fix
More likely to have a failure message that’s accurate


Don’t require more than absolutely necessary

If you are testing a contract, assert on request/call/response/return structure, don’t assert on specific data.
If you only need to know that something was called, then assert that it was called, not what the system does as a consequence.
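Both points can be sketched with Python's standard `unittest.mock`. The `register_user` function and its `mailer` collaborator are hypothetical; the first test asserts only that a call happened, and the second asserts only on the shape of the return value, not on specific data.

```python
from unittest.mock import Mock


def register_user(email, mailer):
    """Hypothetical code under test: registers a user and notifies a mailer."""
    mailer.send_welcome(email)
    return {"id": 42, "email": email}


def test_register_user_notifies_mailer():
    # ARRANGE
    mailer = Mock()

    # ACT
    register_user("a@example.com", mailer)

    # ASSERT: only that the collaborator was called,
    # not what the system does as a consequence.
    mailer.send_welcome.assert_called_once()


def test_register_user_returns_contract_shape():
    # ARRANGE
    mailer = Mock()

    # ACT
    result = register_user("a@example.com", mailer)

    # ASSERT: the structure of the response, not its specific values.
    assert set(result) == {"id", "email"}
```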


A way to efficiently handle large numbers of permutations

Parameterize your test method, or a method run directly by a test method.
Support passing in assertions/expectations as test parameters
Arrange what will not change once and only once at the top of your test method or in the setup of your test suite.
Arrange a map of test params indexed by a test case name somewhere in scope just outside or inside your test method.
Have your test method iterate over your map of test cases.
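The steps above could look something like this in Python. The `slugify` function and the case names are hypothetical; the point is the map of named cases, with the expectation passed in as a parameter alongside the input.

```python
def slugify(title):
    """Hypothetical code under test: lowercases and joins words with dashes."""
    return "-".join(title.lower().split())


# Map of test params indexed by test case name.
# The expectation is itself a test parameter.
CASES = {
    "single word": {"title": "Hello", "expected": "hello"},
    "multiple words": {"title": "Hello Big World", "expected": "hello-big-world"},
    "extra whitespace": {"title": "  Hello   World ", "expected": "hello-world"},
    "empty string": {"title": "", "expected": ""},
}


def test_slugify_cases():
    failures = []
    for name, case in CASES.items():
        # ACT
        actual = slugify(case["title"])

        # Collect mismatches so one failing case does not hide the others.
        if actual != case["expected"]:
            failures.append(f"{name}: expected {case['expected']!r}, got {actual!r}")

    # ASSERT
    assert not failures, "\n".join(failures)
```

Indexing the cases by name means a failure message identifies the permutation that broke. Most test frameworks also provide their own parameterization support, which is usually preferable when available.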


Meaningful output printed to the console

This is going to help you see what your test actually did, not just that it passed.  You may find that your passing test is doing something bad that you weren’t testing for.  You may see something wrong in a failing test right away in the output without having to step through the debugger, isn’t that nice?
You are going to know immediately when you are having problems with asynchronous race conditions when you see the chronologically undesirable output displayed.  Without this verbose test output, these kinds of problems can sleep in your code.


Simple Arrangement

As few dependencies to instantiate as possible.  Even if these are mocked, having too many is a code smell.  This is a problem with the design of your other code, not the test itself.  Ideally, you fix the code under test so the test can be better.
As few steps as possible.  You shouldn’t need to simulate a long series of steps before arriving at the system state you want to test.  The code under test should be decoupled to the point where the state just before your test condition can be simulated directly.
Don’t depend on external resources (more than you have to)

Mock everything other than the code under test if it is a unit test
If you didn’t do the previous step, then it’s an integration test.  Don’t depend on specific state conditions that aren’t relevant to what you are testing.  For example, don’t rely on a remote endpoint to return any specific data, instead only depend on it returning data that will fulfill the contract.

Example:  Need to test an endpoint that expects a filter argument to be passed in?  The state you are filtering on exists in the superset, so have a method that pulls back the superset of data, extracts values that exist in the superset for the fields that will be filtered on, and chooses one or more at random to use in your filter argument.  The method under test should be able to return a valid response (not a specific response).  A test written this way should pass regardless of what data actually exists in the environment.  It should only fail if the endpoint doesn’t honor the contract, or if the test has not properly modeled it.

For enhanced confidence, run multiple iterations of tests that depend on randomly selected inputs.
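The filter example might be sketched like this. Everything here is an assumption for illustration: `RECORDS` and `fetch_items` are an in-memory stand-in for a remote endpoint so that the sketch runs on its own, and `ITERATIONS` is an arbitrary count. In a real integration test, `fetch_items` would call the deployed endpoint.

```python
import random

# Hypothetical in-memory stand-in for whatever data the environment holds.
RECORDS = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]

ITERATIONS = 5  # assumed number of randomized runs


def fetch_items(color=None):
    """Hypothetical endpoint client: all items, or only those matching color."""
    return [r for r in RECORDS if color is None or r["color"] == color]


def test_filter_honors_contract():
    # ARRANGE: pull the superset and collect filter values that really exist.
    superset = fetch_items()
    colors = sorted({item["color"] for item in superset})

    for _ in range(ITERATIONS):
        chosen = random.choice(colors)

        # ACT
        filtered = fetch_items(color=chosen)

        # ASSERT: a valid response, not a specific one.
        assert filtered, f"no items returned for existing color {chosen!r}"
        assert all(item["color"] == chosen for item in filtered)
```

Because the filter value is drawn from data that is known to exist, the test makes no assumption about which records the environment happens to contain.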


Minimized or mitigated asynchronous code

If at all possible, only unit test synchronous code (very hard to do on some platforms, admittedly).
Some platforms are intrinsically asynchronous.  To test well on these platforms you will need to take steps to make sure that:

Test method and test case iteration logic runs as contiguously as possible (not mixed with other test executions).  This is for sanity when reading test output, but also so that unexpected conditions don’t arise when tests run simultaneously.  The counterargument is that you may specifically want to test that the code performs as expected even when context switching.  For this I would suggest a test designed specifically for that purpose, one that wraps the async code and asserts only after ALL execution is complete.  This wrapper test should itself run contiguously.
The duration of distinct behaviors is still testable.  This is closely related to having test logic run contiguously.  You need code that is guaranteed to run before-and-only-before, and code that is guaranteed to run after-and-only-after, your code under test if you are to have any hope of getting reliable duration calculations.
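A wrapper test of the kind described above might look like this with Python's `asyncio`. The `fetch_price` coroutine is hypothetical; the synchronous test function wraps the concurrent work, and nothing is asserted until every task has finished.

```python
import asyncio


async def fetch_price(symbol, delay, log):
    """Hypothetical async code under test: records its start and end in a log."""
    log.append(f"start {symbol}")
    await asyncio.sleep(delay)
    log.append(f"end {symbol}")
    return len(symbol)


def test_concurrent_fetches_all_complete():
    # ARRANGE
    log = []

    async def run_all():
        # gather() waits for every coroutine and returns results in call order.
        return await asyncio.gather(
            fetch_price("AAA", 0.02, log),
            fetch_price("BB", 0.01, log),
        )

    # ACT: the synchronous wrapper blocks until ALL async work is done.
    results = asyncio.run(run_all())

    # Print the chronological log so interleaving is visible in test output.
    print(log)

    # ASSERT: only after all execution is complete.
    assert results == [3, 2]
    assert sum(entry.startswith("end") for entry in log) == 2
```

Because the test function itself is synchronous, its own setup and assertions run contiguously even though the code under test interleaves.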


Declarative helper functions

When you have to do repetitive and/or long-winded things in support of a test, extract this logic into methods and define them at whatever scope they are most useful: just outside the test, just outside the test suite, in a test helpers class in the test suite directory, somewhere higher up in the test directory tree, or globally, just under the root test directory.
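As a small sketch of the idea, a helper can build a valid default object and let each test override only the fields it cares about. The `make_order` helper and `order_total` function are hypothetical names used for illustration.

```python
def make_order(**overrides):
    """Hypothetical test helper: a valid order with per-test overrides."""
    order = {"id": 1, "quantity": 1, "unit_price": 10, "coupon": None}
    order.update(overrides)
    return order


def order_total(order):
    """Hypothetical code under test: applies a half-price coupon if present."""
    total = order["quantity"] * order["unit_price"]
    if order["coupon"] == "HALF":
        total = total // 2
    return total


def test_half_coupon_halves_the_total():
    # ARRANGE: the helper name says what is being built;
    # only the fields relevant to this test are spelled out.
    order = make_order(quantity=4, coupon="HALF")

    # ACT
    total = order_total(order)

    # ASSERT
    assert total == 20
```

Here the helper sits just outside the test. If other suites need it, it can move to a shared helpers module further up the test directory tree.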