joram/testing in python.md

## testing in python.md

      
    Raw
  

              testing in python.md
            
          
    Testing


These are some thoughts I threw together (with a little help from my friends) around testing, and more specifically unit tests.
Everything here is an opinionated statement. There are a zillion exceptions to these rules

Talk outline


the following sections
a review of an untested section of code, describe tests we would want to write for it.
a review of hard-to-test code, refactor to make it testable.

Types of Tests


Unit

a test that covers a small unit of work, verifying it works as expect.
Integration

External


we integrate with a lot of 3rd party APIs, what happens if one changes how it responds to us?

Internal


a class depending on another class tests that the full setup works.

End-to-End

user actions, such as logging in to the site, can be tested to get a full round trip request verification.
Regression


could be unit/integration/e2e
fixed a bug? write a test

Good Tests are ...

DRY

making data setup DRY and best practices on how to do it / where to put it / factory style.
~ Peter
Fixture Based

this goes with small.
Factories vs Fixture Files

Factories are a class that will give you an instance of a model, where you can specify, if you wish, fields on it, but it does have defaults.
Isolating the model instance building into factories allows for a lot of logic to live outside of the tests.

If a model changes, we have a single location to update, no the thousands of tests.

deterministic

Flaky tests are evil, they can be worse than having no tests, and you should destroy them with extreme prejudice. I suppose this is a variance on the “no side effects/isolated” bit as well though flaky can be for things like timezone issues (ex: a test only passes when it’s a particular time of day, etc).
~ Adam
Well Named

Test names should express a specific requirement.

Suggested format “unitOfWork_stateUnderTest_expectedBehavior”.
Using this convention means we can answer to the following questions without reading the code of our test methods.

What are the features and business logic of the application?
Given a particular input or state, what’s the expected behaviour?
~ Tim


A Good Integration Test is ...

Implementation Agnostic

you should be able to change an implementation of an algorithm and your tests should still pass without modification
~ Adam
A Good Unit Test is ...

Code Coverage

Code Coverage is a metric of how many lines of code are covered in our tests. If one path is not considered in the tests, those lines aren't counted.
You can use this to discover areas of code with low amounts of testing.
This shouldn't be a gold standard, since you can write tests that get complete code coverage and don't test anything.
Small

it should take a few seconds to read the test (good naming) to understand what it's intending to verify.

note: we have base test classes that do all the boiler plate in core/utils/tests.py

Arrange Act Assert (triple A)

the code of the test is arranged in 3 blocks:

Arrange: setup the specific situation (possibly asserting it's as expected)
Act: a single line, that does the action, and changes the state.
Assert: assertions proving the new state is as expected.
Bonus: this makes it easy to read if spaced out in 3 blocks

Specific


testing one function, if it's a method on a class, you have to consider the class's state
tests ideally should fail for exactly one reason. The more things a test tests, the more likely it’ll break for unrelated changes

Fast

if tests are slow, people don't run them before pushing to CI. pushing to CI just to find the tests break is slow, causes context switching, and slows down development.

slow? probably not a unit test

isolated (No Side effects)

they should be able to run in any order, we run our tests in parallel (for speed), so their order is undeterministic.
Tools

Test Driven Development (TDD)

Write the test that fails. Then go adjust the code to make the test pass. (good for regression tests)
This helps encourage testable code, keeping 3rd party code decoupled.
Concurrency

Running your tests on n-1 cores of your machine saves you time (the -1 is so we don't lock your machine up).
Grid Tests (data providers, table driven tests)

if we layout all permutations of types of input, and the expected output,
it is a good way to cover a lot of surface area.
for input, expected in [
  ("input1", "output1"),
  ("input2", "output2"),
  ("input3", "output3"),
  ("input4", "output4"),
]:
  # act
  actual = func(input)

  # asserts
  assert actual == expected
Fixture Files

a specific model setup described in a file you can load as data.
The downside is it gets out of date from the model definitions, and update them is tidious.
Golden Files

the expected output of a test, stored in a seperate file. If you are testing a realllly big output is what you expect. compare it against a golden file (known to be correct)
Libraries

unittest Library


setUp gets called before each test function
setUpTestData gets called at the beginning of the test class (once)
Moving most Arrange (see arrange/act/assert) code to the latter shaved tens of minutes off our tests. (thanks Brendon)

pytest

it's a fully featured testing library (and it supports unittest classes!)
link
pytest --exitfirst -vv --no-migrations -n 15
grid tests with permutations
@pytest.mark.parametrize("x", [0, 1])
@pytest.mark.parametrize("y", [2, 3])
@pytest.mark.parametrize("expected", [2, 3])
def test_foo(x, y, expected):
    assert func(a, b) == expected
@brendon and @nick, 30s to sell it.
Mocking

Mocking


make a fake version of the part you don't want to cover in your test.
check how the code acts when all those edge/error cases happen in the mocked object.
there are libraries that do this for you: mock_requests/responses, moto, freezetime

Patching


ANY (use sparingly)
if module foo imports function bar from module baz, then you do @patch('foo.bar') and not @patch('baz.bar').