zarazek/range-server-testing.md

## range-server-testing.md

      
    Raw
  

              range-server-testing.md
            
          
    Importance of automatic (unit testing).

Automatic testing is crucial for achieving good code quality. While manual testing of individual features / bug fixes may be OK,
manual regression testing is very dedious and usually not feasible. Additionally, writing unit tests ususally aids in creating
good software architecture, because it tends to enfoce good interfaces and boundaries between subsystems.
Current state

Currently writing unit tests for range-server is quite difficult. Here is why:
Mocking

Final tagless style

Business logic is written in MTL / tagless-final style, which in principle should aid in writhing unit tests. Logic is wirtten
for abstract monad with constraints. There is production monad stack (RangeServerT in range-server case),
fulfilling all the necessary constratins and in principle there could be a test monad stack, that could be used in unit tests,
but there are two classes particularly tied to their low-level implementations and prevent taking advantage of final tagless style.
MonadDatabase and MonadPostgress classes

These classes are providing database functionality. There is no separate ORM / storage layer, users of these classes have
access to full power of SQL. A lot of business logic is encoded in SQL: constraints, user-defined SQL types
and sometimes quite complex queries. This is a valid design choice, but makes it separating the storage layer very diffucult,
and thus makes it necessary to deal with the real database during testing. Fortunately this is possible with aid of
tmp-postgres package, but using it creates its own set of problems (discussed below).
MonadHypervisor clsss

When we see this class in the constraints, it means that the code using it is talking to vcenter API. Unfortunatelly, the class
itself doesn't provide the connectivity functionality. Instead, it just provides some caching layer (which should probably
be transparent to the user) on top of the code written in raw IO (well, MonadIO).
Full URLs in the database

For objects that have representations in both database and vcenter API, full URLs are stored in the database instead of just IDs.
isHypervisorForbidden trick

isHypervisorForbidden is a method from MonadHypervisor class that in principle lets the class user check if it is running
in test environment and skip some API checks. This is bad idea for at least two reasons:

You have to explicitly use this functionality in business logic and there would be no error at the compile time when you omit
this check and call vcenter API unconditionally, and the result would be code under test trying to do real http calls.
When you have two code paths, "real" and test one, it is easy to introduce subtle differences among them. It is better for code
under the test not to know that it is tested and not run "for real".

Other Reader-like classes (RepsService, ACBroadcast, MonadConsoles, MonadContentRepo...)

There are multiple Reader-like classes in range-server code, they provide access to some specific functionality by providing
method to access some handle function or low-level implementation of the functionality. These can be relatively easily mocked,
but final tagless style doesn't provide any benefits over handle pattern / RIO style, just increases boilerplate.
Database objects

One cann't just insert a row into arbitrary database table for testing purpose. To satisfy all the database and code constraints,
whole often whole graph of entities is needed. For example, to insert one row into network_clone_sources, entities in 12 other
tables are needed.
Slowness of the tests

Tests involving temporary PostgreSQL instances are slow. During the test new instance of PostgreSQL is spun,
and then all the database migrations are applied. After the test is finished database is shut down.
API tests are not the solution

API tests are very heavyweight (even slower than tests involving temporary Postgres instance) and flaky, so they probably should
be the last line of defence.
Proposed solutions / improvements

Mock VMWare API

There exists VMWare API abstraction in file range-server/lib/range-server-vmware/src/Range/Hypervisor.hs, but it is not used consequently in range-server code: some places are refering directly to VMWare SOAP API. This abstraction is currently being changed by Alex Hansen, to support other hypervisors than VMWare.
Alternatively, we can mock VMWare API at the HTTP/SOAP level. I propose to create "HTTP requestor" abstraction. Production instance would issue real HTTP requests to configured base adresses and mock implementation can be programmed to provide specified responses to anticipated requests.
Function for creating standard graph of database entities

Standard graph of database entities (organization, user, image, template, etc.) should be present in most of range-server tests. These entities can be inserted into database by test setup code, test code would then deal with entities specific to the particular test case.
Reuse database instance across multiple tests

We can run test code wrapped in single transaction that is rolled back afterwards. This functionality is already there,
but it's unused / not used properly, so database is being set up and torn down multiple times.
Use squeal for database access where possible

This is not a testing issue, but worthy improvemnt anyway:
squeal is the only library that checks validity of SQL queries against the database schema. Utility called squealgen (integrated with gabe) can be used to retrieve the schema from live database and generate Haskell code from this (schema-db package). Checks provided by squeal are especially important especially when database migrations are made: we can check validity of all squeal queries agains modified schema at compile time.
The downside of squeal is long compilation time. When squeal code becomes too costly to compile, we can drop back to postgresql-simple.
Make squealgen results deterministic and make it mThere are scripts for retrieving

The shortcoming of squealgen is that the code it generates is not deterministic: the order of definitions can vary. In result, output of squealgen can be different even if run twice without any code changes. This prevents incorporating squealgen into CI process. If we fix squealgen to generate deterministic code, it can be run during checking of every PR, and no database migrations will be missed.