Database testing of web applications

Testing database code is tricky. There seems to be a lack of resources on this topic (or I'm not looking in the right places). The resources I found so far focus on mocking the data access layer. This, however, is insufficient, as there are scenarios where mocking doesn't do what we want.

Mocking

Most of the resources suggest mocking techniques. Mocking is a technique where we program an object to behave as if it were another object, without implementing the underlying mechanisms necessary to re-create the whole thing. For example, this is how we would mock a database access layer based on SQL queries.

We have the function createFriends which we want to test:

    function createFriends(f1, f2, done) {
        // insert both users in parallel
        var inserts = [f1, f2].map(function(f) {
            return function(cb) { db.query('insert into users (uname) values(?)', f.uname, cb); };
        });
        async.parallel(inserts, function(err, res) {
            if (err) return done(err);
            // look up the generated ids of both users
            db.query('select id from users where uname=? or uname=?', [f1.uname, f2.uname], afterGotIds);
        });
        function afterGotIds(err, ids) {
            if (err) return done(err);
            // link the two users in the friends table
            db.query('insert into friends (id1, id2) values (?, ?)', ids, done);
        }
    }

To test it, we create a mock database in the before part of the test:

    var m = mock();
    // the first 2 queries will be inserts with result null
    m.respondTo(2, 'query', 'insert into users (uname) values(?)', m.matches({uname: /(User1|User2)/}), null); 
    // the next query will request two user ids
    m.respondTo('query', 'select id from users where uname=? or uname=?', ['User1', 'User2'], 
        [1, 2]);

    // the next query will insert a friendship
    m.respondTo('query', 'insert into friends (id1, id2) values (?, ?)', [1,2], null);
    // and then we're done.
    m.done();

The mocked object is a dummy. It's like programming an AI with pre-canned responses: it's not really an AI, it only knows how to answer some very specific questions. Our "database" object is the same: it only knows how to answer the queries it will be given by our createFriends function, in the specified order.

    var db = m.object();
    var createFriends = proxyquire('./createFriends', {'db': db});

    createFriends({uname: 'User1'}, {uname: 'User2'}, function(err) {
        assert.ifError(err);
        assert(m.allComplete(), 'everything specified was done');
    });

This means you're not testing the data access layer (DAL): instead, you're testing whether your logic generates the correct DAL calls in the correct order. If it does that, then you can guarantee that the database will be in the correct state when the function finishes, provided that the DAL works.

Mock objects can compose. For example, the mock for a function that calls createFriends, sendMessage and a bunch of other queries could be built by composing the previously built mocks:

    var m = mock();
    m.respondToAll(createFriendsMock);
    m.respondToAll(sendMessageMock);
    // .. extra queries here
    m.respondTo(extraQuery, args, data);

There are problems with this approach, though.

The first problem is that knowing what should happen may be a lot more complicated than knowing what the end result should be. There could be a lot of mocking involved. This is especially true with databases. Due to this complexity, it's more likely that we'll produce bugs caused by the fact that we got "what should happen" wrong.

Typically with databases, we may know what the end state should satisfy, but we don't know all the (possibly complex) operations we need to undertake to get to that state. In these cases the mock test is useless: it's simply another implementation which is potentially almost as buggy as the thing it tests.

There is already a bug in our test which showcases the second problem. The real DAL will care whether the first user was inserted before the second: that will change their user IDs. We got that part wrong in our mock database, and our createFriends function may fail to work with the mock even though it would always work with the real database. It's just too hard to mock databases!

Why mocks

Why are mock objects even used? Simple: speed. A query may (and usually will) require orders of magnitude more time to execute on a real database. This translates to tests which are orders of magnitude slower.

But this assumption can be challenged. For example, if your DAL supports multiple back-ends, and one of the supported back-ends is SQLite, you can create an in-memory fake database when testing. This should be fast enough in many cases, especially with small test data sets. Then at the end, to ensure that you didn't hit any database-specific problems, you can also run the tests on the real DB engine.
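For example, a minimal sketch of such a setup with the node-sqlite3 driver; the createTestDb helper and the schema.sql path are assumptions:

    var fs = require('fs');
    var sqlite3 = require('sqlite3');

    // hypothetical helper: build a throwaway database for a test run
    function createTestDb(done) {
        // ':memory:' gives us a database that disappears when the process exits
        var db = new sqlite3.Database(':memory:');
        // load the schema so the fake starts from a known structure (path assumed)
        db.exec(fs.readFileSync('schema.sql', 'utf8'), function(err) {
            done(err, db);
        });
    }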

Fakes

Mocks are easy to compose, but how about fakes? They require a slightly different approach.

If we write tests with fake databases, then we would have to ensure that the fake database is in the required state before invoking our tested method. How would we do this?

Our first approach might be to simply ensure that tests are run in the correct order. For example, before testing messages we will run the tests for user registration. Then once we have users we can test if messages are exchanged.

This approach quickly fails though. Sometimes, we want more than one test to start from the same database state. Other times, the other tests may not be able to get the database into the state we need for the test. For example, the user registration test only creates one user; however, to test a friend request we need two.

What could a good approach be there? We could probably dump the database state in a file, then make sure to load the correct database state before running our test. But this is kind of hacky too, and it relies on the database structure not changing. Whenever the structure changes, we would need to regenerate all our dumped states.

We're too lazy to do this. What if we could generate those intermediate states as we test? Maybe we could write a function that generates the prerequisite database state using other functions, then save the state to a file and load it whenever it's necessary.

On the other hand, an intermediate file might not be necessary at all. If we make sure to create data generation functions, we can simply call those with the appropriate arguments to get the database into the desired state at any point in time.
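For example, a minimal sketch of that idea; createFriends is the function from the earlier example, while registerUser is a hypothetical generator written for another test:

    // hypothetical generator: get the database into the
    // "two registered users who are friends" state by reusing other generators
    function generateFriendship(done) {
        registerUser({uname: 'User1'}, function(err) {
            if (err) return done(err);
            registerUser({uname: 'User2'}, function(err) {
                if (err) return done(err);
                createFriends({uname: 'User1'}, {uname: 'User2'}, done);
            });
        });
    }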

Composing tests based on fakes

To write fakes effectively, we must split our test's code into two phases:

  1. a data generation phase, which persists all the necessary objects into the fake database and calls the function that we want to test.
  2. a data verification phase, which tests whether the resulting data has the desired properties.

This should compose easily:

    // aggregate all database manipulation here
    exports.generate = function(db, arg1, arg2, ..., cb) {
        require('./createFriendsTest').generate(db, myArgs);
        require('./sendMessageTest').generate(db, otherArgs);
        // insert other stuff
    }

    // aggregate all tests/asserts here. Only fetching is allowed. Inserts and updates are not.
    // The generation function will be called with the specified arguments.
    exports.test = ['arg1val', 'arg2val', function(t, err, res) {
        t.equals(res[0].id, 1);
        t.end();
    }];

Now we know where to start writing tests, even if our project has never used tests before!

  1. take the topmost layer of functions, for example those that are only called by routing middleware
  2. of those functions, pick the ones that can be called even if the database has no data (for example, user registration functions would be a good start) and write tests for them first, in the manner specified above
  3. start writing tests for functions that require some database state. Refactor the data generation functions as you go to parameterize them when that is needed by other data generation functions.

Idempotence

Our tests should now work well, provided that we erase the entire database before running every single test. This is probably not optimal as the number of layers grows: a lot of erasing and re-creating of the same data will happen every single time.

To solve this, we need some way to tell if the generation function has already been run on the database with the same arguments. This means generate should never be called directly from other generate functions - instead, another function should wrap it, checking whether the data generator was already run with these arguments and if so, skipping the action.
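A minimal sketch of such a wrapper; the name-plus-JSON cache key assumes that the generator arguments are plain serializable objects:

    var alreadyRun = {};

    // run a generate function only if it hasn't been run with these arguments yet
    function runOnce(name, generate, db, args, done) {
        var key = name + ':' + JSON.stringify(args);
        if (alreadyRun[key]) return process.nextTick(done);
        alreadyRun[key] = true;
        generate(db, args, done);
    }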

However, this creates a different problem in cases where we want to test multiple actions being taken from the same state. The generate function will not be a guarantee for the database's state anymore - instead, other actions may or may not have been taken.

Take friend requests for example. Two possible actions are available: accept and reject. The generate function in both cases is obvious: it should call createFriendRequestTest's generate, then run either accept or reject.

If we don't delete the database after running the acceptFriendRequest test, then the createFriendRequest generator will not run a second time. But then the rejectFriendRequest test will fail: the friend request it finds was already accepted.

How can we solve this problem? We need some way to undo (roll back) the changes introduced by acceptFriendRequest. In this case it's probably trivial (just updating a single row), but in the general case it's definitely not trivial to create an undo action: think wide deletes, inserts and updates. So we can't rely on writing undo actions manually.

Another option is transactions: the tests themselves could wrap the portion of the generate function that doesn't belong to other generate functions in a transaction. This way, only the prerequisite state will persist across tests.

What does that mean for our tests code? It means another split:

    // aggregate all prerequisites
    exports.prerequisites = function(args) {
        // prepare args
        return [
            [require('./createFriendsTest'), extractedArgs],
            [require('./sendMessageTest'), otherArgs]
        ];
    }

    exports.execute = function(args, done) {
        // insert other stuff, e.g. mark message as read / delete message
    }

    // aggregate all tests/asserts here. Only fetching is allowed. Inserts and updates are not.
    // The execute and prerequisites functions will be called with the specified arguments.
    exports.test = [{arg1: 'arg1val', arg2: 'arg2val'}, function(t, err, res) {
        t.equals(res[0].id, 1);
        t.end();
    }];

Unfortunately, now the order of tests matters again. Therefore, our best bet is to delete the entire database for every test file, allow only one prerequisites section per file while allowing multiple tests per file, and wrap every test in a transaction which is rolled back after the test is executed...

There is probably a way to integrate this with existing testing frameworks using their before/after functions, but it doesn't seem reliable, at all.

Savepoints

Savepoints are a neat feature of relational databases. As long as you're within a transaction, they allow you to save the database state under a name, then revert the database to that named state (or alternatively release the savepoint).

Since we're already grouping tests by their prerequisites, instead of using transactions to roll back things, why not use savepoints?

    savepoint testgroup
    --run prerequisites
    --for each test:
        savepoint test
        --run test prerequisite
        rollback to savepoint test   
    rollback to savepoint testgroup

Now we can throw away the idempotence idea. Instead, we will group tests which have the same prerequisites, run them within a transaction, and transparently replace transactions with savepoints in "testing mode".

This should result in good-enough performance and consistent states. It should also make it much easier to revert to the state from before a test.
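A minimal sketch of what such a helper might look like; the db.test.begin and db.test.restore names match the test file example near the end of this document, while the savepoint naming and the raw db.query calls are assumptions:

    var db = require('db');

    var depth = 0;

    db.test = {
        // remember the current state under a fresh savepoint name
        begin: function(cb) {
            db.query('savepoint test_' + (++depth), cb);
        },
        // throw away everything done since the matching begin()
        restore: function(cb) {
            db.query('rollback to savepoint test_' + (depth--), cb);
        }
    };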

Reusable state generation functions are still relevant, so the split between database modifications and fetching+tests is still necessary.

Conclusion: how to test

How do we write the tests for our web service? Two things are important: the file structure and the order in which we do things.

Order

First, we take the routes (controllers) that don't depend on any database state being present. Things like user registration are usually the first ones, or perhaps database initialization code.

Then, we create a new directory for the routing api layer. Half of these functions are actions (which mutate state); the other half are viewmodels (which only fetch data).

Then, for each route that doesn't depend on database state:

  1. refactor the route to call a single function that works with the database. URL parsing and parameter handling don't belong there: only talking to the database does (a sketch follows after this list).

  2. if the route calls more than one database function, create a function that encapsulates that and move it to the appropriate file in the api directory (actions or viewmodels)
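
Here is the promised sketch of step 1; the express route, the db module and the api.actions layout are assumptions used only for illustration:

    // before: the route talks to the database directly,
    // mixing http concerns with data access
    app.post('/friends', function(req, res) {
        db.query('insert into users (uname) values(?)', req.body.uname, function(err) {
            // more queries and error handling inline...
        });
    });

    // after: the route only extracts parameters and delegates to a single api function
    app.post('/friends', function(req, res) {
        api.actions.createFriends(req.body.user1, req.body.user2, function(err) {
            if (err) return res.send(500);
            res.send(201);
        });
    });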

After you're done with this, write tests for each such function. The test will be split into two parts:

  1. fixture part, where you modify the database
  2. fetching / assertions part, where you test if the database state is correct (no modifications allowed)

Now you're also ready to write tests for 2nd-level database-modifying routes.

2nd level routes are those routes that require that some database state is present. Their fixtures can therefore call the fixtures of previously written tests.

For example: an api route that results in the refactored action createMessage({from: user1, to: user2, text: text}, done) will require that user1 and user2 already exist.

We can do this by importing the user registration fixture creator written in one of the previous tests for 1st-level database modifying routes. Of course, we will need to refactor this fixture function in order to parameterize it: we're going to need to call it twice to create two different users.

Then we move on to level 3, where we may have something like deleteMessage(id). At this point we see the beauty of our approach. Instead of writing a fixture to create users and then create a message, we simply call the fixture to create a message. It will automatically in turn call the fixture function for user registrations, then proceed to create a message exchanged between those two users.

This means that all our fixture functions will now consist of just a few database-interaction calls.

As a bonus, we're also testing if our stuff integrates well together. How cool is that?

Finally, to optimize test performance, we can group tests and use database savepoints.

Test file structure

The test file must be loaded by a test runner (which will call module.test()) instead of executed directly, as tests will export some functions.
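A minimal sketch of such a runner, assuming the test files live in a test/ directory next to it and export the test function described below:

    var fs = require('fs');
    var path = require('path');

    // load every test file and let it register its tests;
    // nothing runs on plain require, only when test() is called
    fs.readdirSync(path.join(__dirname, 'test')).forEach(function(file) {
        if (/\.js$/.test(file)) {
            require(path.join(__dirname, 'test', file)).test();
        }
    });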

Test file design, first iteration. The example test is message.js:

    var async = require('async');
    var t = require('tap');

    var db = require('db');
    var messages = require('./messages'); // the api module under test (path assumed)

    var fixtures = exports.fixtures = {};

    // test data shared by the fixtures and the assertions
    var user1 = {uname: 'User1'};
    var user2 = {uname: 'User2'};

    fixtures.prepare = function(args, done) {
        var registerUser = require('./registerUser');
        async.series([
            registerUser.fixtures.prepare.bind(null, {}),
            registerUser.fixtures.create.bind(null, args.user1),
            registerUser.fixtures.create.bind(null, args.user2)
        ], done);
    }

    fixtures.create = function(args, done) {
        messages.send(args.user1.id, args.user2.id, args.msg, done);
    }

    exports.test = function() {
        t.test('prerequisites', function(t) {
            fixtures.prepare({user1: user1, user2: user2}, function(err) {
                t.error(err);
                t.end();
            });
        });
        t.test('create message', function(t) {
            db.test.begin();
            fixtures.create({
                user1: user1, user2: user2, msg: 'Hi'
            }, function() {
                messages.fetch(user2, function(err, msgs) {
                    // assert that the data is okay
                    t.ok(msgs.filter(function(m) {
                        return m.text == 'Hi';
                    }).length, 'Message arrived');
                    db.test.restore(); // restores state to before-the-test
                    t.end();
                });
            });
        });
    };

This test exports its fixture functions.

This illustrates why we need a generic "prepare" fixture. Sometimes, we want to call fixture functions more than once: in this example, we want to register two users. But the preparation function for the user creation test contains database initialization stuff, which we only want to call once. Therefore we want to separate these.
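For comparison, a hedged sketch of what registerUser.js could export under this scheme; the db.init and users.register calls are assumptions about the api layer:

    var db = require('db');
    var users = require('./users'); // the api module under test (path assumed)

    var fixtures = exports.fixtures = {};

    // prepare: one-time database initialization, called once per test file
    fixtures.prepare = function(args, done) {
        db.init(done);
    };

    // create: parameterized, so other tests can register as many users as they need
    fixtures.create = function(user, done) {
        users.register({uname: user.uname}, done);
    };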
