c-cube/qtest.adoc

## qtest.adoc

      
    Raw
  

              qtest.adoc
            
          
    QTest


Table of Contents

Introduction
Using qtest: a Quick, Simple Example
More qtest Pragmas

Different Kinds of Tests

Simple Tests: T for ``Test''
Equality Tests: =
Randomized Tests: Q for ``Quickcheck''
Raw OUnit Tests: R for ``Raw''
Exception-Throwing Tests: E for ``Exception''


Manipulation Pragmas

Opening Modules: open Pragma <…> and --preamble Option
Code Injection Pragma:


Technical Considerations and Other Details

Function Coverage
Headers and Metaheaders

Aliases
Mutually Tested Functions
Testing Functions by the Dozen
Metaheaders Unleashed
Header Parameters Injection


Warnings and Exceptions Thrown by qtest
qtest Command-Line Options


Editor Support


Introduction


In a nutshell, qtest is a small program which reads .ml and .mli source
files and extracts inline unit tests from them. It is used internally by
the OCaml Batteries project,
and is shipped with it as of version 2.0, but it does not
depend on it and can be compiled and used independently.


Browse its code in the
Github Repository.


Using qtest: a Quick, Simple Example


Say that you have a file foo.ml, which contains the implementation of
your new, shiny function foo.


let rec foo x0 f = function
  | [] -> 0 | x::xs -> f x (foo x0 f xs)


Maybe you don’t feel confident about that code; or maybe you do, but you
know that the function might be re-implemented less trivially in the
future and want to prevent potential regressions. Or maybe you simply
think unit tests are good practice anyway. In either case, you feel that
building a separate test suite for this would be overkill. Using qtest,
you can immediately put simple unit tests in comments near foo, for
instance:


(*$T foo
  foo  0 ( + ) [1;2;3] = 6
  foo  0 ( * ) [1;2;3] = 0
  foo  1 ( * ) [4;5]   = 20
  foo 12 ( + ) []      = 12
*)


the syntax is simple: (*$ introduces a qtest pragma'', such as T
in this case. T is by far the most common and represents a simple''
unit test. T expects a header'', which is most of the time simply
the name of the function under test, here foo. Following that, each
line is a statement'', which must evaluate to true for the test to
pass. Furthermore, foo must appear in each statement.


Now, in order to execute those tests, you need to extract them; this is
done with the qtest executable. The command


$ qtest -o footest.ml extract foo.ml
Target file: `footest.ml'. Extraction : `foo.ml' Done.


will create a file footest.ml; it’s not terribly human-readable, but
you can see that it contains your tests as well as some
OUnit
boilerplate. Now you need to compile the tests, for instance with
ocamlbuild, and assuming OUnit was installed for ocamlfind.


$ ocamlbuild -cflags -warn-error,+26 -use-ocamlfind -package oUnit \
    footest.native
Finished, 10 targets (1 cached) in 00:00:00.


Note that the -cflags -warn-error,+26 is not indispensable but
strongly recommended. Its function will be explained in more detail in
the more technical sections of this documentation, but roughly it makes
sure that if you write a test for foo, via (*$T foo for instance,
then foo is actually tested by each statement – the tests won’t
compile if not.


Important note: in order for this to work, ocamlbuild must know
where to find foo.ml; if footest.ml is not in the same directory,
you must make provisions to that effect. If foo.ml needs some specific
flags in order to compile, they must also be passed.


Now there only remains to run the tests:


------------------------------------------------------------------------------
$ ./footest.native
..FF
==============================================================================
Failure: qtest:0:foo:3:foo.ml:10

OUnit: foo.ml:10::>  foo 12 ( + ) [] = 12
------------------------------------------------------------------------------
==============================================================================
Failure: qtest:0:foo:2:foo.ml:9

OUnit: foo.ml:9::>  foo 1 ( * ) [4;5] = 20
------------------------------------------------------------------------------
Ran: 4 tests in: 0.00 seconds.
FAILED: Cases: 4 Tried: 4 Errors: 0 Failures: 2 Skip:0 Todo:0
------------------------------------------------------------------------------


Oops, something’s wrong… either the tests are incorrect or foo is.
Finding and fixing the problem is left as an exercise for the reader.
When this is done, you get the expected


$ ./footest.native
....
Ran: 4 tests in: 0.00 seconds.


Tip


those steps are easy to automate, for instance with a small shell
script:


set -e # stop on first error
qtest -o footest.ml extract foo.ml
ocamlbuild -cflags -warn-error,+26 -use-ocamlfind -package oUnit footest.native
./footest.native


More qtest Pragmas


Different Kinds of Tests


Simple Tests: T for ``Test''


The most common kind of tests is the simple test, an example of which is
given above. It is of the form


(*$T <header>
  <statement>
  ...
*)


where each statement must be a boolean OCaml expression involving the
function (or functions, as we will see when we study headers) referenced
in the header. The overall test is considered successful if each
statement evaluates to true. Note that the `close comment'' `*)
must appear on a line of its own.


Tip: if a statement is a bit too long to fit on one line, if can be
broken using a backslash (\), immediately followed by the carriage
return. This also applies to randomised tests.


Equality Tests: =


The vast majority of test cases tend to involve the equality of two
expressions; using simple tests, one would write something like:


(*$T foo
  foo 1 ( * ) [4;5] = foo 3 ( * ) [1;5;2]
*)


While this certainly works, the failure report for such a test does not
convey any useful information besides the simple fact that the test
failed. Wouldn’t it be nice if the report also mentioned the values of
the left-hand side and the right-hand side ? Yes it would, and
specialised equality tests provide such functionality, at the cost of a
little bit of boilerplate code. The bare syntax is:


(*$= <header>
  <lhs> <rhs>
  ...
*)


However, used bare, an equality test will not provide much more
information than a simple test: just a laconic not equal''. In order
for the values to be printed, a value printer'' must be specified for
the test. A printer is a function of type
'a → string, where 'a is
the type of the expressions on both side of the equality. To pass the
printer to the test, we use parameter injection (cf. Section
Header Parameters Injection); equality tests have an optional argument printer for
this purpose. In our example, we have
'a = int, so the test becomes simply:


(*$= foo & ~printer:string_of_int
  (foo 1 ( * ) [4;5]) (foo 3 ( * ) [1;5;2])
*)


The failure report will now be more explicit, saying
expected: 20 but got: 30.


Randomized Tests: Q for ``Quickcheck''


Quickcheck is a small library useful for randomized unit tests. Using it
is a bit more complex, but much more rewarding than simple tests.


(*$Q <header>
  <generator> (fun <generated value> -> <statement>)
  ...
*)


Let us dive into an example straight-away:


(*$Q foo
  Q.small_int (fun i-> foo i (+) [1;2;3] = List.fold_left (+) i [1;2;3])
*)


The Quickcheck module is accessible simply as Q within inline tests;
small_int is a generator, yielding a random, small integer. When the
test is run, each statement will be evaluated for a large number of
random values – 100 by default. Running this test for the
above definition of foo catches the mistake easily:


law foo.ml:14::>  Q.small_int (fun i-> foo i (+) [1;2;3]
    = List.fold_left (+) i [1;2;3])
failed for 2


Note that the random value for which the test failed is provided by the
error message – here it is 2. It is also possible to generate several
random values simultaneously using tuples. For instance


(Q.pair Q.small_int (Q.list Q.small_int)) \
  (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)


will generate both an integer and a list of small integers randomly. A
failure will then look like


law foo.ml:15::>  (Q.pair Q.small_int (Q.list Q.small_int))
    (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)
failed for (727, [4; 3; 6; 1; 788; 49])


Available Generators:


Simple generators

unit, bool, float, pos_float, neg_float, int, int32,
int64, pos_int, small_int, neg_int, char, printable_char,
numeral_char, string, printable_string, numeral_string

Structure generators

list and array. They take one generator as their argument. For
instance (Q.list Q.neg_int) is a generator of lists of (uniformly
taken) negative integers.

Tuple generators

pair and triple are respectively binary and ternary. See above for
an example of pair.

Size-directed generators

string, numeral_string, printable_string, list and array all
have *_of_size variants that take the size of the structure as their
first argument.


Tips:


Duplicate Elements in Lists

When generating lists, avoid
Q.list Q.int unless you have a good reason to do so. The reason is
that, given the size of the Q.int space, you are unlikely to generate
any duplicate elements. If you wish to test your function’s behaviour
with duplicates, prefer Q.list Q.small_int.

Changing Number of Tests

If you want a specific test to execute
each of its statements a specific number of times (deviating from the
default of 100), you can specify it explicitly through
parameter injection (cf. Section Header Parameters Injection) using the count :
argument.

Getting a Better Counterexample

By default, a random test stops as
soon as one of its generated values yields a failure. This first failure
value is probably not the best possible counterexample. You can force
qtest to generate and test all count random values regardless, and to
display the value which is smallest with respect to a certain measure
which you define. To this end, it suffices to use parameter injection to
pass argument small : 'a → 'b, where
'a is the type of generated values and
'b is any totally ordered set (wrt. <).
Typically you will take 'b = int or 'b = float. Example:


let fuz x = x
let rec flu = function
  | [] -> []
  | x :: l -> if List.mem x l then flu l else x :: flu l

(*$Q fuz; flu & ~small:List.length
  (Q.list Q.small_int) (fun x -> fuz x = flu x)
*)


The meaning of ~small:List.length is therefore simply:
    `choose the shortest list''. For very complicated cases, you can simultaneously
increase `count to yield an even higher-quality counterexample.


Raw Random Tests

Using ($QR, similar to the raw unit test ($R, it is possible to
write a random test on multiple lines without the trailing \
characters.


(*$QR foo
  Q.small_int
    (fun i->
      foo i (+) [1;2;3] = List.fold_left (+) i [1;2;3]
    )
*)


The (*$QR block needs to contain exactly two values:


Random Generator

of type 'a Quickcheck.arbitrary

Property to test

of type 'a → bool


Raw OUnit Tests: R for ``Raw''


When more specialised test pragmas are too restrictive, for instance if
the test is too complex to reasonably fit on one line, then one can use
raw OUnit tests.


(*$R <header>
  <raw oUnit test>...
  ...
*)


Here is a small example, with two tests stringed together:


(*$R foo
  let thing = foo  1 ( * )
  and li = [4;5] in
  assert_bool "something_witty" (thing li = 20);
  assert_bool "something_wittier" (foo 12 ( + ) [] = 12)
*)


Note that if the first assertion fails, the second will not be executed;
so stringing two assertions in that mode is different in that respect
from doing so under a T pragma, for instance.


That said, raw tests should only be used as a last resort; for instance
you don’t automatically get the source file and line number when the
test fails. If T and Q do not satisfy your needs, then it is
probably a hint that the test is a bit complex and, maybe, belongs in
a separate test suite rather than in the middle of the source code.


Exception-Throwing Tests: E for ``Exception''


… not implemented yet…


The current usage is to use (*$T and the following pattern for
function foo and exception Bar:


try ignore (foo x); false with Bar -> true


If your project uses Batteries and no pattern-matching is needed, then
you can also use the following, sexier pattern:


Result.(catch foo x |> is_exn Bar)


Manipulation Pragmas


Not all qtest pragmas directly translate into tests; for non-trivial
projects, sometimes a little boilerplate code is needed in order to set
the tests up properly. The pragmas which do this are collectively called
``manipulation pragmas''; they are described in the next section.


Opening Modules: open Pragma <…> and --preamble Option


The tests should have access to the same values as the code under test;
however the generated code for foo.ml does not actually live inside
that file. Therefore some effort must occasionally be made to
synchronise the code’s environment with the tests’. There are three main
usecases where you might want to open modules for tests:


Project-Wide Global Open

It may happen that every single file in your project opens a given
module. This is the case for Batteries, for instance, where every module
opens Batteries. In that case simply use the –preamble switch. For
instance,


qtest --preamble "open Batteries;;"  extract mod1.ml mod2.ml ... modN.ml


Note that you could insert arbitrary code using this switch.
c


Global Open in a File

Now, let’s say that foo.ml opens Bar and Baz; you want the tests
in foo.ml to open them as well. Then you can use the open pragma in
its global form:


(*$< Bar, Baz >*)


The modules will be open for every test in the same .ml file, and
following the pragma. However, in our example, you will have a
duplication of code between the `open'' directives of `foo.ml, and the
open pragma of qtest, like so:


open Bar;; open Baz;;
(*$< Bar, Baz >*)


It might therefore be more convenient to use the code injection pragma
(see next section) for that purpose, so you would write instead:


(*${*) open Bar;; open Baz;; (*$}*)


The code between that special markup will simply be duplicated into the
tests. The two methods are equivalent, and the second one is
recommended, because it reduces the chances of an impedance mismatch
between modules open for foo.ml and its tests. Therefore, the global
form of the open pragma should preferentially be reserved for cases
where you want such a mismatch. For instance, if you have special
modules useful for tests but useless for the main code, you can easily
open then for the tests alone using the pragma.


Local Open for a Submodule

Let’s say we have the following foo.ml:


let outer x = <something>

module Submod = struct
  let inner y = 2*x
  (*$T inner
    inner 2 = 4
  *)
end


That seems natural enough… but it won’t work, because qtest is not
actually aware that the test is `inside'' Submod (and making it aware
of that would be very problematic). In fact, so long as you use only
test pragmas (ie. no manipulation pragma at all), the positions and even
the order of the tests – respective to definitions or to each other –
are unimportant, because the tests do not actually live in `foo.ml. So
we need to open Submod manually, using the local form of the open
pragma:


module Submod = struct (*$< Submod *)
  let inner y = 2*x
  (*$T inner
    inner 2 = 4
  *)
end (*$>*)


Notice that the <…> have simply been split in two, compared to the
global form. The effect of that construct is that Submod will be open
for every test between ($< Submod *) and ($>*). Of course, you
could also forgo that method entirely and do this:


module Submod = struct
  let inner y = 2*x
  (*$T &
    Submod.inner 2 = 4
  *)
end


… but it is impractical and you are forced to use an empty header
because qualified names are not acceptable as headers. The first method
is therefore strongly recommended.


Code Injection Pragma:


TODO: ocamldoc comments that define unit tests from the offered examples


Technical Considerations and Other Details


What has been said above should suffice to cover at least 90% of
use-cases for qtest. This section concerns itself with the remaining
10%.


Function Coverage


The headers of a test are not just there for decoration; three
properties are enforced when a test, say, (*$X foo is compiled, where
X is T, R, Q, QR,… :


foo exists; that is to say, it is defined in the scope of the module
where the testappears – though one can play with pragmas to relax this
condition somewhat. At the very least, it has to be defined
somewhere. Failure to conform results in an
Error: Unbound value foo.


foo is referenced in each statement of the test: for T and Q,
that means each line''. For R, that means once somewhere in the
test’s body''. Failure to conform results in a
Warning 26: unused variable foo, which will be treated as an error if
-warn-error +26 is passed to the compiler. It goes without saying that
this is warmly recommended.


the test possesses at least one statement.


Those two conditions put together offer a strong guarantee that, if a
function is referenced in a test header, then it is actually tested at
least once. The list of functions referenced in the headers of extracted
tests is written by qtest into qtest.targets.log. Each line is of the
form


foo.ml   42    foo


where foo.ml is the file in which the test appears, as passed to
extract, and 42 is the line number where the test pragma appears in
foo.ml. Note that a same function can be listed several times for the
same source file, if several tests involve it (say, two times if it has
both a simple test and a random one). The exact number of statements
involving foo in each test is currently not taken into account in the
logs.


Headers and Metaheaders


The informal definition of headers given in the above was actually a
simplification. In this section we explore two syntaxes available for
headers.


Aliases


Some functions have exceedingly long names. Case in point :


let rec pretentious_drivel x0 f = function
  | [] -> x0
  | x::xs -> pretentious_drivel (f x x0) f xs


(*$T pretentious_drivel
  pretentious_drivel 1 (+) [4;5] = foo 1 (+) [4;5]
  ... pretentious_drivel of this and that...
*)


The constraint that each statement must fit on one line does not play
well with very long function names. Furthermore, you known which
function is being tested, it’s right there is the header; no need to
repeat it a dozen times. Instead, you can define an alias, and write
equivalently:


(*$T pretentious_drivel as x
  x 1 (+) [4;5] = foo 1 (+) [4;5]
  ... x of this and that...
*)


…thus saving many keystrokes, thereby contributing to the
preservation of the environment. More seriously, aliases have uses
beyond just saving a few keystrokes, as we will see in the next
sections.


Mutually Tested Functions


Most of the time, a test only pertains to one function. There are times,
however, when one wishes to test two functions – or more – at the same
time. For instance


let rec even = function 0 -> true
  | n -> odd (pred n)
and odd = function 0 -> false
  | n -> even (pred n)


Let us say that we have the following test:


(*$Q <header>
  Q.small_int (fun n-> odd (abs n+3) = even (abs n))
*)


It involves both even and odd. That question is: what is a proper
header for this test''? One could simply put even'', and thus it would
be referenced as being tested in the logs, but odd would not, which is
unfair. Putting ``odd'' is symmetrically unfair. The solution is to put
both, separated by a semi-colon:


(*$Q even; odd


That way both functions are referenced in the logs:


    foo.ml   37    even
    foo.ml   37    odd


and of course the compiler enforces that both of them are actually
referenced in each statement of the test. Of course, each of them can be
written under alias, in which case the header could be
even as x; odd as y.


Testing Functions by the Dozen


Let us come back to our functions foo (after correction) and
pretentious_drivel, as defined above.


let rec foo x0 f = function
  | [] -> x0
  | x::xs -> f x (foo x0 f xs)

let rec pretentious_drivel x0 f = function
  | [] -> x0
  | x::xs -> pretentious_drivel (f x x0) f xs


You will not have failed to notice that they bear more than a passing
resemblance to one another. If you write tests for one, odds are that
the same test could be useful verbatim for the other. This is a very
common case when you have closely related functions, or even several
implementations of the same function, for instance the old, slow,
naïve, trustworthy one and the new, fast, arcane, highly optimised
version you have just written. The typical case is sorting routines, of
which there are many flavours.


For our example, recall that we have the following test for foo:


(*$Q foo
  (Q.pair Q.small_int (Q.list Q.small_int)) \
    (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)
*)


The same test would apply to pretentious_drivel; you could just
copy-and-paste the test and change the header, but it’s not terribly
elegant. Instead, you can just just add the other function to the
header, separating the two by a comma, and defining an alias:


(*$Q foo, pretentious_drivel as x
  (Q.pair Q.small_int (Q.list Q.small_int)) \
  (fun (i,l)-> x i (+) l = List.fold_left (+) i l)
*)


This same test will be run once for x = foo, and once for
x = pretentious_drivel. Actually, you need not define an alias: if the
header is of the form


(*$Q foo, pretentious_drivel


then it is equivalent to


(*$Q foo, pretentious_drivel as foo


so you do not need to alter the body of the test if you subsequently add
new functions. A header which combines more than one ``version'' of a
function in this way is called a metaheader.


Metaheaders Unleashed


All the constructs above can be combined without constraints: the
grammar is as follows:


    Metaheader  ::=   Binding {";" Binding}
    Binding     ::=   Functions [ "as" ID ]
    Functions   ::=   ID {"," ID}
    ID          ::=   (*OCaml lower-case identifier*)


Header Parameters Injection


Use (*$inject foo *) to inject the piece of code foo at the
beginning of this module’s tests. This is useful, for instance, to
define frequently used random generators, or printers, or to instantiate
a functor before testing it.


Warnings and Exceptions Thrown by qtest


Fatal error: exception Failure("Unrecognised qtest pragma: ` T foo'")


You have written something like ($ T foo; there must not be any space
between ($ and the pragma.


Warning: likely qtest syntax error: `(* $T foo'. Done.


Self-explanatory; if $ is the first real character of a comment, it’s
likely a mistyped qtest pragma. This is only a warning though.


Fatal error: exception Core.Bad_header_char("M", "Mod.foo")


You have used a qualified name in a header, for instance (*$T Mod.foo.
You cannot do that, the name must be unqualified and defined under the
local scope. Furthermore, it must be public, unless you have used
pragmas to deal with private functions.


Error: Comment not terminated
Fatal error: exception Core.Unterminated_test(_, 0)


Most probably, you forgot the comment-closing *) to close some test.


Fatal error: exception Failure("runaway test body terminator: n))*)")


The comment-closing *) must be on a line of its own; or, put another
way, every statement must be ended by a line break.


qtest Command-Line Options


$ qtest --help

** qtest (qtest)
USAGE: qtest [options] extract <file.mli?>...

OPTIONS:
--output <file.ml>    (-o) def: standard output
  Open or create a file for output; the resulting file will be an OCaml
  source file containing all the tests.

--preamble <string>   (-p) def: empty
  Add code to the tests' preamble; typically this will be an instruction
  of the form 'open Module;;'


--help          Displays this help page and stops


Editor Support


coming soon