@krishnadey30
Last active May 22, 2019
This gist holds information about different unit test frameworks.

Unit Test Frameworks

  • Julia
  • Go
  • D

Julia's Unit Testing

Julia provides the function Base.runtests for testing Base Julia itself.

If you build Julia from source, you can run this test suite with make test. In a binary install, you can run the test suite using Base.runtests().

Base.runtests (Function)

Base.runtests(tests=["all"]; ncores=ceil(Int, Sys.CPU_THREADS / 2),
              exit_on_error=false, [seed])

Run the Julia unit tests listed in tests, which can be either a string or an array of strings, using ncores processors. If exit_on_error is false, when one test fails, all remaining tests in other files will still be run; if exit_on_error == true, the remaining tests are discarded.
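
For instance, a subset of the test suite can be run from the REPL like this (a minimal sketch; the test names are illustrative and correspond to files under Julia's test/ directory):

# Run two test files on 2 cores and stop as soon as either of them fails.
Base.runtests(["core", "strings"]; ncores=2, exit_on_error=true)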

The Test module provides simple unit testing functionality which can be performed with the @test and @test_throws macros:

@test ex

This macro checks whether the expression ex evaluates to true. It returns a Pass result if it does, a Fail result if it is false, and an Error result if it could not be evaluated.

Examples

julia> @test true
Test Passed

julia> @test [1, 2] + [2, 1] == [3, 3]
Test Passed
@test_throws exception expr

This macro checks that the expression expr throws exception. The exception may specify either a type, or a value (which will be tested for equality by comparing fields).

Examples

julia> @test_throws BoundsError [1, 2, 3][4]
Test Passed
      Thrown: BoundsError

julia> @test_throws DimensionMismatch [1, 2, 3] + [1, 2]
Test Passed
      Thrown: DimensionMismatch

The @testset macro can be used to group tests into sets. All the tests in a test set will be run, and at the end of the test set a summary will be printed. If any of the tests failed, or could not be evaluated due to an error, the test set will then throw a TestSetException.

Examples

julia> @testset "trigonometric identities" begin
           θ = 2/3*π
           @test sin(-θ) ≈ -sin(θ)
           @test cos(-θ) ≈ cos(θ)
           @test sin(2θ) ≈ 2*sin(θ)*cos(θ)
           @test cos(2θ) ≈ cos(θ)^2 - sin(θ)^2
       end;
Test Summary:            | Pass  Total
trigonometric identities |    4      4

We can put our tests for the foo(x) function in a test set:

julia> foo(x) = length(x)^2
foo (generic function with 1 method)

julia> @testset "Foo Tests" begin
           @test foo("a")   == 1
           @test foo("ab")  == 4
           @test foo("abc") == 9
       end;
Test Summary: | Pass  Total
Foo Tests     |    3      3

Test sets can also be nested:

julia> @testset "Foo Tests" begin
           @testset "Animals" begin
               @test foo("cat") == 9
               @test foo("dog") == foo("cat")
           end
           @testset "Arrays $i" for i in 1:3
               @test foo(zeros(i)) == i^2
               @test foo(fill(1.0, i)) == i^2
           end
       end;
Test Summary: | Pass  Total
Foo Tests     |    8      8

In the event that a nested test set has no failures, as happened here, it will be hidden in the summary. If we do have a test failure, only the details for the failed test sets will be shown:

julia> @testset "Foo Tests" begin
           @testset "Animals" begin
               @testset "Felines" begin
                   @test foo("cat") == 9
               end
               @testset "Canines" begin
                   @test foo("dog") == 9
               end
           end
           @testset "Arrays" begin
               @test foo(zeros(2)) == 4
               @test foo(fill(1.0, 4)) == 15
           end
       end

Arrays: Test Failed
  Expression: foo(fill(1.0, 4)) == 15
   Evaluated: 16 == 15
[...]
Test Summary: | Pass  Fail  Total
Foo Tests     |    3     1      4
  Animals     |    2            2
  Arrays      |    1     1      2
ERROR: Some tests did not pass: 3 passed, 1 failed, 0 errored, 0 broken.
@inferred f(x)

This macro checks that the call expression f(x) returns a value of the same type inferred by the compiler. It is useful to check for type stability.

f(x) can be any call expression. It returns the result of f(x) if the types match, and an Error result if it finds different types.

julia> f(a, b, c) = b > 1 ? 1 : 1.0
f (generic function with 1 method)

julia> typeof(f(1, 2, 3))
Int64

julia> @inferred f(1, 2, 3)
ERROR: return type Int64 does not match inferred return type Union{Float64, Int64}
Stacktrace:
[...]

julia> @inferred max(1, 2)
2
@test_logs [log_patterns...] [keywords] expression

Collect a list of log records generated by expression using collect_test_logs, check that they match the sequence log_patterns, and return the value of expression.

Examples

Consider a function which logs a warning, and several debug messages:

function foo(n)
    @info "Doing foo with n=$n"
    for i=1:n
        @debug "Iteration $i"
    end
    42
end

We can test the info message using

@test_logs (:info,"Doing foo with n=2") foo(2)

If we also wanted to test the debug messages, these need to be enabled with the min_level keyword:

@test_logs (:info,"Doing foo with n=2") (:debug,"Iteration 1") (:debug,"Iteration 2") min_level=Debug foo(2)

The macro may be chained with @test to also test the returned value:

@test (@test_logs (:info,"Doing foo with n=2") foo(2)) == 42
@test_deprecated [pattern] expression

When --depwarn=yes, test that expression emits a deprecation warning and return the value of expression. The log message string will be matched against pattern which defaults to r"deprecated"i.

When --depwarn=no, simply return the result of executing expression. When --depwarn=error, check that an ErrorException is thrown.

Examples

# Deprecated in julia 0.7
@test_deprecated num2hex(1)

# The returned value can be tested by chaining with @test:
@test (@test_deprecated num2hex(1)) == "0000000000000001"
@test_warn msg expr

Test whether evaluating expr results in stderr output that contains the msg string or matches the msg regular expression. If msg is a Boolean function, tests whether msg(output) returns true. If msg is a tuple or array, checks that the error output contains/matches each item in msg. Returns the result of evaluating expr.

@test_nowarn expr

Test whether evaluating expr results in empty stderr output (no warnings or other messages). Returns the result of evaluating expr.
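
A minimal sketch of both macros (the stderr writes are placeholders used only to produce or avoid warning output):

using Test

# Passes: the expression writes "oops!" to stderr,
# which contains the expected substring "oops".
@test_warn "oops" println(stderr, "oops!")

# Passes: evaluating sqrt(4.0) produces no stderr output.
@test_nowarn sqrt(4.0)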

If a test fails consistently, it can be changed to use the @test_broken macro. This denotes the test as Broken if it continues to fail and alerts the user via an Error if the test starts to succeed.

@test_broken ex
@test_broken f(args...) key=val ...

Indicates a test that should pass but currently consistently fails. Tests that the expression ex evaluates to false or causes an exception. Returns a Broken Result if it does, or an Error Result if the expression evaluates to true.

The @test_broken f(args...) key=val... form works as for the @test macro.

Examples

julia> @test_broken 1 == 2
Test Broken
  Expression: 1 == 2

julia> @test_broken 1 == 2 atol=0.1
Test Broken
  Expression: ==(1, 2, atol=0.1)

@test_skip is also available to skip a test without evaluating it, while still counting the skipped test in the test set reporting. The test will not run but gives a Broken Result.

@test_skip ex
@test_skip f(args...) key=val ...

Marks a test that should not be executed but should be included in test summary reporting as Broken. This can be useful for tests that intermittently fail, or tests of not-yet-implemented functionality.

The @test_skip f(args...) key=val... form works as for the @test macro.

Examples

julia> @test_skip 1 == 2
Test Broken
  Skipped: 1 == 2

julia> @test_skip 1 == 2 atol=0.1
Test Broken
  Skipped: ==(1, 2, atol=0.1)

Go's Unit Testing

import "testing"

In Go the package testing provides support for automated testing. It is intended to be used in concert with the go test command, which automates execution of any function of the form

func TestXxx(*testing.T)

where Xxx does not start with a lowercase letter. The function name serves to identify the test routine.

Test Suite

To write a new test suite, create a file whose name ends in _test.go and that contains the TestXxx functions. The file will be excluded from regular package builds but will be included when the go test command is run.

A simple test function looks like this:

func TestAbs(t *testing.T) {
    got := Abs(-1)
    if got != 1 {
        t.Errorf("Abs(-1) = %d; want 1", got)
    }
}

Benchmarks

Functions of the form

func BenchmarkXxx(*testing.B)

are considered benchmarks, and are executed by the go test command when its -bench flag is provided. Benchmarks are run sequentially.

A sample benchmark function looks like this:

func BenchmarkHello(b *testing.B) {
    for i := 0; i < b.N; i++ {
        fmt.Sprintf("hello")
    }
}

The benchmark function must run the target code b.N times. During benchmark execution, b.N is adjusted until the benchmark function lasts long enough to be timed reliably. The output

BenchmarkHello    10000000    282 ns/op

means that the loop ran 10000000 times at a speed of 282 ns per loop.

Examples

The package also runs and verifies example code. Example functions may include a concluding line comment that begins with Output: and is compared with the standard output of the function when the tests are run. (The comparison ignores leading and trailing space.) Here are two examples of example functions:

func ExampleHello() {
    fmt.Println("hello")
    // Output: hello
}

func ExampleSalutations() {
    fmt.Println("hello, and")
    fmt.Println("goodbye")
    // Output:
    // hello, and
    // goodbye
}

The comment prefix Unordered output: is like Output:, but matches any line order:

func ExamplePerm() {
    for _, value := range Perm(4) {
        fmt.Println(value)
    }
    // Unordered output: 4
    // 2
    // 1
    // 3
    // 0
}

Skipping

Tests or benchmarks may be skipped at run time with a call to the Skip method of *T or *B:

func TestTimeConsuming(t *testing.T) {
    if testing.Short() {
        t.Skip("skipping test in short mode.")
    }
    ...
}

Subtests and Sub-benchmarks

The Run methods of T and B allow defining subtests and sub-benchmarks, without having to define separate functions for each. This enables uses like table-driven benchmarks and creating hierarchical tests. It also provides a way to share common setup and tear-down code:

func TestFoo(t *testing.T) {
    // <setup code>
    t.Run("A=1", func(t *testing.T) { ... })
    t.Run("A=2", func(t *testing.T) { ... })
    t.Run("B=1", func(t *testing.T) { ... })
    // <tear-down code>
}
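
As a concrete sketch, the Abs test from earlier can be rewritten in table-driven form using subtests (the case names and values are illustrative):

func TestAbsTable(t *testing.T) {
    cases := []struct {
        name string
        in   int
        want int
    }{
        {"negative", -1, 1},
        {"zero", 0, 0},
        {"positive", 3, 3},
    }
    for _, c := range cases {
        // Each case runs as its own subtest, e.g. TestAbsTable/negative.
        t.Run(c.name, func(t *testing.T) {
            if got := Abs(c.in); got != c.want {
                t.Errorf("Abs(%d) = %d; want %d", c.in, got, c.want)
            }
        })
    }
}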

Each subtest and sub-benchmark has a unique name: the combination of the name of the top-level test and the sequence of names passed to Run, separated by slashes, with an optional trailing sequence number for disambiguation.

The argument to the -run and -bench command-line flags is an unanchored regular expression that matches the test's name.

go test -run ''      # Run all tests.
go test -run Foo     # Run top-level tests matching "Foo", such as "TestFooBar".
go test -run Foo/A=  # For top-level tests matching "Foo", run subtests matching "A=".
go test -run /A=1    # For all top-level tests, run subtests matching "A=1".

Main

It is sometimes necessary for a test program to do extra setup or teardown before or after testing. It is also sometimes necessary for a test to control which code runs on the main thread. To support these and other cases, if a test file contains a function:

func TestMain(m *testing.M)

then the generated test will call TestMain(m) instead of running the tests directly. TestMain runs in the main goroutine and can do whatever setup and teardown is necessary around a call to m.Run. It should then call os.Exit with the result of m.Run.

A simple implementation of TestMain is:

func TestMain(m *testing.M) {
	// call flag.Parse() here if TestMain uses flags
	os.Exit(m.Run())
}
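
If package-level setup and teardown are needed, they can be wrapped around m.Run like this (setup and teardown here are hypothetical helpers, not part of the testing package; flag and os must be imported):

func TestMain(m *testing.M) {
	flag.Parse()    // parse test flags if TestMain consults them (e.g. testing.Short)
	setup()         // hypothetical package-level setup
	code := m.Run() // run all tests and benchmarks
	teardown()      // hypothetical package-level teardown
	os.Exit(code)
}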

D's Unit Testing

Unit tests are special functions defined like:

unittest
{
    ...test code...
}

Individual tests are specified in the unit test using AssertExpressions. Unlike AssertExpressions used elsewhere, the assert is not assumed to hold, and upon assert failure the program is still in a defined state.

There can be any number of unit test functions in a module, including within struct, union and class declarations. They are executed in lexical order.

Unit tests, when enabled, are run after all static initialization is complete and before the main() function is called.

For example, given a class Sum that is used to add two values, a unit test can be given:

class Sum
{
    int add(int x, int y) { return x + y; }

    unittest
    {
        Sum sum = new Sum;
        assert(sum.add(3,4) == 7);
        assert(sum.add(-2,0) == -2);
    }
}

A unittest may be attributed with any of the global function attributes. Such unittests are useful in verifying the given attribute(s) on a template function:

void myFunc(T)(T[] data)
{
    if (data.length > 2)
        data[0] = data[1];
}

@safe nothrow unittest
{
    auto arr = [1,2,3];
    myFunc(arr);
    assert(arr == [2,2,3]);
}

This unittest verifies that myFunc contains only @safe, nothrow code.

Implementation Defined

  • If unit tests are not enabled, the implementation is not required to check the UnitTest for syntactic or semantic correctness. This is to reduce the compile time impact of larger unit test sections. The tokens must still be valid, and the implementation can merely count { and } tokens to find the end of the UnitTest's BlockStatement.
  • Use of a compiler switch such as -unittest to enable them is suggested (see the example invocation after this list).
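
For example, with DMD the Sum example above can be compiled and run with its unit tests enabled roughly like this (assuming the code lives in sum.d; the exact switch names vary by compiler):

dmd -unittest -main -run sum.d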

Documented unittests allow the developer to deliver code examples to the user, while at the same time automatically verifying that the examples are valid. This avoids the frequent problem of having outdated documentation for some piece of code. If a declaration is followed by a documented unittest, the code in the unittest will be inserted in the example section of the declaration:

/// Math class
class Math
{
    /// add function
    static int add(int x, int y) { return x + y; }

    ///
    unittest
    {
        assert(add(2, 2) == 4);
    }
}

///
unittest
{
    auto math = new Math();
    auto result = math.add(2, 2);
}

The above will generate the following documentation:

class Math;

​ Math class

Example:

auto math = new Math; 
auto result = math.add(2, 2); 

int add(int x, int y);

​ add function

Example:

assert(add(2, 2) == 4); 

A unittest which is not documented, or which is marked as private, will not be used to generate code samples.

There can be multiple documented unittests and they can appear in any order. They will be attached to the last non-unittest declaration:

/// add function
int add(int x, int y) { return x + y; }

/// code sample generated
unittest
{
    assert(add(1, 1) == 2);
}

/// code sample not generated because the unittest is private
private unittest
{
    assert(add(2, 2) == 4);
}

unittest
{
    /// code sample not generated because the unittest isn't documented
    assert(add(3, 3) == 6);
}

/// code sample generated, even if it only includes comments (or is empty)
unittest
{
    /** assert(add(4, 4) == 8); */
}

The above will generate the following documentation:

int add(int x, int y); add function

Examples: code sample generated

assert(add(1, 1) == 2);

Examples: code sample generated, even if it is empty or only includes comments

/** assert(add(4, 4) == 8); */
@lydia-duncan

For the Julia section

  • What are your thoughts on the inferred test macro? Given your knowledge of Chapel so far, do you think it will be desired by users? Why or why not?
  • I didn't see where collect_test_logs was first mentioned?
  • In the examples for test_logs, is the first example output ensuring that the info message matches? Can you test that only some of the debug messages are generated, or do you have to check that all of them are present?
    • Is this sort of information something you think you would want in our Chapel unit test setup? Why or why not?
  • For test_warn in the case where msg is a tuple or array, is the error output allowed to contain things that aren't listed in msg, if everything in msg is present?
  • Is test_broken something you would want in our Chapel unit test setup? Why or why not?
  • For test_skip, is this something you would want? Why or why not? If so, what semantics would you like it to have? When should it get run so that the test_broken "error on success" occurs?

@lydia-duncan

For the Go section

  • Benchmark functions seem tricky to implement, but a way to collect performance information within the test framework. What do you anticipate would be done with the output? What would happen if the timing information or number of iterations were to change between test runs?
  • Where are the various timings described? (short, etc.)

@lydia-duncan

What are your thoughts on the two strategies described so far?

@krishnadey30 (Author)

What are your thoughts on the inferred test macro? Given your knowledge of Chapel so far, do you think it will be desired by users? Why or why not?

I feel that since a variable's type can change at runtime, and hence cannot always be inferred at compile time, it is really necessary to check whether what is actually returned matches what the compiler has inferred.
As we know, if no explicit return type is supplied, the Chapel compiler infers the return value type, so this macro would be highly useful. It would help developers maintain consistency in return types, which matters when we store the returned values.

I didn't see where collect_test_logs was first mentioned?

This is an internal function that collects the logs. You can find the source code here.
The documentation doesn't mention it anywhere except here, which creates confusion among users.

In the examples for test_logs, is the first example output ensuring that the info message matches? Can you test that only some of the debug messages are generated, or do you have to check that all of them are present?

Yes, it ensures that the expected info message and the generated info message are exactly the same. I have tested it (gave n = 3 in the expected pattern although n = 2 is sent as a parameter). (screenshot)
It also checks that all debug messages are present (removed "Iteration 2" from the expected patterns). (screenshot)

Is this sort of information something you think you would want in our Chapel unit test setup? Why or why not?

I don't think this will be very helpful for Chapel developers. Chapel is a parallel programming language, and this macro checks the sequence in which logs are generated, so it won't help much. (screenshot)

For test_warn in the case where msg is a tuple or array, is the error output allowed to contain things that aren't listed in msg, if everything in msg is present?

(screenshots of the test runs omitted)

Is test_broken something you would want in our Chapel unit test setup? Why or why not?

Yes, I would like to include it in our unit tests, as there are lots of futures present in our current TestSystem that we can't ignore, and the same goes for new futures.

For test_skip, is this something you would want? Why or why not? If so, what semantics would you like it to have? When should it get run so that the test_broken "error on success" occurs?

Yes, I think we should have test_skip, as we might want to skip some tests and check only a few. But I don't think having it in this form is right: here we have to change the code to skip a test, which is really cumbersome for a complex system like ours. We could instead have something like a parameter that takes the name of the test we want to skip.

@krishnadey30 (Author)

Benchmark functions seem tricky to implement, but a way to collect performance information within the test framework. What do you anticipate would be done with the output? What would happen if the timing information or number of iterations were to change between test runs?

I think the output is ideal and is what is generally expected, i.e. the time taken per operation, but the amount of memory consumed by each operation would also be highly useful for developers and could be included as well.
I feel that for b.N we could provide maximum and minimum limits that the user can set. This would let the user measure how much time is taken when the operation is performed 1e5 times versus 1e18 times, and whether the operation is better suited to fewer or more iterations.

@lydia-duncan

You've answered my questions and I don't have any further ones at this time :)

@krishnadey30 (Author) commented May 22, 2019

Where are the various timings described? (short, etc.)

  • -short: Tell long-running tests to shorten their run time. It is off by default so a plain go test will do a full test of the package. But it is set during all.bash so that installing the Go tree can run a sanity check but not spend time running exhaustive tests.
    testing.Short() reports whether the -test.short flag is set.
  • -benchtime t: Run enough iterations of each benchmark to take t, specified as a time.Duration (for example, -benchtime 1h30s).
    The default is 1 second (1s). The special syntax Nx means to run the benchmark N times (for example, -benchtime 100x).
  • -count n: Run each test and benchmark n times (default 1). If -cpu is set, run n times for each GOMAXPROCS value. Examples are always run once.
  • -timeout d: If a test binary runs longer than duration d, panic. If d is 0, the timeout is disabled. The default is 10 minutes (10m).
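
For instance, these flags can be combined on the command line (illustrative invocations):

go test -short                             # ask long-running tests to cut their work short
go test -bench=. -benchtime=100x -count=3  # run each benchmark for exactly 100 iterations, 3 times
go test -run TestFoo -timeout 20m          # give the matching tests a 20-minute budget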
