Skip to content

Instantly share code, notes, and snippets.

@mariadroman
Last active September 9, 2016 14:24
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mariadroman/816d5b6848592aaacfa722c372dbccba to your computer and use it in GitHub Desktop.
Save mariadroman/816d5b6848592aaacfa722c372dbccba to your computer and use it in GitHub Desktop.

A journey to ScalaCheck

From the Spanish good weather to the Dutch every-possible-weather-in-one-day, from Waterfall to Agile, from just testing to property based testing. The path I took when joining Lunatech was an interesting one. I want to share some of that and show how my journey to ScalaCheck started. I will prove that it's not complicated to start with and it can uncover deeply hidden bugs in your code.

As developers, we need to be sure that we create code that performs exactly how it is meant to. This should be true in every possible scenario. However, how can we prove that our codebase actually does this for a wide range of data? Sometimes it is just not feasible to write innumerable amount of test cases for a specific function. We need to find a way to somehow prove our function works as expected in every possible case.

Property-based testing provides another way of thinking, that was new to me, about writing tests. Sometimes it is better to prove that a function satisfies a specific property, rather than to write a number of tests which try to confirm it is working fine. One way of proving is to generate an appropriate amount of data and apply these data to your test suite. These generated data should all have the same specific property, hence the name property-based testing.

As an example, imagine we want to test String concatenation. To do this we need to be sure that: For all given two strings, str1 and str2, the result of concatenating both strings must satisfy: str1.length + str2.length >= str1.length

Traditionally, we would write a test like:

test("Concatenate should generate a String of length s1+s2") {
	concatenate("", "").length == 0
	concatenate("Hello, ", "world.").length == 13 //Hello, world.
	concatenate("Welcome to ","Lunatech.").length == 20 //Welcome to Lunatech.
}

But testing all possible combinations of 2 strings is impractical this way. In these cases, ScalaCheck is the recommended solution.

Let's first understand the basic concepts in ScalaCheck: Properties and Generators.

Properties

In ScalaCheck you can specify what the input parameters are and what their properties are that must be satisfied by the input. It uses a very elegant and intuitive way for defining properties:

property("Concatenation length of two strings should be greater or equal to length of first string") = forAll { (s1: String, s2: String) =>
  (s1 + s2).length >= s1.length
}

In this small piece of code, we declare a property ("Concatenation length of two strings ..."), that holds forAll possible cases of concatenating 2 strings (s1 and s2). This seems reasonable but how can we prove what this property holds true. One way is by creating a lot of tests. And that is where Generators come in handy.

Generators

To generate this input data, ScalaCheck provides us with a wide range of generators available in objects Arbitraty and Gen.

The org.scalacheck.Arbitrary module defines implicit Arbitrary instances for common types, for convenient use in your properties and generators:

  • arbitrary[T]: returns an arbitrary generator for the type T

The org.scalacheck.Gen uses Arbitrary and offers various generators:

  • alphaLowerChar, alphaUpperChar, alphaNumChar
  • identifier, alphaStr, numStr
  • negNum, posNum, chooseNum
  • listOf, listOfN, nonEmptyListOf
  • choose, oneOf, someOf
  • const

Some examples using arbitrary/generators:

id <- arbitrary[Int]
married <- arbitrary[Boolean]
age <- choose(0, 120)
currency <- const("euro")
description <- arbitrary[String]

However, most of the time we do not want to check such a general data type. For this, ScalaCheck also offers the possibility of defining custom generators where we can establish what the input data should look like.

Let's use a simple example to understand the usage of custom generators. Imagine we are a Benelux bank that wants to verify that their Dutch customers who have a negative balance in at least one of their accounts, should be notified by email. For simplicity, we define customer and bank account as below:

case class Account(accountId: String, balance: Double, country: String)
case class Customer(customerId: String, name: String, nationality: String, accounts: Seq[Account])

So first we want to generate Account data. To do this, we make use of Arbitrary and Gen. Because we are only interested in Benelux accounts the country field will be one of "BE", "NL" or "LU"

// Account generator - only Benelux accounts
val genAccount = for {
	accountId <- Gen.identifier
	balance <- arbitrary[Double]
	country <- Gen.oneOf("NL", "BE", "LU")
} yield Account(accountId, balance, country)

As a next step, we generate customer data. Because we are only interested in Dutch clients the nationality of the customers will be forced to be always "NL"

// Forcing customers to be Dutch will be as easy as:
val genDutchCustomer = for {
	customerId <- Gen.identifier
	name <- arbitrary[String].suchThat(_.nonEmpty)
	nationality <- Gen.const("NL")
	accounts <- nonEmptyListOf(genAccount)
} yield Customer(customerId, name, nationality, accounts)

Finally, from the Dutch customers, we are interested on those having at least one account with negative balance

// Forcing customer to be Dutch and having negative balance:
val genDutchInRed = for {
	customer <- genDutchCustomer.suchThat(_.accounts.exists(_.balance < 0))
} yield customer

Something which is worth mentioning at this point is the usage of .suchThat. It is recommended not to write very restrictive conditions in this filter, because ScalaCheck first generates all input data, and filters it later based on the condition provided. If the condition is too restrictive, it may end up with too many inputs discarded and the tests will not run.

To conclude with generators, let's have a look to a sample of our Dutch customer with at least one account with negative balance:

scala> genDutchInRed.sample
res0: Option[Customer] = Some(
		  Customer(uhcamdsjupssGeVftisrdb86mfbzflr,
			  秓鲛헀뀧���锣ﰖ㼽혋ᑀ槵ݒ셡솉嚿Ӕڸ傄펽狂籘﫩帕咜኿�蘞��뉟᧨Ẋ뒯ᷔᴎ凟�伓䋨繗ユ④会枺峸裔⇺寜犼ꇄ輆狊篩뗞♧랃ⶪ㫒ꎙ툥즩,
			  NL,
			  List(
					Account(onScof2s4kBuphlrsal5ldWdh0oqbqbpgt03Snnrpryvlvzs89tnkh3fkreSsuoue0ntesrSlrpvDo7a4pe6bbqDly4cox,1.875359772688297E94),
					Account(yksznv4f48xezgep0daoyqtztcvruezwm,-3.9701238543851655E178),
					Account(uezzrfUxtbqPywvkXPbezZqtuX,4.8011482377734943E179),
					Account(htnlbxvtnDxiptwojhy4n36mzz2uovy5Xljoxgznkqomsk4rlhAxc9z6ebcwi6eMdnsass4cjhaerHfamcvzz0h6wtqn0pdgo6,6.04591158308268E-244),
					Account(s,-1.5255297073815315E-254),
					Account(vubpajf828dewljoarfp2uu0t9i3idnzhgDvjyediqyfax2fkfO6gAtgDqqNgaxkacswrcTzWpwkoopqt,-1.868869258123239E-125),
					Account(guukirryuthlx4ejvhym6bVdiv8lleylBVfEkvslcvUskjlpzagtm2clfx4ashzdFQQWW,1.519776982857599E-66)
			  )
			)
		)

This shows us that maybe we should add some conditions to the accountId or the balance, because it is not normal to deal with such values in real life. This was for example one of the reasons to create scalacheck-datetime

Writing tests

Now that we are familiar with properties and generators, it is time to write tests. We have good examples in the Scala community, because ScalaCheck is used by many Scala open source projects (like Akka or Play).

In this case, we will continue with our concatenate example.

import org.scalacheck.Properties
import org.scalacheck.Prop.forAll

class StringProperties extends Properties("String Properties") {
  property("Concatenation length equal or greater than zero") = forAll { (s1: String, s2: String) =>
    s1.length + s2.length >= 0
  }
  property("Concatenation length equal to length addition") = forAll { (s: String) =>
    val len = s.length
    (s + s).length == len + len
  }
}

Our properties file can be as simple as that, or we can make it as complicated as we need. We can also integrate it with ScalaTest or Specs2.

Running ScalaCheck tests

Using sbt, we run ScalaCheck tests in the same way we run ScalaTest tests: sbt test:compile test. If our code is correct and all the tests generated by ScalaCheck are successful, we can see the following as output:

+ String Properties.Concatenation length equal to length addition: OK, passed 100 tests.
+ String Properties.Concatenation length equal or greater than zero: OK, passed 100 tests.
ScalaCheck
Passed: Total 2, Failed 0, Errors 0, Passed 2

By default, ScalaCheck generates 100 tests per property, which must be satisfied for the test to pass.

In case a property is not satisfied by the generated test data, ScalaCheck yields an error. And not only shows the input data which makes the property to fail, but it also simplifies as much as possible to show you the minimum value which makes the test to fail. This helps us a lot when going back to the code and applying a solution to fix the wrong implementation.

How ScalaCheck helps with finding bugs

If you are not yet convinced we'll give you another example of code that looks fine at first glance, but will not meet the requirements.

property("Absolute value should not be negative") = forAll { (input: Int) =>
	input.abs >= 0
}

Looks reasonable, if we apply abs to a number, we will get a positive one (or zero). But... voilà! Here it is what ScalaCheck yields after running the test:

! String Properties.absolute value should not be negative: Falsified after 1 passed tests.
> ARG_0: -2147483648
ScalaCheck
Failed: Total 1, Failed 1, Errors 0, Passed 0

What ScalaCheck is showing is that the property fails for input = -2147483648. Then, we realize that Int numbers are not symmetric. Int.MaxValue = 2147483647 Int.MinValue = -2147483648 So, when trying to apply abs to Int.MinValue, we get Int.MinValue.abs = -2147483648, which does not satisfy the condition of input.abs >= 0.

It is very likely that we write our code without thinking about these kind of corner cases, because we probably never expect an input with value -2147483648. But since -2147483648 is valid input, our code will accept it and will crash if we do not add conditions to prevent it.

ScalaCheck focuses mainly on corner cases, where our functions are more sensible to fail. So for Int values, it will first test with MIN_VALUE, MAX_VALUE and 0; for String values will test with symbols and non-roman alphabet.

Useful links to get started

Summary

  • When you feel you are adding many tests based on input data, stop for a moment and think twice about the possibility of translating the functionality into a property that ScalaCheck can test for you.
  • If we can write properties for a given function, ScalaCheck provides an easy and very intuitive way of writing tests, which automatically generate large amounts of data for us, mainly focusing on corner and special cases.
  • It is very helpful that ScalaCheck shrinks test cases to the minimal case.
  • ScalaCheck does NOT substitute ScalaTest or Specs2, but it complements them with property testing.
  • Don't forget that ScalaCheck is generating a finite number of tests, which means that there is always a chance that within this randomized set of tests, a bug might not be found (although it does exist in your code). However in case your input type is more constrained e.g. Byte, it can even generate all possible input data.

I started with ScalaCheck soon after I started with Scala and it changed the way I look at tests. Be always open to explore and try new options, because from all of them you will always learn something useful.

@howard-wang-nl
Copy link

Very readable and enlightening examples. Kudos!

@vijaykiran
Copy link

Very well Written 👍

I've some suggestions 😉

"Let's understand first the basic" -> "Let's first understand the basic ..."
" what the properties are " -> "what their properties are ..."
" by the input data" -> "by the input".
"this property holds" -> "this property holds true"
"ScalaCheck generates first all" -> "ScalaCheck first generates all"
"the same way we are running" -> "the same way we run"
"we can see as output" -> "we can see the following as output"
"must be satisfied so the test will pass" -> "must be satisfied for the test to pass"
"when coming back to the code and apply " -> " when going back to the code and applying"
"for fixing the wrong implementation" -> "to fix the wrong implementation"

@mariadroman
Copy link
Author

Thanks for the comments!! ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment