Article: Effective Scala Case Class Patterns - The guide I wished I had read years ago when starting my Scala journey

Effective Scala Case Class Patterns

Version: 2022.02.25a

Overview

This is the “Effective Scala Case Class Patterns” guide I wished I had read years ago when starting my Scala journey. There have been many hundreds of hours spent on futile tangents to earn the simplified actionable nuggets described below.


Because it is a fantastic integration of both OOP and FP, the case class is a key workhorse in any Scala software engineering project. Within Scala, it is primarily designed and intended (but not exclusively) to be used as an (immutable) FP Product Type.

Unfortunately, the default case class pattern…

    case class Longitude(value: Double)

…while terse, flexible, convenient, and extensively utilized, suffers from a number of issues which fall into the following categories:

  1. Extension confusion
  2. Elevated reasoning complexity
  3. Poor design for FP
  4. Future technical debt
  5. Security vulnerabilities

This article aims to propose several new boilerplate patterns that you can use to replace the defaults provided by the Scala compiler to address the above issue categories. I’ll step through an evolutionary process that will produce patterns of increasing detail, any of which can be an “upgrade” stage you might prefer.

DELAYED (bug in IntelliJ’s Template engine):

Additionally, I will be providing the patterns as IntelliJ File Templates making them as easy to instantiate while coding as using the Scala default.

Always Mark as Final

After many years of using case classes, it has become obvious that extending them via inheritance is just a bad idea.

Thus, our very first pattern is to merely prepend final to the original example:

    final case class Longitude(value: Double)

This ensures that no subclass, well-intended or malicious, can inherit from the case class and inadvertently or purposefully violate the “Liskov Substitution Principle”.

Implementing this desirably affects Overview categories 2 (Elevated reasoning complexity), 3 (Poor design for FP), 4 (Future technical debt), and 5 (Security vulnerabilities).
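
To see the kind of surprise that final rules out, here is a minimal sketch. Point and LabeledPoint are hypothetical names used only for illustration; they are not part of the pattern. A regular class that extends a non-final case class inherits the compiler-generated equals and toString, neither of which knows anything about the new field:

```scala
object FinalDemo {
  // Hypothetical non-final case class, for illustration only.
  case class Point(x: Int)

  // A regular subclass adding state that the generated equals/toString ignore.
  class LabeledPoint(x: Int, val label: String) extends Point(x)

  val a = new LabeledPoint(1, "a")
  val b = new LabeledPoint(1, "b")

  // The inherited equals compares only x, so different labels still compare equal.
  val labelsIgnored: Boolean = a == b

  // The inherited toString hides both the subclass name and its label.
  val rendered: String = a.toString
}
```

Marking Point as final turns the LabeledPoint declaration into a compile error, so this class of bug cannot be introduced later.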

Reproducing Compiler Generated Code

One of the biggest frustrations a Scala newcomer faces is when they want to “enhance” a simple default case class by making the companion object explicit. For example, if one were to naively prepend the object like so…

    object Longitude {
    }
    final case class Longitude(value: Double)

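Part of the compiler-generated code in the “default companion object” just disappears. In Scala 2, an explicit companion object no longer extends FunctionN (apply and unapply are still generated), so code that depended upon it will now fail to compile (ex: the tupled method on a case class with two or more parameters).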

As a software engineer new to Scala, this can be quite confronting and disorienting. There isn’t any “official guidance” on how one might go about making the default companion object explicit. Googling for this isn’t trivial. Here’s how I explored this problem space on StackOverflow in 2014.

So, the solution is to extend the companion object with FunctionN so it looks like this:

    object Longitude extends (Double => Longitude) {
      def apply(value: Double): Longitude =
        new Longitude(value)
    }
    final case class Longitude(value: Double)

Scastie Snippet and IntelliJ Code Template Gist

To better understand this, here’s a 2013 post on StackOverflow where I had to explore this in more depth myself.

And this pattern will now be the basis upon which we fill out the remainder of the template.

Implementing this desirably affects Overview category 1 (Extension confusion).
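
As a rough illustration of what extending FunctionN restores, the sketch below uses a hypothetical two-parameter Coordinate case class (not part of the original example, chosen because tupled and curried only exist on Function2 and above). With the explicit companion extending the function type, the companion can again be used wherever a function value is expected:

```scala
object CompanionDemo {
  // Hypothetical two-field case class, used only to illustrate tupled/curried.
  object Coordinate extends ((Double, Double) => Coordinate) {
    def apply(latitude: Double, longitude: Double): Coordinate =
      new Coordinate(latitude, longitude)
  }
  final case class Coordinate(latitude: Double, longitude: Double)

  // tupled and curried are inherited from Function2.
  val fromTuple: Coordinate = Coordinate.tupled((51.5d, -0.1d))
  val fromCurried: Coordinate = Coordinate.curried(51.5d)(-0.1d)

  // The tupled form can be passed directly to map over a list of pairs.
  val mapped: List[Coordinate] =
    List((1.0d, 2.0d), (3.0d, 4.0d)).map(Coordinate.tupled)
}
```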

Preventing the Creation of Instances Containing an Invalid State

One of the maxims in OOP’s DbC (Design by Contract - Eiffel originated) and FP is to prevent invalid states from being representable. If one successfully aims at and achieves this goal, it dramatically reduces the “guard” code (i.e. precondition checks) for any clients making use of the case class instances.

A first pass naive implementation (which is also found and recommended in almost all Scala textbooks) is to use the require functionality within the case class’s constructor. It looks like this:

    final case class Longitude(value: Double) {
      require(value >= -180.0d, s"value [$value] must be greater than or equal to -180.0d")
      require(value <= 180.0d, s"value [$value] must be less than or equal to 180.0d")
    }

Scastie Snippet and IntelliJ Code Template Gist

This implementation throws an exception at the first require that fails.

There are three problems with this approach:

  1. If there are several parameters to validate, only the first failure is reported. The client must fix it and try again to discover the next one, which takes multiple passes where a single pass returning all of the errors would have done.
  2. The require implementation forces the client to deal with exceptions, which are best avoided for expected errors. Creating and throwing an exception carries CPU and memory overhead (capturing the stack trace, in particular), although how significant that is in practice remains contested. And even setting performance aside, FP implementations strongly prefer “error by value” over “error by exception”.
  3. It doesn’t allow the client to “check the preconditions” prior to instantiation. This mixing of concerns prevents the optimization where the constructor, and therefore the memory allocation for the instance, is skipped entirely because the preconditions are already known to have not been met.
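
The first problem is easiest to see with more than one parameter. The sketch below assumes a hypothetical two-parameter Coordinate class (not part of the original example) validated with require. Both arguments are invalid, yet only the first failure surfaces, and the client has to catch an exception to see even that:

```scala
import scala.util.Try

object RequireDemo {
  // Hypothetical two-parameter case class, for illustration only.
  final case class Coordinate(latitude: Double, longitude: Double) {
    require(latitude >= -90.0d && latitude <= 90.0d,
      s"latitude [$latitude] must be within -90.0d to 90.0d")
    require(longitude >= -180.0d && longitude <= 180.0d,
      s"longitude [$longitude] must be within -180.0d to 180.0d")
  }

  // Both arguments are out of range, but construction stops at the first require.
  val attempt: Try[Coordinate] = Try(Coordinate(100.0d, 200.0d))

  // The message mentions latitude only; the longitude problem stays hidden.
  val message: String = attempt.failed.map(_.getMessage).getOrElse("")
}
```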

We can address all of these concerns in one fell swoop using a standard separation of concerns pattern.

First, we move all of the validation logic into its own method, generateInvalidStateErrors. Then, we ensure the apply method validates the passed parameter value(s) first and invokes the new operator only if no errors are returned. It should now look like this…

    object Longitude extends (Double => Longitude) {
      def generateInvalidStateErrors(value: Double): List[String] =
        if (value < -180.0d)
          List(s"value of value [$value] must not be less than -180.0d")
        else
          if (value > 180.0d)
            List(s"value of value [$value] must not be greater than 180.0d")
          else
            Nil

      def apply(value: Double): Longitude =
        generateInvalidStateErrors(value) match {
          case Nil =>
            new Longitude(value)
          case invalidStateErrors =>
            throw new IllegalStateException(invalidStateErrors.mkString("|"))
        }
    }
    final case class Longitude(value: Double)

Scastie Snippet and IntelliJ Code Template Gist
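
With a single parameter, the benefit of returning a List is not obvious. As a sketch of how the same shape might extend to several parameters, the example below again assumes a hypothetical Coordinate class, plus a hypothetical private helper named rangeErrors; neither appears in the original pattern. The errors for each parameter are concatenated, so one call reports everything, and a client can run the check without instantiating anything or catching an exception:

```scala
object AccumulateDemo {
  // Hypothetical two-parameter companion, for illustration only.
  object Coordinate extends ((Double, Double) => Coordinate) {
    // Hypothetical helper: returns at most one range error for a single parameter.
    private def rangeErrors(name: String, value: Double, min: Double, max: Double): List[String] =
      if (value < min) List(s"value of $name [$value] must not be less than $min")
      else if (value > max) List(s"value of $name [$value] must not be greater than $max")
      else Nil

    // Errors from every parameter are concatenated into one list.
    def generateInvalidStateErrors(latitude: Double, longitude: Double): List[String] =
      rangeErrors("latitude", latitude, -90.0d, 90.0d) ++
        rangeErrors("longitude", longitude, -180.0d, 180.0d)

    def apply(latitude: Double, longitude: Double): Coordinate =
      generateInvalidStateErrors(latitude, longitude) match {
        case Nil =>
          new Coordinate(latitude, longitude)
        case invalidStateErrors =>
          throw new IllegalStateException(invalidStateErrors.mkString("|"))
      }
  }
  final case class Coordinate(latitude: Double, longitude: Double)

  // Precondition check only: no Coordinate is allocated and nothing is thrown.
  val errors: List[String] = Coordinate.generateInvalidStateErrors(100.0d, 200.0d)
  val noErrors: List[String] = Coordinate.generateInvalidStateErrors(45.0d, 90.0d)
}
```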

While the pattern is now in place, there is still a hole where a client can just use the new operator to bypass the apply method in the companion object. That is fixed by marking the case class constructor as private. That looks like this…

    final case class Longitude private(value: Double)

Scastie Snippet and IntelliJ Code Template Gist

It looks like we’re done, right?

Oops! Sneaky attack vectors ahead!

It turns out there are two other compiler-generated instantiation pathways we must address:

  1. readResolve method - Hooks into Java deserialization, which the compiler-generated Serializable interface enables. This pathway is especially pernicious because deserialization allocates the memory for the case class and then directly injects the (possibly malicious) deserialized contents into the instance’s memory. This completely bypasses both the apply method and the class constructor, which means no validation takes place whatsoever.
  2. copy method - Uses the new operator, and can do so because the method is within the private scope of the constructor. This bypasses the validation we moved into the companion object and invoke via the apply method.

In each of these cases, we want to reroute the method to the companion object’s apply method. It should look like this…

    final case class Longitude private(value: Double) {
      private def readResolve(): Object =
        Longitude(value)

      def copy(value: Double = value): Longitude =
        Longitude(value)
    }

Scastie Snippet and IntelliJ Code Template Gist

If you know the case class will never be used anywhere that utilizes Java serialization, then feel free to remove the readResolve method.

While I, too, hate Java Serialization, remember that many platforms, including those like Akka, Kafka, and Spark, may still rely upon Java serialization, depending on version and configuration. When they do, a missing readResolve method leaves your case class open to a malicious attack that bypasses the invariant encoded in the precondition check implemented in the generateInvalidStateErrors method.

We have now ensured there are no reasonable ways to instantiate this case class without going through the precondition check (validation of state prior to invoking instantiation overhead). There are pathological pathways that can be used that involve the illicit use of the Java reflection API, and there is no real way for us to protect against those.

The fully expressed pattern should now look like this…

    object Longitude extends (Double => Longitude) {
      def generateInvalidStateErrors(value: Double): List[String] =
        if (value < -180.0d)
          List(s"value of value [$value] must not be less than -180.0d")
        else
          if (value > 180.0d)
            List(s"value of value [$value] must not be greater than 180.0d")
          else
            Nil

      def apply(value: Double): Longitude =
        generateInvalidStateErrors(value) match {
          case Nil =>
            new Longitude(value)
          case invalidStateErrors =>
            throw new IllegalStateException(invalidStateErrors.mkString("|"))
        }
    }
    final case class Longitude private(value: Double) {
      private def readResolve(): Object =
        Longitude(value)

      def copy(value: Double = value): Longitude =
        Longitude(value)
    }

Scastie Snippet and IntelliJ Code Template Gist

Implementing this desirably affects Overview categories 2 (Elevated reasoning complexity), 4 (Future technical debt), and 5 (Security vulnerabilities).
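
A short usage sketch of the fully expressed pattern follows. The pattern is reproduced in compact form so the block stands alone; the wrapping FullPatternDemo object and the val names are illustrative only. It shows that copy now goes through the same validation as apply. It also surfaces one observation that is not part of the original pattern: Double.NaN fails neither comparison, so it is accepted as written, and an additional value.isNaN check would be one way to close that gap.

```scala
import scala.util.Try

object FullPatternDemo {
  object Longitude extends (Double => Longitude) {
    def generateInvalidStateErrors(value: Double): List[String] =
      if (value < -180.0d) List(s"value of value [$value] must not be less than -180.0d")
      else if (value > 180.0d) List(s"value of value [$value] must not be greater than 180.0d")
      else Nil

    def apply(value: Double): Longitude =
      generateInvalidStateErrors(value) match {
        case Nil => new Longitude(value)
        case invalidStateErrors =>
          throw new IllegalStateException(invalidStateErrors.mkString("|"))
      }
  }
  final case class Longitude private (value: Double) {
    private def readResolve(): Object = Longitude(value)
    def copy(value: Double = value): Longitude = Longitude(value)
  }

  val valid: Longitude = Longitude(10.0d)
  val moved: Longitude = valid.copy(value = -45.5d)

  // Both pathways reject out-of-range values with an IllegalStateException.
  val badCopy: Try[Longitude] = Try(valid.copy(value = 500.0d))
  val badApply: Try[Longitude] = Try(Longitude(-181.0d))

  // Writing `new Longitude(500.0d)` here would not compile: the constructor is private.

  // NaN compares false against both bounds, so no error is generated for it.
  val nanErrors: List[String] = Longitude.generateInvalidStateErrors(Double.NaN)
}
```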

Adding an Error-By-Value Constructor

The default strategy with case classes is to use “error by exception”. That is what require does. If the Boolean condition is false, it throws an exception wrapping the error string you provide.

From a proper FP design perspective, exceptions are considered a poor way to manage known error conditions, like a case class’s preconditions. Exceptions are acceptable for exceptional things like running out of memory or failing to open a database connection. However, they should be avoided when the error is just part of the method’s domain.

For example, it is an inappropriate use of exceptions for a square root method to throw when passed a negative number. The square root method should be defined to return either an error (String) if the input number is negative, or the actual result if the number is zero or positive.
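
A minimal sketch of such a square root method might look like the following. The name safeSqrt and the wording of the error message are assumptions made for illustration:

```scala
object SqrtDemo {
  // Hypothetical helper: the error is an ordinary return value, not an exception.
  def safeSqrt(x: Double): Either[String, Double] =
    if (x < 0.0d) Left(s"x [$x] must not be negative")
    else Right(math.sqrt(x))

  val ok: Either[String, Double] = safeSqrt(9.0d)
  val failed: Either[String, Double] = safeSqrt(-1.0d)
}
```

The caller is forced by the return type to acknowledge both outcomes, which is the property the applyE method below brings to case class construction.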

To add “error by value”, we will add an applyE method (where E is for Error) which uses an Either to cover both the correct and the erred input parameter cases. The method looks like this…

    def applyE(value: Double): Either[List[String], Longitude] =
      generateInvalidStateErrors(value) match {
        case Nil =>
          Right(new Longitude(value))
        case invalidStateErrors =>
          Left(invalidStateErrors)
      }

This looks remarkably similar to the apply method. In fact, it is so similar, it is essentially code duplication. So, to remove code duplication, we will reimplement the apply method to use the applyE method which now looks like this…

    def apply(value: Double): Longitude =
      applyE(value) match {
        case Right(longitude) =>
          longitude
        case Left(invalidStateErrors) =>
          throw new IllegalStateException(invalidStateErrors.mkString("|"))
      }

Scastie Snippet and IntelliJ Code Template Gist

Implementing this desirably affects Overview category 3 (Poor design for FP).
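
As a sketch of what client code might look like with applyE, the example below validates a batch of inputs without a single try/catch. The companion is reproduced in compact form (copy and readResolve are omitted for brevity), and the ErrorByValueDemo wrapper and val names are illustrative only:

```scala
object ErrorByValueDemo {
  object Longitude extends (Double => Longitude) {
    def generateInvalidStateErrors(value: Double): List[String] =
      if (value < -180.0d) List(s"value of value [$value] must not be less than -180.0d")
      else if (value > 180.0d) List(s"value of value [$value] must not be greater than 180.0d")
      else Nil

    def applyE(value: Double): Either[List[String], Longitude] =
      generateInvalidStateErrors(value) match {
        case Nil => Right(new Longitude(value))
        case invalidStateErrors => Left(invalidStateErrors)
      }

    def apply(value: Double): Longitude =
      applyE(value) match {
        case Right(longitude) => longitude
        case Left(invalidStateErrors) =>
          throw new IllegalStateException(invalidStateErrors.mkString("|"))
      }
  }
  final case class Longitude private (value: Double)

  val inputs: List[Double] = List(-200.0d, 12.5d, 181.0d)

  // Every input yields a value: either the errors or the instance.
  val results: List[Either[List[String], Longitude]] = inputs.map(Longitude.applyE)

  // Split the outcomes with ordinary collection operations.
  val accepted: List[Double] = results.collect { case Right(longitude) => longitude.value }
  val rejected: List[String] = results.collect { case Left(errors) => errors }.flatten
}
```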

Adding Memoization/Caching

With this new pattern in place, we have now ensured all precondition checking travels through a single method. The same is true of instantiation. Assuming immutability has been retained, adding a memoization (a.k.a. caching) strategy is now trivial.

Here’s an example of the companion object modified to incorporate memoization.

    object Longitude extends (Double => Longitude) {
      // Not thread-safe and unbounded: for illustration only.
      private var cachedInvalidStateErrors: Map[Double, List[String]] = Map.empty
      private var cachedInstances: Map[Double, Longitude] = Map.empty

      def generateInvalidStateErrors(value: Double): List[String] = {
        cachedInvalidStateErrors.get(value) match {
          case Some(invalidStateErrors) => invalidStateErrors
          case None =>
            val invalidStateErrors =
              if (value < -180.0d)
                List(s"value of value [$value] must not be less than -180.0d")
              else if (value > 180.0d)
                List(s"value of value [$value] must not be greater than 180.0d")
              else
                Nil
            val newItem = (value, invalidStateErrors)
            cachedInvalidStateErrors = cachedInvalidStateErrors + newItem
            invalidStateErrors
        }
      }

      …

      def applyE(value: Double): Either[List[String], Longitude] =
        generateInvalidStateErrors(value) match {
          case Nil =>
            Right(
              cachedInstances.get(value) match {
                case Some(longitude) => longitude
                case None =>
                  val longitude = new Longitude(value)
                  val newItem = (value, longitude)
                  cachedInstances = cachedInstances + newItem
                  longitude
              }
            )
          case invalidStateErrors =>
            Left(invalidStateErrors)
        }
    }

Scastie Snippet and IntelliJ Code Template Gist

The memoization strategy shown in the above code snippet is for EXAMPLE PURPOSES ONLY because it’s a terrible default strategy.

Please use one of the many other options available. And specifically, investigate ScalaCache. It is a great generalized caching library that allows choosing between different specialized backing implementations.
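
For readers who want something slightly safer to experiment with before adopting a caching library, here is a minimal sketch that swaps the var plus immutable Map for the standard library's scala.collection.concurrent.TrieMap. This is an assumption-laden illustration, not ScalaCache and not a recommended production strategy: the cache is still unbounded and never evicts. Only instances are cached here; the validation itself is cheap enough to rerun.

```scala
import scala.collection.concurrent.TrieMap

object MemoDemo {
  object Longitude extends (Double => Longitude) {
    // Concurrent map from the standard library; unbounded, so for illustration only.
    private val cachedInstances = TrieMap.empty[Double, Longitude]

    def generateInvalidStateErrors(value: Double): List[String] =
      if (value < -180.0d) List(s"value of value [$value] must not be less than -180.0d")
      else if (value > 180.0d) List(s"value of value [$value] must not be greater than 180.0d")
      else Nil

    def applyE(value: Double): Either[List[String], Longitude] =
      generateInvalidStateErrors(value) match {
        case Nil =>
          // Returns the cached instance, or stores and returns a new one.
          Right(cachedInstances.getOrElseUpdate(value, new Longitude(value)))
        case invalidStateErrors =>
          Left(invalidStateErrors)
      }

    def apply(value: Double): Longitude =
      applyE(value) match {
        case Right(longitude) => longitude
        case Left(invalidStateErrors) =>
          throw new IllegalStateException(invalidStateErrors.mkString("|"))
      }
  }
  final case class Longitude private (value: Double)

  val first: Either[List[String], Longitude] = Longitude.applyE(12.5d)
  val second: Either[List[String], Longitude] = Longitude.applyE(12.5d)

  // Reference equality: the second call reuses the instance created by the first.
  val sameInstance: Boolean = (first, second) match {
    case (Right(a), Right(b)) => a eq b
    case _ => false
  }
}
```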

Summary

Even if you find some of the above “boilerplate” undesirable, I hope you enjoyed and learned something about case classes such that it makes them more useful to you in your future Scala software engineering challenges.

Contact

jim.oflaherty.jr@gmail.com
