@pjrt

pjrt/blog.md Secret

Created Feb 19, 2021
Auto derivation of instances is bad, stop implementing it in your libraries

Before we start, we should specify what we mean by auto-derivation. Auto-derivation is often confused with plain old derivation, so here are some examples.

For these examples, assume that there is a typeclass defined as:

https://gist.github.com/081674c66cf595ff74b8dcb24f664a3f

This is plain old manual instance definition. There is no derivation of any kind since you have to manually implement f.

https://gist.github.com/0af73583d3a5d02adfcc1b4d58bddfca

This is derivation. It is NOT automatic: notice that you have to call deriveInstance, but you don't have to specify how the instance is derived. Some macro system is likely used underneath.

https://gist.github.com/76bf3a888c0651cbf83f76dad5d0b430

And finally, we have automatic derivation:

https://gist.github.com/9c31989b2563f0b8d0befe751d2d5b09

Notice that for automatic derivation, we never defined the instance, and yet when we call parse, an instance is found. This is actually very bad and we will explain why below.

How it works

To understand why it is bad, we need to understand how it works. If an instance is required (like when we call JsonParser.parse[User]), then the Scala compiler will look for implicit instances in the following order:

  • Look at the local scope (locally defined instances, like within a def or class)
  • Look at the imports (instances imported, like import cats.implicits._)
  • Look at the companion object of the type being requested and the companion object of the typeclass being requested (like the implicit vals we defined in the companion object of User, or instances defined in the typeclass's own companion object)

This list is simplified, as there are other special cases around objects extending traits, but that's outside the scope of this post. Just know that a local instance will override (with no warning) an imported one, and an imported one will override one in a companion object. Therein lies the issue.
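The companion-object step of that lookup can be sketched as follows. This is a minimal, standalone illustration, assuming Scala 2.13 or later: the JsonParser trait here is a hypothetical stand-in (its method name and String input are assumptions, not the definition from the linked gist), and LookupDemo is an illustrative name.

```scala
// Hypothetical stand-in for the typeclass; NOT the definition from the gist.
trait JsonParser[A] {
  def parse(raw: String): Either[String, A]
}

object JsonParser {
  // Summoning helper: compiles only if an implicit JsonParser[A] can be found.
  def parse[A](raw: String)(implicit parser: JsonParser[A]): Either[String, A] =
    parser.parse(raw)

  // Found through the companion object of the typeclass.
  implicit val intParser: JsonParser[Int] = new JsonParser[Int] {
    def parse(raw: String): Either[String, Int] =
      raw.trim.toIntOption.toRight(s"not an int: $raw")
  }
}

final case class User(name: String)

object User {
  // Found through the companion object of the requested type.
  implicit val userParser: JsonParser[User] = new JsonParser[User] {
    def parse(raw: String): Either[String, User] =
      if (raw.nonEmpty) Right(User(raw)) else Left("empty name")
  }
}

object LookupDemo {
  // No local instances and no imports: both calls resolve via companions.
  val n: Either[String, Int]  = JsonParser.parse[Int]("42")
  val u: Either[String, User] = JsonParser.parse[User]("alice")
}
```

Neither call site mentions an instance by name; the compiler finds both purely from the types involved.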

How automatic derivation works

Automatic derivation libraries normally have a macro or some other system to automatically generate instances for types. There are two kinds of systems: automatic via the typeclass's companion object and automatic via an import. Both of these systems are problematic.

Automatic derivation via companion object of the typeclass

This system simply places the automatic derivation in the typeclass's companion object, like so:

https://gist.github.com/2bbd0803adb01e231842c69970aed291

The main problem here is that instances defined in User's companion object can conflict with the automatic one (with a compile error), because both live at the same level of the lookup order. This essentially forces us to manually import our instances (since imports take precedence) whenever we want to manually define the instance of a type.

Automatic derivation via an import

Another popular system works via imports. For this, the user is required to import an auto package like so:

https://gist.github.com/725a492a105abcace78a3eef52ca1c6e

This system is both better and worse than the previous one. It is better because you can just not import it, but worse because it can cause inconsistent behaviour. Remember that imports take precedence over companion objects. This means that if you create an instance in the companion object of User, JsonParser.parse[User] will use it. But if you purposely or accidentally import auto somewhere in your code, the behaviour of JsonParser.parse[User] will change! This can lead to JsonParser.parse[User] doing different things in different areas of the code. And since imports take precedence, there will be no warning or error.
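A minimal, standalone sketch of that call-site dependence is shown here. Everything in it is illustrative: the JsonParser trait is a hypothetical stand-in for the typeclass, and the auto object only imitates a library's auto package by hard-coding an instance for User where a real library would generate one with a macro.

```scala
// Hypothetical stand-in for the typeclass; NOT the definition from the gist.
trait JsonParser[A] {
  def parse(raw: String): Either[String, A]
}

object JsonParser {
  def parse[A](raw: String)(implicit parser: JsonParser[A]): Either[String, A] =
    parser.parse(raw)
}

final case class User(name: String)

object User {
  // Hand-written instance carrying a business rule: names must be non-empty.
  implicit val validated: JsonParser[User] = new JsonParser[User] {
    def parse(raw: String): Either[String, User] =
      if (raw.trim.nonEmpty) Right(User(raw.trim)) else Left("empty name")
  }
}

// Imitation of a library's `auto` package (assumption: a real one uses a macro).
object auto {
  implicit val derivedUser: JsonParser[User] = new JsonParser[User] {
    def parse(raw: String): Either[String, User] = Right(User(raw)) // no validation
  }
}

object CallSiteA {
  // No import: the companion instance is used, so validation runs.
  def run(raw: String): Either[String, User] = JsonParser.parse[User](raw)
}

object CallSiteB {
  import auto._
  // Same expression, but the imported instance wins without any warning.
  def run(raw: String): Either[String, User] = JsonParser.parse[User](raw)
}
```

With an empty input, CallSiteA.run rejects the value while CallSiteB.run accepts it, even though both bodies are textually identical.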

Luckily, a lot of libraries that use this system also export a semi package, which is the same thing but without the implicit (allowing you to use the derivation on demand).

What should we use?

Use manual derivation, always. If there is no option for manual derivation, then let the library's authors know about this issue (and maybe make a PR of your own!). Automatic derivation seems great at first, even magical, but eventually the business will require something that cannot be derived and must be manually defined. For example, an email address which you want to verify during parsing.

Assume you have a JSON object {"email": "hello@example.com"}. Your business requires that you separate the hostname (example.com) from the user (hello), and you decide to do that during parsing.

https://gist.github.com/b8136d96f68f31f98997ed3528ccfa7d

With "Automatic derivation via companion object of the typeclass", this couldn't even be done: we would get a compile error. There are tricks we can do with objects and extensions of traits, but those tricks aren't reliable. With "Automatic derivation via an import", you would get inconsistent behaviour. Anywhere auto._ isn't imported will work as expected, but if auto._ is imported (purposely or accidentally), the behaviour will change from callsite to callsite.
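For reference, a hand-written instance for the email case might look like the sketch below. It is standalone and makes several assumptions: the JsonParser trait is a hypothetical stand-in for the typeclass, the Email case class and its field names are illustrative, and the input is taken to be the already-extracted string value of the email field rather than a full JSON document.

```scala
// Hypothetical stand-in for the typeclass; NOT the definition from the gist.
trait JsonParser[A] {
  def parse(raw: String): Either[String, A]
}

object JsonParser {
  def parse[A](raw: String)(implicit parser: JsonParser[A]): Either[String, A] =
    parser.parse(raw)
}

// Illustrative model: the address split into its two parts.
final case class Email(user: String, host: String)

object Email {
  // Manual instance: validation and splitting happen during parsing.
  implicit val emailParser: JsonParser[Email] = new JsonParser[Email] {
    def parse(raw: String): Either[String, Email] =
      raw.split("@", -1) match {
        // Exactly one '@' with non-empty text on both sides.
        case Array(user, host) if user.nonEmpty && host.nonEmpty =>
          Right(Email(user, host))
        case _ =>
          Left(s"invalid email address: $raw")
      }
  }
}
```

No derivation mechanism could produce this instance, since the split and the validation rule are business decisions rather than something implied by the shape of the type.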

But writing manual derivation is tedious

Well, you are in luck: there are libraries that will ease the tediousness. In the DSS AdTech team, we've been using scalaz-deriving (https://github.com/scalaz/scalaz-deriving). That library takes a configuration file which maps typeclasses to their derivation functions, allowing you to write something similar to what you would write in Haskell (where all of this originated):

https://gist.github.com/b41b3d86110a633bd5ed99500cb7e15b
