BalmungSan/Polymorphism.md

## Polymorphism.md

      
    Raw
  

              Polymorphism.md
            
          
    Polymorphism in Scala

This document aims to show and compare three alternatives
for achieving polymorphism in Scala.

Subtyping, common in object-oriented languages like Java.
Duck typing, common in dynamically typed languages like Python.
Typeclasses, common in functional languages like Haskell.

Additionally, when implementing the typeclass pattern in Scala,
we will make a simile with its implementation in Haskell.

Introduction

Let's start by defining the core motivation of this document,
Polymorphism:

"In programming languages and type theory,
polymorphism is the provision of a single interface
to entities of different types".
— Wikipedia

The main advantage of polymorphism is that it allow us
to write generic functions that can operate on different types,
leading to code reuse.
Another advantage of polymorphism is that by abstracting over
concrete types, and using generic interfaces instead,
we restrict ourselves to what we can do in our functions.
Thus limiting the number of possible implementations,
which results in cleaner code.
Perhaps, the simplest example of polymorphism is the identity function:
def id[T](t: T): T = t, which clearly shows the two advantages mentioned above.

A more complicated example would be:
Supposed an interface Printer,
which can create a text representation of a value.
Then, we could log any instance of any type T to a file,
as long as T implements the Printer interface.

Problem definition

To show how to use polymorphism in Scala using the three chosen techniques,
we decided to split the process of using polymorphic functions
in four steps:

The definition of the interface and the operations it provides.
The implementation of the interface for a particular type.
The use of the interface in a polymorphic function.
The call of the polymorphic function with an instance
of a particular type which implements the interface.

To allow an objective comparison,
the same interfaces, the same types and the same functions,
will be used on all three techniques.
Interfaces

We will define two interfaces, both intended for collection-like types (C[T]).
The first one provides a filter function, which takes a collection
and a predicate (T => Boolean) as arguments, and returns a new collection
with only the elements of the original collection to which the predicate holds.
The second one provides a filterOption function, which takes the same arguments of the
filter function but returns a collection of optional values (C[Option[T]]) where
the values that satisfy the predicate are preserved as a Some
and the ones which not are replaced with a None.

And an unNone function, which takes as argument a collection of optional values
and returns a new collection with all Nones removed and all Somes extracted.
Note: Both, filter & unNone could remove all elements from the collection.
As such, each C[T] must be able to represent an empty collection.
It is worth mentioning that the second interface "extends" the first one, in the sense that
the filter function can be implemented in terms of the filterOption & unNone functions.
Which implies that every instance of the second interface is also an instance of the first one.
Types

For implementing the two previous interfaces,
we will use a simple ADT for a binary tree.
enum BinaryTree[+T]:
  case Leaf(value: T)
  case Branch(left: BinaryTree[T], right: BinaryTree[T])
  case EmptyTree
Also, if possible, the List[T] type from the standard library
(List already defines a filter method).
Functions

We will use the two previous interfaces,
to create two polymorphic functions onlyOdds & onlyOddsOption,
which will take a collection of numbers (C[Int])
and will return a collection with only odds
& with odds wrapped in a Some and evens replaced by Nones respectively.
def onlyOdds(c: C[Int]): C[Int] =
  c.filter(x => (x % 2) != 0)

def onlyOddsOption(c: C[Int]): C[Option[Int]] =
  c.filterOption(x => (x % 2) != 0)

Subtyping

When using subtyping, the common way for achieving polymorphism is by
creating an abstract class / interface that defines the methods
that can be executed on instances of subclasses of itself.

Then, when defining new types, one must explicitly extend the interface
and override the methods this one provides.

Latter, when writing a polymorphic function, one uses the interface
as the type of its arguments and return.

Finally, when we call our polymorphic function,
with an instance of our new type.
The compiler will check if the value conforms to the expected type.
Which it does, since as a subclass of the interfce, it is also a subtype.
Interface definition

trait Filterable[+T]:
  def filter(predicate: T => Boolean): Filterable[T]

trait OptionalFilterable[+T] extends Filterable[T]:
  override def filter(predicate: T => Boolean): OptionalFilterable[T] =
    this.filterOption(predicate).unNone

  def filterOption(predicate: T => Boolean): OptionalFilterable[Option[T]]

  def unNone[U](using T <:< Option[U]): OptionalFilterable[U]
Note: The ev parameter of the unNone method is called a
generalized type constraint.
It is used to specify that the unNone method can only be called,
if the type parameter of this OptionalFilterable is an Option.
Interface implementation

enum BinaryTree[+T] extends OptionalFilterable[T]:
  case Leaf(value: T)
  case Branch(left: BinaryTree[T], right: BinaryTree[T])
  case EmptyTree

  // Override the default implementation of filter for performance improvements.
  override final def filter(predicate: T => Boolean): BinaryTree[T] =
    this match
      case EmptyTree =>
        EmptyTree

      case Leaf(value) =>
        if predicate(value) then
          Leaf(value)
        else
          EmptyTree

      case Branch(left, right) =>
        (left.filter(predicate), right.filter(predicate)) match
          case (EmptyTree, EmptyTree) =>
            EmptyTree

          case (Leaf(value), EmptyTree) =>
            Leaf(value)

          case (EmptyTree, Leaf(value)) =>
            Leaf(value)

          case (left, right) =>
            Branch(left, right)
  end filter

  override final def filterOption(predicate: T => Boolean): BinaryTree[Option[T]] =
    this match
      case EmptyTree =>
        EmptyTree

      case Leaf(value) =>
        if predicate(value) then
          Leaf(Some(value))
        else
          Leaf(None)

      case Branch(left, right) =>
        Branch(
          left  = left.filterOption(predicate),
          right = right.filterOption(predicate)
        )
  end filterOption

  override final def unNone[U](using T <:< Option[U]): BinaryTree[U] =
    this match
      case EmptyTree =>
        EmptyTree

      case Leaf(valueOption) =>
        // Use the implicit evidence to cast
        // valueOption from type T
        // to type Option[U]
        // for pattern-matching against it.
        (valueOption : Option[U]) match
          case None =>
            EmptyTree

          case Some(value) =>
            Leaf(value)

      case Branch(left, right) =>
        (left.unNone, right.unNone) match
          case (EmptyTree, EmptyTree) =>
            EmptyTree

          case (Leaf(value), EmptyTree) =>
            Leaf(value)

          case (EmptyTree, Leaf(value)) =>
            Leaf(value)

          case (left, right) =>
            Branch(left, right)
  end unNone
end BinaryTree
Interface usage - function definition

def onlyOdds(c: Filterable[Int]): Filterable[Int] =
  c.filter(x => (x % 2) != 0)

def onlyOddsOption(c: OptionalFilterable[Int]): OptionalFilterable[Option[Int]] =
  c.filterOption(x => (x % 2) != 0)
Interface usage - function calling

// BinaryTree.
val tree: BinaryTree[Int] =
  Branch(
    left  = Branch(
      left = Leaf(value = 10),
      right = Leaf(value = 15)
    ),
    right = Branch(
      left = Leaf(value = 5),
      right = EmptyTree
    )
  )
// tree: BinaryTree[Int] = Branch(Branch(Leaf(10),Leaf(15)),Branch(Leaf(5),EmptyTree)

onlyOdds(tree)
// res: Filterable[Int] = Branch(Leaf(15),Leaf(5))

onlyOddsOption(tree)
// res: OptionalFilterable[Option[Int]] = Branch(Branch(Leaf(None),Leaf(Some(15))),Branch(Leaf(Some(5)),EmptyTree))

// List.
onlyOdds(List(10, 15, 20, 25))
// error: type mismatch;
//  found   : List[Int]
//  required: Filterable[Int]

onlyOddsOption(List(10, 15, 20, 25))
// error: type mismatch;
//  found   : List[Int]
//  required: OptionalFilterable[Int]
As can be seen above, it works very well for our BinaryTree, but fails for List.
This is obviously because List does not extends our abstract class.
And since we can not modify its source code to do so, we can never use a List,
or any other external class, on a function using our interfaces.
Takeaways

Subtyping polymorphism works... but has a lot limitations/problems.


In Scala we can only extend from one class.
Thus, we cannot implement methods from more than one abstraction.
However, if instead of using an abstract class, we would have used
a trait, this limitation would have been overcome.


As can be seen in the function calling section above,
when calling the onlyOdds & onlyOddsOption functions,
we lose type information!
The types of the results changed from BinaryTree[Int] to
Filterable[Int] & OptionalFilterable[Int] respectively.
Such thing can be problematic because, given the functional nature of Scala,
it is encouraged to be immutable and it is expected that the interfaces' methods
will be implemented by returning modified copies of themselves, instead of mutating.
Thus, when we lose the information about the concrete type of our result,
we lose the ability to perform operations specific to them.
However, this can be fixed using f-bounded types,
by making our methods to return the appropriate subclass instead of the interface itself.


It is impossible to implement our interfaces for types that are outside of our control.
Thus, any polymorphic functions written using those interfaces can not be used with those types.


Duck typing

Duck typing is a technique common in dynamic typed languages, on which the duck test
("If it walks like a duck and it quacks like a duck, then it must be a duck")
is applied to determine the object's suitability for a polymorphic function.
In other words, the idea is to simple write a polymorphic function
that accepts any input, as long as it (somehow) implements
the methods required by the function.
Scala provides a more typesafe way to achieve duck typing,
called Structural Types.
In this form of polymorphism we don't define a formal interface,
but rather a set of methods that we would need.

Then, we don't need to make anything special when defining a new type,
since there is no interface to implement, we just code the methods
we believe make sense for our type.

Later, when writing polymorphic functions,
we specify that it will work for any type T,
as long as it provides the methods that are needed.

Finally, when we call the function the compiler will inspect
(using compile time reflection) the supplied value to check
if it provides the methods required by the function.
Thus, it is typesafe! However, the program uses runtime reflection
for calling those methods.
Interface definition

For doing so, we define a type alias to encode the methods we would require later on.
type IsFilterable[T, C[_]] = {
  def filter(predicate: T => Boolean): C[T]
}

type IsOptionalFilterable[T, C[_]] = IsFilterable[T, C] {
  def filterOption(predicate: T => Boolean): C[Option[T]]
}
Interface implementation

Here the implementation is very similar to the previous one.

However, in this case we do not extend any interface, thus
we do not override any method, we only define them.
Note: Since the implementation of the methods
(namely, the match statements)
is the same, we decided to omit it to save space.
enum BinaryTree[+T]:
  case Leaf(value: T)
  case Branch(left: BinaryTree[T], right: BinaryTree[T])
  case EmptyTree

  def filter(predicate: T => Boolean): BinaryTree[T] =
    // ...
  end filter

  def filterOption(predicate: T => Boolean): BinaryTree[Option[T]] =
    // ...
  end filterOption
end BinaryTree
Interface usage - function definition

import reflect.Selectable.reflectiveSelectable // Required to call methods from the structural types.

def onlyOdds[C[_]](c: IsFilterable[Int, C]): C[Int] =
  c.filter(x => (x % 2) != 0)

def onlyOddsOption[C[_]](c: IsOptionalFilterable[Int, C]): C[Option[Int]] =
  c.filterOption(x => (x % 2) != 0)
Note: that the onlyOddsOption function ask for more methods that the one it really needs.
When using duck typing the common is to only ask for what is needed.
Interface usage - function calling

// BinaryTree.
val tree: BinaryTree[Int] =
  Branch(
    left  = Branch(
      left = Leaf(value = 10),
      right = Leaf(value = 15)
    ),
    right = Branch(
      left = Leaf(value = 5),
      right = EmptyTree
    )
  )
// tree: BinaryTree[Int] = Branch(Branch(Leaf(10),Leaf(15)),Branch(Leaf(5),EmptyTree)

onlyOdds(tree)
// res: BinaryTree[Int] = Branch(Leaf(15),Leaf(5))

onlyOddsOption(tree)
// res: BinaryTree[Option[Int]] = Branch(Branch(Leaf(None),Leaf(Some(15))),Branch(Leaf(Some(5)),EmptyTree))

// List.
onlyOdds(List(10, 15, 20, 25))
// res: List[Int] = List(15, 25)

onlyOddsOption(List(10, 15, 20, 25))
// error: type mismatch;
//  found   : List[Int]
//  required: IsOptionalFilterable[Int, List]
As can be seen above, it worked for our BinaryTree
and in this case the onlyOdds function worked too for List,
since there is a filter method defined in it.
Takeaways


Structural Types


Pros
Cons


Structural Types provide a simple way to write polymorphic functions
that relay in a few methods which are common across many types.


However, they becomes inconvenient and troublesome
when you depend on more than a couple of methods.


The example above showed that using structural types,
we can have good return types.


However, it also showed that getting this can be a little bit complex.

But, the main problem with this is that it is not completely correct,
more precisely, there is no guarantee (at interface level) that
the C[_] returned is the same type of the type
on which the method was called.


Since in this case the interface only defines
the names and signatures of a set of methods that must exist for a given type.
It can be used with any external type, as long as such type defines those methods.


However, since they must exactly match the name and signature,
it becomes really restrictive.

For example, suppose there exits an external PairList type,
which also defines a filter.
But in this case. instead of taking a predicate of type T => Boolean,
it takes one of type (T1, T2) => Boolean,
given that we can not use it with our IsFilterable abstraction.

Another problem with Structural Types it is that,
there is no conscious process of implementing the interface.


The compiler guarantees that any value passed to the function
has all the required methods.
Thus, it is typesafe and will not fail in runtime.


However, it does perform reflection calls in runtime
to execute the methods on the passed value.
Thus, impacting performance and non-portable to other environments,
like JS or Native.


Typeclasses

Typeclasses is a technique common in functional languages,
where the idea is to split the definition of a type
from the implementation of an interfaces for such type.
For implementing the typeclass pattern in scala, one must:


Define an abstract interface.
Which, instead of defining methods to be executed on itself,
defines functions that take the value on which the
operation will be applied as another parameter.


Prove that a particular type is an instance of the typeclass.
That is, implement all abstract functions for such type.


Then, when writing a polymorphic function one says that it works for any type T,
as long as there is an instance of the typeclass for that type T.
Note: The Haskell examples where made following the
typeclasses chapter of:
"A gentle introduction to Haskell". Also we use the
intrinsic superclasses
package to provide automatic derivation of superclasses instances.
Interface definition

Scala

trait Filter[C[_]]:
  // Define filter as an extension method.
  extension [T](c: C[T])
    def filter(predicate: T => Boolean): C[T]

// We take advantage of sub-typing to specify
// that every instance of OptionalFilter
// is also an instance of Filter.
trait OptionalFilter[C[_]] extends Filter[C]:
  extension [T](c: C[T])
    // Provide a default implementation for filter.
    override def filter(predicate: T => Boolean): C[T] =
      c.filterOption(predicate).unNone

    def filterOption(predicate: T => Boolean): C[Option[T]]

 
  // We can explicitly state that unNone
  // only works for collections of optional values.
  extension [T](c: C[Option[T]])
    def unNone: C[T]
Haskell

{-# language TemplateHaskell #-}

module Filters where

-- Required to provided default superclasses instances.
import Language.Haskell.TH.Instances.Defaults

class Filter c where
  filter :: (c t) -> (t -> Bool) -> (c t)

class (Filter c) => MaybeFilter c where
  filterMaybe :: (c t) -> (t -> Bool) -> (c (Maybe t))

  unNothing :: (c (Maybe t)) -> (c t)

  -- Default implementation of the filter method.
  filterDefault :: (c t) -> (t -> Bool) -> (c t)
  filterDefault ct p = unNothing (filterMaybe ct p)

-- Annotation to enable automatic generation of a Filter instance given a MaybeFilter instance.
{-# ANN type MaybeFilter (Defaults 'Filters.filter 'filterDefault) #-}
Interface implementation

In this case we define our type,
and the implementation of the interface separately.

However, this time the type signature of the unNone function
explicitly states that it only works for collections of options.
Thus, there is no need to get and use the implicit evidence.
Scala

enum BinaryTree[+T]:
  case Leaf(value: T)
  case Branch(left: BinaryTree[T], right: BinaryTree[T])
  case EmptyTree

object BinaryTree:
  given OptionalFilter[BinaryTree] with
    extension [T](tree: BinaryTree[T])
      override final def filter(predicate: T => Boolean): BinaryTree[T] =
        tree match
          case EmptyTree =>
            EmptyTree

          case Leaf(value) =>
            if predicate(value) then
              Leaf(value)
            else
              EmptyTree

          case Branch(left, right) =>
            (left.filter(predicate), right.filter(predicate)) match
              case (EmptyTree, EmptyTree) =>
                EmptyTree

              case (Leaf(value), EmptyTree) =>
                Leaf(value)

              case (EmptyTree, Leaf(value)) =>
                Leaf(value)

              case (left, right) =>
                Branch(left, right)
      end filter  

      override final def filterOption(predicate: T => Boolean): BinaryTree[Option[T]] =
        tree match
          case EmptyTree =>
            EmptyTree

          case Leaf(value) =>
            if predicate(value) then
              Leaf(Some(value))
            else
              Leaf(None)

          case Branch(left, right) =>
            Branch(
              left  = left.filterOption(predicate),
              right = right.filterOption(predicate)
            )
      end filterOption

    extension [T](tree: BinaryTree[Option[T]])
      override final def unNone: BinaryTree[T] =
        tree match
          case EmptyTree =>
            EmptyTree

          case Leaf(valueOption) =>
            valueOption match
              case None =>
                EmptyTree

              case Some(value) =>
                Leaf(value)

          case Branch(left, right) =>
            (left.unNone, right.unNone) match
              case (EmptyTree, EmptyTree) =>
                EmptyTree

              case (Leaf(value), EmptyTree) =>
                Leaf(value)

              case (EmptyTree, Leaf(value)) =>
                Leaf(value)

              case (left, right) =>
                Branch(left, right)
      end unNone
end BinaryTree
Haskell

{-# language BlockArguments, TemplateHaskell, QuasiQuotes #-}

module BinaryTrees where

import Filters

-- Required to generate default superclasses instances.
import Language.Haskell.TH.Instances

data BinaryTree t =
  Leaf { value :: t }
  | Branch { left :: BinaryTree t, right :: BinaryTree t }
  | EmptyTree
  deriving (Show)

$(return []) -- Required to make the BinaryTree type constructor in scope below.

-- Use the instances Quasiquoter to generate a default instance
-- of the Filter typeclass for the BinaryTree type,
-- given we provide an instance of the MaybeFilter typeclass
-- and this last one provides a default implementation of the filter function.
[instances| MaybeFilter BinaryTree where
  -- filterMaybe
  filterMaybe EmptyTree predicate           = EmptyTree
  filterMaybe (Leaf value) predicate
    | predicate value                       = Leaf (Just value)
    | otherwise                             = Leaf Nothing
  filterMaybe (Branch left right) predicate =
    let
      newLeft  = filterMaybe left predicate
      newRight = filterMaybe right predicate
    in Branch newLeft newRight

  -- unNothing
  unNothing EmptyTree           = EmptyTree
  unNothing (Leaf Nothing)      = EmptyTree
  unNothing (Leaf (Just value)) = Leaf value
  unNothing (Branch left right) =
    case
      (unNothing left, unNothing right)
    of
      (EmptyTree , EmptyTree)  -> EmptyTree
      (Leaf value, EmptyTree)  -> Leaf value
      (EmptyTree , Leaf value) -> Leaf value
      (left      , right)      -> Branch left right
|]
Interface usage - function definition

Scala

def onlyOdds[C[_]](c: C[Int])(using Filter[C]): C[Int] =
  c.filter(x => (x % 2) != 0)

def onlyOddsOption[C[_]](c: C[Int])(using OptionalFilter[C]): C[Option[Int]] =
  c.filterOption(x => (x % 2) != 0)
Haskell

import Filters

onlyOdds :: (Filter c, Integral t) => (c t) -> (c t)
onlyOdds ct = ct `Filters.filter` odd

onlyOddsMaybe :: (MaybeFilter c, Integral t) => (c t) -> (c (Maybe t))
onlyOddsMaybe ct = ct `Filters.filterMaybe` odd
Interface usage - function calling

For calling the function with an instance of any type,
we just need to supply, in the implicit scope,
an instance of our typeclass for our given type.
Given the rules, there are four places
on which such givens may be defined.

In the companion object of the typeclass.
In the companion object of the type.
In the scope where the function is being called.
Any other place, but has to be explicitly imported
in the scope where the function in being called.

Scala

// BinaryTree.
val tree: BinaryTree[Int] =
  Branch(
    left  = Branch(
      left = Leaf(value = 10),
      right = Leaf(value = 15)
    ),
    right = Branch(
      left = Leaf(value = 5),
      right = EmptyTree
    )
  )
// tree: BinaryTree[Int] = Branch(Branch(Leaf(10),Leaf(15)),Branch(Leaf(5),EmptyTree)

onlyOdds(tree)
// res: BinaryTree[Int] = Branch(Leaf(15),Leaf(5))

onlyOddsOption(tree)
// res: BinaryTree[Option[Int]] = Branch(Branch(Leaf(None),Leaf(Some(15))),Branch(Leaf(Some(5)),EmptyTree))

// List.
given OptionalFilter[List] with
  extension [T](list: List[T])
    // Override the default implementation of filter for performance improvements.
    override final def filter(predicate: T => Boolean): List[T] =
      list.filter(predicate) // Reuse the List's filter method.

    override final def filterOption(predicate: T => Boolean): List[Option[T]] =
      list.map(t => if (predicate(t)) Some(t) else None)

  extension [T](list: List[Option[T]])
    override final def unNone: List[T] =
      list.flatten // Flatten removes Nones and extracts the values from each Some.
end given

onlyOdds(List(10, 15, 20, 25))
// res: List[Int] = List(15, 25)

onlyOddsOption(List(10, 15, 20, 25))
// res: List[Option[Int]] List(None, Some(15), None, Some(25))
Haskell

import BinaryTrees

onlyOdds (Branch (Branch (Leaf 10) (Leaf 15)) (Branch (Leaf 5) (EmptyTree)))
-- Branch {left = Leaf {value = 15}, right = Leaf {value = 5}}

onlyOddsMaybe (Branch (Branch (Leaf 10) (Leaf 15)) (Branch (Leaf 5) (EmptyTree)))
-- Branch {
--   left = Branch {left = Leaf {value = Nothing}, right = Leaf {value = Just 15}},
--   right = Branch {left = Leaf {value = Just 5}, right = EmptyTree}
-- }
Takeaways


They provide a good separation of concerns,
which makes them flexible enough to make
the implementation of the interface for any type
(including externals) possible.


They are completely typesafe and all the "magic"
occurs on compile time, which makes them safe and faster.


The combination of the typeclass pattern,
together with the rich class pattern,
leads to very concise code.


The only downside we found with Typeclasses is that,
it is arguably the most complex solution.
And, that they require quite a lot of boilerplate code.


Final thoughts

Typeclasses seems like the best option
for modeling very general interfaces,
which can be implemented by many types,
given they flexibility and typesafety.
Subtyping is good for modeling behaviors which are common
to a concrete family of types. E.g. All Pets have a name.
Structural Types are a great demonstration of what the
Scala's type system is capable of, but does not seems
like a good tool for modeling interfaces.


The code in this document was compiled using:

Scala 3.3.3
OpenJDK 64-Bit Server GraalVM CE 21.0.2
Glasgow Haskell Compiler (GHC) 8.6.3, with the "intrinsic-superclasses"
package 0.4.0.0
Ubuntu Linux 22.04 (jammy)
Structural Types
Pros	Cons
Structural Types provide a simple way to write polymorphic functions that relay in a few `methods` which are common across many types.	However, they becomes inconvenient and troublesome when you depend on more than a couple of `methods`.
The example above showed that using structural types, we can have good return types.	However, it also showed that getting this can be a little bit complex. But, the main problem with this is that it is not completely correct, more precisely, there is no guarantee (at `interface` level) that the `C[_]` returned is the same type of the type on which the `method` was called.
Since in this case the `interface` only defines the names and signatures of a set of `methods` that must exist for a given type. It can be used with any external type, as long as such type defines those `methods`.	However, since they must exactly match the name and signature, it becomes really restrictive. For example, suppose there exits an external `PairList` type, which also defines a `filter`. But in this case. instead of taking a predicate of type `T => Boolean`, it takes one of type `(T1, T2) => Boolean`, given that we can not use it with our `IsFilterable` abstraction. Another problem with Structural Types it is that, there is no conscious process of implementing the `interface`.
The compiler guarantees that any value passed to the function has all the required `methods`. Thus, it is typesafe and will not fail in runtime.	However, it does perform reflection calls in runtime to execute the `methods` on the passed value. Thus, impacting performance and non-portable to other environments, like JS or Native.