@enzief
Last active April 13, 2020 20:18

Scala effect systems

Table of Contents

  1. Programming with purity
  2. Scala's standard Future
  3. Superior alternatives
    1. cats-effect IO
    2. Monix Task
    3. Scalaz 8 Zio
    4. Features comparison
    5. Some benchmarks
  4. Interoperability
    1. With other principled effect types
    2. With Future
  5. Conclusion

1. Programming with purity

In programming, a pure expression computes a value and does nothing else: it performs no side effects. Pure expressions give rise to Referential Transparency and Equational Reasoning, two important properties that we rely on to understand and manipulate programs.

  • If we assign the value of an expression to a name, Referential Transparency allows us to substitute the name with the original expression (and vice versa) at the call site without any consequence for correctness. Many operations from Java, or from Scala's own standard library, are not referentially transparent, including UUID.randomUUID(), object field setters, array updates, println, and mutable collections.

  • Referential Transparency enables Equational Reasoning, whereby we reason about program logic using simple substitutions between names and the expressions assigned to them. This makes it easier to understand what programs do, and allows them to be refactored or simplified safely, without worrying that their behavior will change.
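
The substitution rule can be checked directly. The following is a minimal sketch using only the standard library; the names SubstitutionDemo, square, and next are made up for illustration. Replacing a name with its defining expression leaves the pure result unchanged, but changes the impure one.

```scala
object SubstitutionDemo {
  // Pure: `square(3)` can replace `a` anywhere without changing the result.
  def square(n: Int): Int = n * n

  val a: Int = square(3)
  val viaName: Int = a + a
  val viaExpression: Int = square(3) + square(3)

  // Impure: every call mutates `counter`, so substitution changes the result.
  private var counter: Int = 0
  def next(): Int = { counter += 1; counter }

  val b: Int = next()                            // first call
  val impureViaName: Int = b + b                 // reuses the first call's value
  val impureViaExpression: Int = next() + next() // two further calls
}
```

Here viaName and viaExpression are equal, while impureViaName and impureViaExpression are not, which is exactly what makes next() unsafe to refactor by substitution.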

2. Scala's standard Future

Future causes problems through its eager evaluation, its memoization, and its lack of Referential Transparency, all of which make programs difficult to reason about and errors difficult to diagnose.

With Future, the first of the big difficulties is a lack of Referential Transparency. For example, consider the following two programs, which behave differently:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// (1)
val f: Future[Int] = Future { 1 }
val g: Future[Int] = Future { 2 }
for {
  i <- f
  j <- g
} yield i + j

// (2)
for {
  i <- Future { 1 }
  j <- Future { 2 }
} yield i + j

In program (1), both Futures start as soon as they are defined, so they may run in parallel on different threads. In program (2), by contrast, the Futures run sequentially: the second is not created until the first has completed.

Secondly, Future is eagerly evaluated, meaning that as soon as the Future object is instantiated, it starts running on a different thread. This behavior is itself a side effect, further destroying Referential Transparency and Equational Reasoning. In addition, Future memoizes its result, so the effect captured by a Future instance cannot be run again.

var i: Int = 0
val icr: Future[Int] = Future { i += 1; i }

print(i) // prints 0 or 1, who knows ¯\_(ツ)_/¯

for {
  a <- icr
  _ =  i = 2
  b <- icr // (3)
} yield b

In the program above, once icr is defined we have to keep in mind that something is running in parallel with the main program flow, and that it may or may not change the outcome of later instructions. At this point the main program becomes impossible to reason about without actually running it.

Separately, we might write (3) expecting icr to run again, increment i from 2 to 3, and make the for-comprehension yield 3. That expectation is wrong: icr memoizes its result and never runs again, so (3) simply returns 1.
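
Memoization can be observed in isolation. The sketch below uses only the standard library (the names FutureMemoDemo, once, and each are illustrative) and awaits each Future before moving on, so the outcome does not depend on thread timing.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureMemoDemo {
  def run(): (Int, Int, Int, Int) = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val runs = new AtomicInteger(0)

    // A `val` Future starts immediately and caches its result.
    val once: Future[Int] = Future { runs.incrementAndGet() }
    val first  = Await.result(once, 5.seconds)
    val second = Await.result(once, 5.seconds) // same cached value, body not re-run

    // A `def` builds (and starts) a fresh Future on every call.
    def each(): Future[Int] = Future { runs.incrementAndGet() }
    val third  = Await.result(each(), 5.seconds)
    val fourth = Await.result(each(), 5.seconds)

    (first, second, third, fourth)
  }
}
```

Awaiting once twice yields the same value both times, whereas each() runs the body again on every call. Swapping a val for a def therefore changes program behavior, which is another symptom of the missing Referential Transparency.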

Thirdly, multiple benchmarks have tested both deeply nested and chained flatMaps, and Future comes in last place every single time. This is partly because Future relies on context switching for most of its operations: map, flatMap, and onComplete each cause a context switch (see here). In a context switch, the CPU moves from one thread to another: the current job is suspended, and its state in the CPU registers (intermediate computation results, call stack, etc.) is replaced with the new job's. Such a switch is very costly, not only because of the switching work itself but also because of the CPU cycles spent on no real task. To get around that, Lightbend's Akka team implemented FastFuture, a trampolining Future with its own execution context.

Additionally, using Future requires introducing an ExecutionContext, and the semantics change depending on which one is used. It also pollutes the API and, if one is not careful, forces one to pass it around at strange points in the code.

3. Superior alternatives

There are many alternatives to Future in the current Scala ecosystem. Here we consider only the three best known: cats-effect IO, Monix Task, and Scalaz 8 IO (named Zio from here on), each backed by its own strong community and ecosystem. Scalaz 7's Task is not considered because of its well-known performance problems and incorrect thread switching.

cats-effect IO, Monix Task, and Scalaz 8 Zio are monads, meaning that they can be sequenced and composed lawfully. They are lazily evaluated, so they can be passed around, transformed, and reused with ease. Although they all support lifting asynchronous computations, their default behaviour is to execute synchronously, which avoids unnecessary context switching; when asynchrony is needed, an explicit asynchronous call is made. This brings transformation-related performance benefits, as well as semantics that are easy to understand: because the program can be reasoned about equationally, there are fewer unknowable unknowns.

Monix Task offers flexible execution strategies through its Scheduler, which is explained further in its own section. cats-effect IO and Zio, on the other hand, use a Fiber-based concurrency model to abstract away from JVM/OS threads. IO and Zio typically run on top of Fibers: lightweight logical "threads" of execution with two main capabilities, fork and join. Unlike JVM threads, Fibers are implemented at the user-land library layer, so they are very cheap to create and transform. Fibers are also mere immutable data structures, so concurrency features such as interruptibility or cancelability come at zero cost. The difficult and heavy work of interpreting these data structures is left to the library maintainers. Resources are guaranteed to be handled safely, without leaking, when the Fiber terminates (whether interrupted or completed).
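
To make the contrast with Future concrete, here is a toy lazy effect type written with the standard library only. It is an assumption-laden sketch: Lazy, delay, and unsafeRun are hypothetical names rather than the API of cats-effect, Monix, or Zio, and the implementation is not stack-safe or asynchronous. It only shows how a description of an effect can be built first and run later, and run more than once.

```scala
object LazyEffectDemo {
  // Toy effect: a description of a computation. Nothing runs until `unsafeRun()`.
  final class Lazy[A](val unsafeRun: () => A) {
    def map[B](f: A => B): Lazy[B] =
      new Lazy(() => f(unsafeRun()))
    def flatMap[B](f: A => Lazy[B]): Lazy[B] =
      new Lazy(() => f(unsafeRun()).unsafeRun())
  }
  object Lazy {
    // Suspends a side effect instead of performing it.
    def delay[A](a: => A): Lazy[A] = new Lazy(() => a)
  }

  def run(): (Int, Int, Int) = {
    var i = 0
    val icr: Lazy[Int] = Lazy.delay { i += 1; i }
    val before = i // defining `icr` has not touched `i`

    // `icr` is referenced twice, so the increment is performed twice when run.
    val program: Lazy[Int] = for { a <- icr; b <- icr } yield a + b
    val result = program.unsafeRun()
    (before, result, i)
  }
}
```

Unlike the Future-based icr shown earlier, defining this icr changes nothing, and each reference inside the for-comprehension re-runs the increment once the program is executed.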

3.1 cats-effect IO

Website

cats-effect is intended to be the foundation of the FP Scala effect ecosystem. It is deeply integrated with the Typelevel ecosystem (cats-core, fs2, http4s, etc.). As a part of cats-effect, IO is intended to be a Scala counterpart of Haskell's IO and a default effect type for users to fall back on.

Compared to other FP effect types (Monix Task, Zio), cats-effect IO is, by original design, rather simple and minimalistic. An IO-based user program is a pure program that builds an immutable IO value; that value describes the side-effectful interactions with the outside world that make the program useful.

cats-effect IO has a built-in error handling mechanism. An IO[A], when run, eventually results in either a single value of type A or a Throwable error. If the program needs a custom error type E, IO has to be combined with an error-handling monad, say Either[E, ?], so that the program computes an IO[Either[E, A]].

3.2 Monix Task

Website

Monix is not a pure functional programming (FP) library. Unlike the IOs, Monix assumes that users keep control of program execution after "the end of the world" (the point where users actually run the effectful program contained in an IO or Task). Therefore, in addition to a functional API for FP usage, Monix offers a reactive API with abstractions like Observable, Observer, and Subscriber.

Monix predates cats-effect. Monix Task's execution model requires a Scheduler, which can be built from an ExecutionContext. Through the Scheduler, users can choose among the provided execution models (batched, always asynchronous, and synchronous) or create their own.

Monix Task comes with various pure and impure operations to lift to and from imperative code. This design decision results in flexible support for both programming paradigms, imperative and purely functional. Monix is more pragmatic in that sense, but less principled than cats-effect and Zio. It provides constructs that can prove useful in conjunction with impure libraries and frameworks, at the cost of sometimes violating referential transparency, which the user should be aware of.

3.3 Scalaz 8 Zio

Website

Zio is a completely new effect type in Scalaz 8's effect system and has no relation to Scalaz 7's Task. It is under very active development.

Zio has flexible error handling baked into its signature, Zio[E, A], in which E is the error type that the running job can fail with. This flexibility allows Zio users to specify an infallible job as Zio[Nothing, A], or an always-failing job as Zio[E, Nothing], giving a more descriptive type signature. Since the other execution monads (including Future, cats-effect IO, and Monix Task) fix their error type to Throwable, they need monad transformers such as EitherT to achieve the same flexibility. Compared to cats-effect's equivalent EitherT[IO, E, A], Zio's Zio[E, A] requires less boilerplate and saves one box/unbox pair for every transformation.

The drawback of Zio is that it is still below version 1.0 and does not yet have its own ecosystem. Compared to its rivals, Zio also has a more modest collection of documentation.
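
The idea of a typed error channel can be sketched without any library. The toy BIO below is a hypothetical stand-in, not Zio's implementation or API: running it yields an Either[E, A], and the names BIO, succeed, fail, ParseError, and NotANumber are invented for this example. Nothing as the error type marks a job that cannot fail.

```scala
import scala.util.Try

object TypedErrorDemo {
  sealed trait ParseError
  final case class NotANumber(input: String) extends ParseError

  // Toy effect with a typed error channel (a sketch, not Zio itself).
  final class BIO[+E, +A](val unsafeRun: () => Either[E, A]) {
    def map[B](f: A => B): BIO[E, B] =
      new BIO[E, B](() => unsafeRun().map(f))
    def flatMap[E1 >: E, B](f: A => BIO[E1, B]): BIO[E1, B] =
      new BIO[E1, B](() => unsafeRun() match {
        case Right(a) => f(a).unsafeRun()
        case Left(e)  => Left(e) // short-circuit on the first failure
      })
  }
  object BIO {
    def succeed[A](a: => A): BIO[Nothing, A] = new BIO[Nothing, A](() => Right(a))
    def fail[E](e: => E): BIO[E, Nothing]    = new BIO[E, Nothing](() => Left(e))
  }

  // The signature states exactly how this job can fail.
  def parse(s: String): BIO[ParseError, Int] =
    Try(s.toInt).toOption match {
      case Some(n) => BIO.succeed(n)
      case None    => BIO.fail(NotANumber(s))
    }

  // An infallible job: the error type is Nothing.
  val one: BIO[Nothing, Int] = BIO.succeed(1)

  def sum(a: String, b: String): Either[ParseError, Int] =
    (for { x <- parse(a); y <- parse(b); z <- one } yield x + y + z).unsafeRun()
}
```

Because the error type is covariant, the infallible one composes with the fallible parse without any lifting, and the combined program's type still records ParseError as the only possible failure.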

3.4 Features comparison

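|                                 | IO       | Task     | Zio      | Future  |
|---------------------------------|----------|----------|----------|---------|
| Referentially transparent       | Yes      | Depends  | Yes      | No      |
| Transform without ctx switching | Yes      | Yes      | Yes      | No      |
| Effect suspension               | Yes      | Yes      | Yes      | No      |
| Cancelability                   | Yes      | Yes      | Yes      | No      |
| Uninterruptibility              | No       | No       | Explicit | Default |
| Race                            | Yes      | Yes      | Yes      | No      |
| Control of execution model      | No       | Yes      | No       | No      |
| Scheduling                      | Yes      | Yes      | Yes      | No      |
| Parallelization                 | Explicit | Explicit | Explicit | Default |
| Resource safety                 | Yes      | Yes      | Yes      | No      |
| Repeat / retry                  | No       | Yes      | Yes      | No      |
| Supervision*                    | No       | No       | Yes      | No      |
| Memoization                     | No       | Explicit | No       | Default |
| Observability**                 | No       | Yes      | No       | No      |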

* Supervision: on termination of the job, interrupt all jobs forked by it. Implemented in Zio

** Observability: support for multiple event consumers. Task does not offer this out of the box, but a Task can be converted into Monix's built-in Observable

3.5 Some benchmarks

https://github.com/fosskers/scalaz-and-cats#benchmarks

https://alexn.org/blog/2016/08/25/monix-task-performance.html

https://twitter.com/jdegoes/status/924992350849552384/photo/1

4. Interoperability

4.1. With other principled effect types

In FairPlay, we have decided to use cats-effect IO. Even so, interoperability is a non-issue for FairPlay, since our code is parameterized on the execution abstraction and built around cats-effect's Effect abstraction, which requires the chosen effect type to be lawful and capable. An Effect type is a monadic type that can suspend a side effect into a context commonly named F. The resulting F can later be evaluated to run the side effect it contains in a lazy, potentially asynchronous, fashion. cats-effect IO, Monix Task, and Zio are all valid Effect types, and they are interchangeable.
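
What "parameterized on the execution abstraction" means can be sketched with a stripped-down, hypothetical type class. Suspend, greet, and Thunk below are illustrative names only; cats-effect's real type classes are far richer and come with laws. The business logic is written once against F, and a concrete carrier is chosen only at the edge of the program.

```scala
object ParametricEffectDemo {
  // Hypothetical minimal capability: suspend a side effect and sequence two effects.
  trait Suspend[F[_]] {
    def delay[A](a: => A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  // Business logic knows only the capability, not the concrete effect type.
  def greet[F[_]](name: String, log: String => Unit)(implicit F: Suspend[F]): F[Int] =
    F.flatMap(F.delay(log(s"hello, $name")))(_ => F.delay(name.length))

  // One possible carrier: a plain thunk. Any other lawful carrier could be swapped in.
  type Thunk[A] = () => A
  implicit val thunkSuspend: Suspend[Thunk] = new Suspend[Thunk] {
    def delay[A](a: => A): Thunk[A] = () => a
    def flatMap[A, B](fa: Thunk[A])(f: A => Thunk[B]): Thunk[B] = () => f(fa())()
  }

  def run(): (Int, Int, List[String]) = {
    val logged = scala.collection.mutable.ListBuffer.empty[String]
    val program: Thunk[Int] = greet[Thunk]("Ada", msg => { logged += msg; () })
    val before = logged.size // building the program logs nothing
    val length = program()   // running it performs the suspended effect
    (before, length, logged.toList)
  }
}
```

Switching to a different effect type would mean providing another Suspend instance and changing the type argument at the call site, while greet itself stays untouched.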

4.2. With Future

Since mDRM services are currently built on top of the Play framework, for which Lightbend dictates the use of its Future, the chosen effect type should interoperate with Future.

Fortunately, the effect types considered all have built-in support for converting to and from Future. Furthermore, we have already implemented a fromTheFuture function. It incurs the one-time penalty of a context switch from Future into an F (IO, Task, or Zio, whichever is chosen), so that we do not incur many more when transforming the received data. The async calls are isolated to a specific portion of the program, limiting context switching at every juncture possible. This has allowed us to push all the side-effectful portions of the program to very specific, well-managed, isolated places, and we have seen performance metrics that prove this tactic is beneficial. Moreover, the original Future comes from Play's HTTP client, which uses Future as an abstraction to describe asynchronous HTTP calls. Thus, the context switch from Future to F can be avoided altogether by using another library (such as sttp or http4s-client) that leaves the choice of effect type to the user.
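
The shape of such a conversion can be sketched as follows. This is not the FairPlay implementation, which is not shown in this document; Async is a toy callback-based effect invented for the example, and only the name fromTheFuture is borrowed from the text. The key assumption illustrated is that the Future is taken by name, so it is not created (and therefore not started) until the resulting effect is run.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

object FromFutureDemo {
  // Toy async effect: when run, it starts the work and reports through a callback.
  final class Async[A](val unsafeRunAsync: (Try[A] => Unit) => Unit)

  // Hypothetical sketch: `fa` is by-name, so each run creates a fresh Future.
  def fromTheFuture[A](fa: => Future[A])(implicit ec: ExecutionContext): Async[A] =
    new Async[A](cb => fa.onComplete(cb))

  def run(): (Int, Int, Int) = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val started = new AtomicInteger(0)
    val effect: Async[Int] = fromTheFuture(Future { started.incrementAndGet() })
    val before = started.get() // the Future has not been created yet

    // Helper for the demo only: run the effect and block until its callback fires.
    def await(): Int = {
      val p = Promise[Int]()
      effect.unsafeRunAsync(t => { p.complete(t); () })
      Await.result(p.future, 5.seconds)
    }
    (before, await(), await())
  }
}
```

Had fromTheFuture taken a plain Future[A] instead, the work would already be running before the effect was ever executed, and running the effect twice would observe the same memoized result.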

5. Conclusion

Future is hard to use and inadequate as an effect system for building pure, correct, concise, and maintainable asynchronous programs. For now, cats-effect IO is the choice for its maturity and rich ecosystem. With the rise of Scalaz 8, however, Zio will also be a great choice once it reaches a production-ready stage, with community support and a solid number of well-maintained libraries.
