Skip to content

Instantly share code, notes, and snippets.

@EugeneN
Forked from mbbx6spp/BLOGPOST0.adoc
Created February 14, 2016 09:09
Show Gist options
  • Save EugeneN/6b147a2a9b4d2113f321 to your computer and use it in GitHub Desktop.
Save EugeneN/6b147a2a9b4d2113f321 to your computer and use it in GitHub Desktop.
Demonstration of referentially transparent state without the extra client boilerplate from a live coding study session on the FP in Scala book with coworkers. (Chapter 6 in FP in Scala)

Functional State: From Object Oriented Domain Modeling to Functional Programming Algebras

When coming from an object oriented background to functional programming you might be wondering if and/or how functional programming allows for modeling abstractions in our domain and how we go from an abstract model to working code.

Here is my attempt to provide a path so you can update your thought processes to aid in the transition from OO to FP.

OO Domain Model

We will start by revisiting what a domain is in the first place, which I will just define simply as the scope of the problem space we are interested in. For example, an online flight booking system might include the following considerations:

  • flight routes

  • number of seats available on aircraft used on each leg of the route

  • flight times/schedules

  • costs associated with each leg

  • frequent flyer qualification and redemption

It would probably not be concerned with:

  • passenger luggage policy

  • real-time gate information

  • flight crew schedules

  • flight crew legal work restrictions

When we are building an online flight booking service we would not need to model this last list of concerns as none of them inform how to effectively build such a service. An application that monitors real-time gate information such that it can provide more seamless connections for frequent flyers or passengers requiring additional assistance to make tight connections may need to query bookings from the ticketing system which our online flight reservation system will eventually feed into as well as real-time gate information but that is a separate application with a distinct domain and scope.

Now to define domain model. A definition I like used to like is from Fowler’s Patterns of Enterprise Application Architecture book for it’s simplicity:

An object model of the domain that incorporates both behavior and data.

The purpose of a domain model is to allow you to think through the problem space (or domain) and define semantics for a consistent view of the problem space. It allows you to communicate this shared understanding across the team and it can also help guide you toward better designs and find inconsistencies in your terminology usage or understanding on the team.

One component that seems missing from Fowler’s definition above is that in addition to behavior and data (structure) we should also be defining constraints (preconditions, postconditions, and/or invariants) on both behaviors and our data (structures). This gives us a richer understanding of what should and should not be possible in our domain which gives us a way to validate the values in our application more neatly.

In summary, a domain model let’s us define:

  • behavior

  • data structure (or types)

  • constraints or rules

Now we will see how we can map these ideas to the modeling environment and language functional programmers tend to use: mathematics.

FP Algebras

Wait. Don’t run in fear, at least hear me out first.

We saw in the previous subsection that in richer OO domain models we end up with three basic kinds of definitions: behavior, data (structure), and constraints/rules.

In the FP world we translate these components as so:

  • beahvior ⇒ functions

  • data ⇒ types

  • constraints/rules ⇒ laws (or informally "properties")

In typed FP we typically approximate the notion of types to the concept of sets in the context of abstract algebra. For example, the set of natural numbers is the type and the elements of that set are the possible values of that type.

Algebra allows us to be more precise in our definitions and the act of abstraction means we focus on just the important parts to the problem space we are looking to solve directly.

Functional State: Evolution from implicit state to explicit state with more usable API

We ended up with what is in kvscala.scala. This post will walk through each refactoring state we took starting from the following interface that provides an implicit state interfact:

      trait KVCache[K, V] {
        def store(k: K, v: V): Unit
        def lookup(k: K): Option[V]
        def invalidate(k: K): Unit
      }

The expectation of this interface is that we modify the internal state of the implementation instance of KVCache[K, V] (we use InMemoryCache in our examples (see kvcache.scala source file). When the cache value is accessed by multiple threads we require internal coordination via error prone locks to provide correctness. This incurs the cost of duplicated lock logic in both mutating operation implementations in every implementation of KVCache[K, V]. We also make testing our cache semantics hard to assert as tests rely on knowing the order of the operations to determine resulting state. When the coordination is done at the KVCache operation implementation level we cannot even specify ordered sequences of computations in chunks against the key value cache.

We are lazy and we want everything for free. And we think we can do better.

Implicit mutable state is not ideal because we end up sharing it in our usage across threads. Either we end up with contention due to low level coordination/locks resulting in unpredictable run-time characteristics or incorrect views of the underlying data.

The client (consumer of this API) might do something like this:

    // Types for the example. And yes, I might be obsessed with season one right now.
    case class Character(firstName: String, lastName: String, nickName: Option[String])
    sealed trait CellBlock
    case object TheSuburbs extends CellBlock
    case object TheGhetto extends CellBlock
    case object SpanishHarlem extends CellBlock
    case object GoldenGirls extends CellBlock
    case object Others extends CellBlock
    case class Cell(left: Character, right: Character)

    // Setup values to make the usage cleaner
    val piper = Character("Piper", "Chapman", Some("Pipes"))
    val alex = Character("Alex", "Vause", None)
    val missclaudette = Character("Claudette", "Pelage", Some("Miss Claudette"))
    val leanne = Character("Leanne", "Taylor", None)

    // Actualy usage of the API
    val c: KVCache[CellBlock, List[Cell]] = InMemoryCache()
    c.store(TheGhetto, List(Cell(missclaudette, piper)))
    c.store(TheSuburbs, List(Cell(leanne, alex))

Note the usability of the API is not a problem in the above implicit state model. We aren’t juggling part of the result from step(N) and threading it into the input of step(N+1). When we were writing our minimal non-OITNB example in kvcache.scala we made numerous mistakes in the second explicit state API design we came up with next. Demonstrating its error-prone-ness.

The study group translated the above implicit interface and exposed explicit state in the result types and we ended this phase of our study session with code that could support the following:

      trait KVCache[K, V] {
        def store(k: K, v: V): KVCache[K, V]
        // Read-only so we can reuse +KVCache[K, V]+ we used to lookup for next query.
        // Therefore, we don't need to return the new cache as it stays the same
        def lookup(k: K): Option[V]
        def invalidate(k: K): KVCache[K, V]
      }
      // This is referentially transparent because we could substitute this code:
      object Main extends App {
        val c: KVCache[String, String] = InMemoryCache()
        c.store("index.body.html", "<p>Welcome to Litchfield Penitentiary.</p>")
        c.store("about.body.html", """
<p>This penitentiary houses a minimum- and a maximum-security prison located in upstate New York.</p>
<p>The minimum-security prison is comprised of five cell blocks, a Security Housing Unit (SHU), and a psychiatric ward.</p>
<p>The maximum-security prison (MAX) is for inmates with more serious or violent offenses.</p>
        """)

        def main1 = {
          val c2 = c.invalidate("about.body.html")
          val v1 = c2.lookup("contact.body.html")

          val c3 = c2.invalidate("index.body.html")
          val v2 = c3.lookup("contact.body.html")

          (v1, v2)
       }

        def main2 = {
          def g(a: String, b: String) = {
            val c2 = c.invalidate(a)
            c2.lookup(b)
          }
          val v1 = g("about.body.html", "contact.body.html")
          val v2 = g("index.body.html", "contact.body.html")

          (v1, v2)
        }

        println(main1 == main2) // should print true
      }

This interface gave us referential transparency (see Main above), however, the consuming code then became error prone and could not easily be used concurrently without a lot of coordination mechanics above the KVCache[K, V] interface in the consumer/client code of the interface, which would also be highly error prone (perhaps worse?) and/or would cause throughput bottlenecks on store and invalidate operations.

As reference this is what code that consumes this API would look like (and as an anecdote, we kept making logical errors to get it to run with the expected results):

    // I bet you can't tell what TV series I just started watching.
    val oitnb1 = InMemory[String, String]()
    val oitnb2 = oitnb1.store("larry", "piper")
    val oitnb3 = oitnb2.store("alex", "piper")
    val oitnb4 = oitnb3.invalidate("larry")
    val oitnb5 = oitnb4.invalidate("alex")
    // what next??
    // I have just finished season one, so no spoilers, this is basically where it is at.

Can we do better?

What do you think?

So what did we do in the third step was try to keep the first set of arguments the same as the previous APIs and returned a function that accepts the state we need to operate on and returns a tuple with the desired output result and the new state. We then lifted that function into a simple type so we could run the function given the state to act upon.

Let us look at the type signature changes so far:

    // original definition
    //def store(k: K, v: V): Unit
    // explicit state iteration which assumes the instance
    // this works on contains internal immutable state we can
    // access to append the new key-value pair to return
    // the new cache value back to consumer
    //def store(k: K, v: V): KVCache[K, V]
    // next step: take a leap of faith and eliminate the internal state of
    // the +KVCache[K, V]+ instance
    //def store(k: K, v: V): Map[K, V] => (Unit, Map[K, V])
    // by creating a data constructor around this return type signature we have
    case class State[S, A](run: S => (S, A))
    def store(k: K, v: V): State[Map[K, V], Unit]

So now we have something we can keep injecting the underlying state context into and receiving both the new state and the result of the specific operation. By itself this is still not giving us the consuming API we want to use, so what next?

In Scala we can use the for-comprehension to describe a sequence of effectful (in our case we are potentially modifying the internal representation of the cache when we execute our operations defined for our domain (store, lookup, invalidate). Something like the following might be a way to express a sequence of KVCache operations without the error prone boilerplate in the second step usage example:

    case class Actor(name: String, imdbId: String)
    case class Character(firstName: String, lastName: String, nickName: Option[String])

    val lauraPrepon = Actor("Laura Prepon", "nm0696059")
    val taylorSchilling = Actor("Taylor Schilling", "nm2279940")
    val laverneCox = Actor("Laverne Cox", "nm1209545")
    val uzoAduba = Actor("Uzo Aduba", "nm2499064")
    val danielleBrooks = Actor("Danielle Brooks", "nm5335029")
    val madelineBrewer = Actor("Madeline Brewer", "nm5483400")

    def setupCast(c: KVCache[Character, Actor]) = for {
      _ <- c.store(Character("Alex", "Vause", None), lauraPrepon)
      _ <- c.store(Character("Piper", "Chapman", Some("Pipes"), taylorSchilling)
      _ <- c.store(Character("Sophia", "Burset", None), laverneCox)
      _ <- c.store(Character("Suzanne", "Warren", Some("Crazy Eyes"), uzoAduba)
      _ <- c.store(Character("Tasha", "Jefferson", Some("Tastee"), danielleBrooks)
      _ <- c.store(Character("Trisha", "Miller", None), madelineBrewer)
    }

    def killCharacters(c: KVCache[Actor, Character]) = for {
      _ <- c.invalidate(madelineBrewer)
    }

    def displayCredits(c: KVCache[Actor, Character]) = sys.error("todo")

    def showCreditsBeforeAndAfter(c: KVCache[Actor, Character]) = for {
      _ <- setupCast(c)
      _ <- displayCredits(c)
      _ <- killCharacters(c)
      _ <- displayCredents(c)
    } yield ()

    val c: KVCache[Actor, Character] = InMemoryCache()
    showCreditsBeforeAndAfter(c)

To be able to use values of type KVCache[K, V] in a for-comprehension we need to define a map and a flatMap function with the following types:

  case class State[S, A](run: S => (A, S)) {

    // fmap :: m a -> (a -> b) -> m b
    // part of the functor interface
    def map[B](f: A => B): State[S, B] = State(
      s0 => {
        val (a, s1) = run(s0)
        (f(a), s1) // (s1, f(a))
      }
    )

    // bind :: m a -> (a -> m b) -> m b
    // part of the monad interface
    // State[S, A] => (A => State[S, B]) => State[S, B]
    def flatMap[B](f: A => State[S, B]): State[S, B] = State(
      (s0: S /*State */)  => {
        val (a, s1) = run(s0)
        f(a).run(s1)
      }
    )
  }

It turns out the implementation of flatMap basically contains the boilerplate code we needed to remove from the client code of our previous round’s explicit state design.

Below we can see how we might use operations that return a State[S, A] in a sequential computation:

  val season1Map: Map[String, String] = Map(
    "privileged" -> "Piper"
  , "brooding"   -> "Alex"
  , "scary"      -> "Red"
  , "funny"      -> "Tastee"
  , "stylish"    -> "Sophia"
  , "crazy"      -> "Suzanne"
  , "pregnant"   -> "Dayanara"
  )
  val season2Map: Map[String, String] = Map(
    "scary"      -> "Vee"
  , "privileged" -> "Soso"
  , "funny"      -> "Nicky"
  ) // I am only half way through

  val season3Map: Map[String, String] = Map.empty[String, String] // not watched yet

  val happyLookup: State[Map[String, String], String] = for {
    v <- c.lookup("happy")
  } yield v.getOrElse("default")

  println(happyLookup.run(season1Map)) // (Tastee, ...)
  println(happyLookup.run(season2Map)) // (default, ...)
  println(happyLookup.run(season3Map)) // (default, Map())

  def set(c: KVCache[String, String])(k: String, v: String) =
    for {
      _ <- c.store(k, v)
    } yieldc

  def expire(c: KVCache[String, String])(k: String, v: String) =
    for {
      v <- c.lookup(k)
      _ <- c.invalidate(k)
    } yield (v, c)

  val setter = set(c)
  println(setter("brooding", "Piper").run(season2Map)) // (..., ...)

  val expirer = expire(c)
  println(expirer("privileged").run(season1Map)) // ((Piper, ...), ...)

We are basically writing small programs that we can run later given some initial state. We can then reuse these small programs using different initial state (note here our initial state changes from season to season).

The for-comprehension syntax is basically syntactic sugar and it desugars like the following:

  def expire(c: KVCache[String, String])(k: String, v: String) =
    for {
      v <- c.lookup(k)
      _ <- c.invalidate(k)
    } yield (v, c)

  def expireDesugared(c: KVCache[String, String])(k: String, v: String) =
    c.lookup(k) flatMap {
      (ov: Option[String]) => c.invalidate(k) flatMap {
        (_: Unit) => State(s => ((), s)).map(_ => (ov, c))
      }
    }

Another observation is that by using map2, sequence, both, mapM_, etc we can combine and with map we can transform the underlying encapsulated result value without lots of boilerplate code to unwind the value from the container and wrap it up again if/when we need to do something with the wrapped result.

We didn’t have time this session to implement all the above operations on State[S, A], but hopefully we can see how the following type signatures would be useful:

  // inside case class State[S, A](run: S => (A, S)) {

  // full type: State[S, A] => List[State[S, A]] => State[S, List[A]]
  // Use when we want to transform a List of states into a State where
  // result type is a List.
  // For example, I could imagine needing to go from a `List[State[S, Character]]`
  // to a `State[S, List[Character]]` in some situations where we can use the
  // new value inside a for-comprehension.
  def sequence[A](fs: List[State[S, A]]): State[S, List[A]] = sys.error("TODO")

  // full type: State[S, A] => State[S, B] => ((A, B) => C) => State[S, C]
  // I could see a use for this when parsing disjoint CSV files where we
  // pull data from one and data from the matching line in another and
  // construct a value representing both peices of data. e.g. actor names
  // and bios.
  def map2[B,C](sb: State[S, B])(f: (A, B) => C): State[S, C] = sys.error("TODO")

  // full type: State[S, A] => State[S, B] => State[S, (A, B)]
  def both[A,B](rb: State[S, B]): State[S, (A, B)] = sys.error("TODO")

  //}

  // inside the companion State object since instance data isn't needed.
  // full type: (A => State[S, B]) => List[A] => State[S, Unit]
  // Use when we want to do some *action* on the underlying values.
  def mapM_(f: A => State[S, B])(xs: List[A]): State[S, Unit] = sys.error("TODO")

We didn’t get to tackle the problem of concurrent access to KVCache[S, A] directly yet but by walking through this thought exercise with the Scala compiler we get a better sense that by changing the API to have properties that allow us to reason about our code more equationally we can build on top of this more solid foundation to more adquately express and solve the problem at hand.

Can you imagine some combinators we could define to even combine these already quite powerful operations together?

As an aside I noticed the parallel to mealy machine model where I speced out the interface for it in mealy.scala. After we derive implementations for sequence and a few other operators on State[S, A] we might look at fitting State into the Mealy interface which appears a little more generic but supports the basic semantics.

/*
Code we ended the session with which compiles and runs the less fun examples.
*/
object Cache {
trait KVCache[K, V] {
/** operators on the +KVCache+ */
def store(k: K, v: V): State[Map[K, V], Unit]
def lookup(k: K): State[Map[K, V], Option[V]]
def invalidate(k: K): State[Map[K, V], Unit]
}
case class InMemoryCache[K, V]() extends KVCache[K, V] {
def store(k: K, v: V): State[Map[K, V], Unit] = State(s => ((), s ++ Map(k -> v)))
def lookup(k: K): State[Map[K, V], Option[V]] = State(s => (s.get(k), s))
def invalidate(k: K): State[Map[K, V], Unit] = State(s => ((), s - k))
}
case class State[S, A](run: S => (A, S)) {
// fmap :: m a -> (a -> b) -> m b
// part of the functor interface
def map[B](f: A => B): State[S, B] = State(
s0 => {
val (a, s1) = run(s0)
(f(a), s1) // (s1, f(a))
}
)
// bind :: m a -> (a -> m b) -> m b
// part of the monad interface
// State[S, A] => (A => State[S, B]) => State[S, B]
def flatMap[B](f: A => State[S, B]): State[S, B] = State(
(s0: S /*State */) => {
val (a, s1) = run(s0)
f(a).run(s1)
}
)
/* state combinators */
// state combinators
// m a -> m b -> ((a, b) -> c) -> m c
def map2[B,C](sb: State[S, B])(f: (A, B) => C): State[S, C] =
State(
s0 => {
val (a, s1) = run(s0)
val (b, s2) = sb.run(s1)
(f(a, b), s2)
}
)
}
object State {
def unit[S, A](a: A): State[S, A] = State(s => (a, s))
}
}
object Solution extends App {
import Cache._
val c: KVCache[String, String] = InMemoryCache()
val program1: State[Map[String, String], String] = for {
_ <- c.store("awesome", "Peter")
_ <- c.store("curious", "Mahesh")
v <- c.lookup("awesome")
} yield v.getOrElse("default")
val program2: State[Map[String, String], String] = for {
_ <- c.store("awesome", "Peter")
v <- c.lookup("awkward")
} yield v.getOrElse("default")
val result1: (String, Map[String, String]) = program1.run(Map.empty[String, String])
println(result1) // (Peter,Map(awesome -> Peter))
val result2: (String, Map[String, String]) = program2.run(Map.empty[String, String])
println(result2) // (default,Map(awesome -> Peter))
val result3: (String, Map[String, String]) = program2.run(Map("awkward" -> "Susan"))
println(result3) // (Susan,Map(awkward -> Susan, awesome -> Peter))
// this is how a for-comprehension in result1 is desugared into flatMaps and maps:
val result4 = c.store("awesome", "Peter").flatMap { /* (a -> m b) => m b */
(_: Unit) =>
c.lookup("awesome").flatMap { /* (a -> m b) => m b */
(a1: Option[String]) =>
State(s => (a1, s)).map(_.getOrElse("default"))
}
}.run(Map.empty[String, String])
println(result4) // (Peter,Map(awesome -> Peter))
}
// This is a Mealy machine based off of a mathematical definition of it from here:
// LINK
object mealy {
/**
A mealy machine is a model of a finite state machine with the following
elements:
* a finite set of states of type S
* an initial state value of type S
* a finite set of input values (type I)
* a finite set of output values (type O)
* a transition function, transition, from the product type (S, I) to S.
* an output function, output, from product type (S, I) to O.
We could combine the last two elements into the function with a function of
type signature: (S, I) => (S, O).
Mealy machines have the following tradeoffs as compared to Moore Machines
which is another common FSM model:
* pro: mealy machines typically have fewer states
* con: connecting mealy machines may cause unintended consequences
There exists an equivalence relation between Mealy and Moore machines which
means given a Mealy machine we can construct a Moore machine from it.
*/
trait Mealy[S, I, O] {
def init: S
def transition: S => (I => S)
def output: S => (I => O)
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment