scottfrazer/goscala.md

Go vs. Scala (Akka) Concurrency

A comparison after two weeks of using Go.
Actors vs. Functions

Akka's central principle is that you have an ActorSystem which runs Actors.  An Actor is defined as a class and it has a method to receive messages.
class MyActor extends Actor {
  override def receive = {
    case MyFirstMessage(a, b, c) => ???
    case MySecondMessage(d) => ???
  }
}
Go doesn't have classes.  Go's primary unit of concurrency is the goroutine.  A goroutine is simply an invocation of a function using the go keyword.  The roughly equivalent code in Go would be:
func MyActor(first chan string, second chan string) {
  for {
    select {
    case x := <-first:
      fmt.Println("first:", x) // do stuff (x must be used, or the compiler rejects it)
    case x := <-second:
      fmt.Println("second:", x) // do other stuff
    }
  }
}
This function can now be called synchronously or asynchronously:
go MyActor(chan1, chan2) // spawn goroutine and continue
MyActor(chan1, chan2) // normal function call
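To make the snippet above something you can actually run, here is a minimal sketch. The names runActor and demo, the done channel, and the out channel are assumptions added for illustration; the original MyActor loops forever and has no way to stop.

```go
package main

import "fmt"

// runActor is a hypothetical variant of MyActor: it handles messages from two
// channels until done is closed, then reports what it handled on out.
func runActor(first, second <-chan string, done <-chan struct{}, out chan<- []string) {
	var handled []string
	for {
		select {
		case x := <-first:
			handled = append(handled, "first:"+x)
		case x := <-second:
			handled = append(handled, "second:"+x)
		case <-done:
			out <- handled
			return
		}
	}
}

// demo sends one message on each channel and returns what the actor handled.
func demo() []string {
	first, second := make(chan string), make(chan string)
	done := make(chan struct{})
	out := make(chan []string)

	go runActor(first, second, done, out) // spawn goroutine and continue

	first <- "hello" // unbuffered: returns once the goroutine has received it
	second <- "world"
	close(done)
	return <-out
}

func main() {
	fmt.Println(demo()) // prints [first:hello second:world]
}
```

Because the channels are unbuffered, each send only returns after the goroutine has picked the message up, so the order of the handled messages is fixed here.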
Receive method vs. select statement

Akka has a receive method that must be overridden and implemented for all actors.  This method defines what kind of messages that actor can receive and what to do when it receives each kind of message.
This can be confusing sometimes because Actors are also classes, so they are subject to inheritance and to being mixed in with traits.
This can lead to situations where the receive method for an actor is actually implemented on the grandparent in the class-inheritance hierarchy.
Also, actors can only have one receive method and it's implied that the actor is constantly receiving and processing messages from its mailbox until it crashes or shuts down.  There's no built-in way that I could find to have one actor receive two kinds of messages depending on a condition.  This would have to be done with multiple actors.
Go, on the other hand, has a select statement, which is an incredibly versatile way to select from multiple channels.  A function can use a select statement exactly as it would any other statement.  A function can have lots of select statements or have a select statement inside of a conditional.  Or, like in the example below, use a select statement within loops to continuously read ten elements at a time from a channel:
func MyFunc2(work chan string) {
        tenElements := make([]string, 10)

        for {
                for index := 0; index < 10; index++ {
                        select {
                        case w := <-work:
                                tenElements[index] = w
                        }
                }
                fmt.Println(tenElements)
        }
}
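The "select inside of a conditional" case can be sketched like this. The function name next, the urgent/normal channels, and the urgentOnly flag are all made up for the example; the point is only that which channels get listened to can depend on a condition.

```go
package main

import "fmt"

// next reads one message. Which channels it listens on depends on urgentOnly.
func next(urgent, normal <-chan string, urgentOnly bool) string {
	if urgentOnly {
		// A select inside a conditional: only urgent is considered, and the
		// default case keeps this branch from blocking.
		select {
		case m := <-urgent:
			return "urgent:" + m
		default:
			return "" // nothing urgent is waiting
		}
	}
	// Otherwise take whichever channel has a message ready.
	select {
	case m := <-urgent:
		return "urgent:" + m
	case m := <-normal:
		return "normal:" + m
	}
}

func main() {
	urgent := make(chan string, 1)
	normal := make(chan string, 1)

	normal <- "b"
	fmt.Printf("%q\n", next(urgent, normal, true)) // prints ""

	urgent <- "a"
	fmt.Println(next(urgent, normal, true))  // prints urgent:a
	fmt.Println(next(urgent, normal, false)) // prints normal:b
}
```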
"blocking"

In Akka/Scala land there is without a doubt a deep-seated fear of "blocking".  A thread that is hanging out not using the CPU while work piles up behind it is wasted computing power!  And so we all started to fight the battle to utilize our CPUs 100%.
In Go, I was kinda shocked to see so much of the documentation refer to operations that block.  The select statement can block, reading from a channel can block, writing to a channel can block...  What's the deal?  Surely this would never fly in Scala.
It was then that I realized that I didn't really understand blocking.  I wasn't really sure why Thread.sleep() in Akka-land is a no-no.  I always thought this was a bit strange.
Then I met Go and I realized that it was a bit strange.  The following Go code works exactly how you'd expect it to:
package main

import (
    "fmt"
    "sync"
    "time"
)

// f sleeps for five seconds, then tells the WaitGroup it is finished.
func f(wg *sync.WaitGroup) {
  defer wg.Done()
  time.Sleep(time.Second * 5)
}

func main() {
  var wg sync.WaitGroup
  fmt.Println("starting goroutines...")
  for i := 0; i < 10000; i++ {
    wg.Add(1)
    go f(&wg)
  }
  fmt.Println("waiting for goroutines to exit...")
  wg.Wait() // blocks until every goroutine has called Done
  fmt.Println("done")
}
10,000 goroutines are spawned, all of them sleep for 5 seconds, then they exit.  When I time the execution of this code, this is what I get:
real    0m5.030s
user    0m0.079s
sys     0m0.033s

In Go, "blocking" isn't really the same as it is in Akka.  Most operations "block" only the goroutine: the Go runtime parks it and uses the same OS thread to run other goroutines.
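One rough way to see this for yourself is to limit Go to a single running thread and check that a pile of sleeping goroutines still finishes in about the time of one sleep. This is only a sketch: sleepAll and finishesQuickly are hypothetical helper names, and the five-second bound is an arbitrary generous threshold, not a measured result.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// sleepAll starts n goroutines that each sleep for d and returns the total
// wall-clock time it took for all of them to finish.
func sleepAll(n int, d time.Duration) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(d) // parks this goroutine only
		}()
	}
	wg.Wait()
	return time.Since(start)
}

// finishesQuickly reports whether 1000 goroutines sleeping 100ms each finish
// in under five seconds. Run back to back they would need 100 seconds.
func finishesQuickly() bool {
	runtime.GOMAXPROCS(1) // at most one OS thread executes Go code at a time
	return sleepAll(1000, 100*time.Millisecond) < 5*time.Second
}

func main() {
	fmt.Println("sleeps overlapped:", finishesQuickly())
}
```

If time.Sleep held on to the thread the way Thread.sleep() does, the single thread allowed by GOMAXPROCS(1) would force the sleeps to run one after another.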
channels vs. messages

Channels are a genius idea.  In Go, a channel is a language primitive: a typed conduit for passing values between goroutines.
In Akka, you think in terms of messages, which are usually structured as commands sent to an actor.  An actor has a mailbox which starts filling up with messages, and the actor processes them one at a time, single threaded.  The tendency with actors is to think of messages like commands... like class PersistNewData(data: MyData, unixTime: Int).  Actors don't have to be thought of this way, but this was my experience with Akka.
Channels operate more on a publish-subscribe model, with one caveat: each value written to a channel is received by exactly one reader.  You write to a channel and another receiver has to read from that channel.  Channels are incredibly versatile.  They can be used as semaphores, they can be used for load balancing and back pressure, they can be attached to structures, they can be returned from functions.  They're really amazing.
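As one illustration of the semaphore use, here is a minimal sketch in which a buffered channel caps how many goroutines do work at once. The function runLimited and its bookkeeping (running, peak) are assumptions made up for this example; the mutex is there only to measure the peak safely.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited runs n jobs but lets at most limit of them run at once, using a
// buffered channel as a counting semaphore. It returns the peak concurrency.
func runLimited(n, limit int) int {
	sem := make(chan struct{}, limit)
	var (
		mu      sync.Mutex
		running int
		peak    int
		wg      sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire: blocks while limit jobs hold a slot
			defer func() { <-sem }() // release the slot on the way out

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			// ... real work would go here ...

			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak within limit:", runLimited(100, 3) <= 3) // prints peak within limit: true
}
```

The send on sem is the whole trick: once the buffer holds limit values, further senders block until someone receives.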
Here's an example of how one could use a channel to implement a pool of workers (goroutines rather than threads):
package main

import "time"
import "fmt"

func worker(work chan int) {
        for x := range work {
                fmt.Printf("processed %d\n", x)
        }
}

func main() {
        work := make(chan int)
        go worker(work)
        go worker(work)

        go func() {
                work <- 5
                work <- 6
                work <- 7
        }()

        time.Sleep(time.Second * 5)
}
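The time.Sleep at the end is just a quick way to keep main alive. A variation that waits for the workers explicitly might look like the sketch below; processAll, the results channel, and the squaring "work" are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// processAll fans jobs out to nWorkers goroutines over one shared channel and
// collects the squared results.
func processAll(jobs []int, nWorkers int) []int {
	work := make(chan int)
	results := make(chan int, len(jobs)) // big enough that workers never block
	var wg sync.WaitGroup

	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for x := range work { // the loop ends when work is closed
				results <- x * x
			}
		}()
	}

	for _, j := range jobs {
		work <- j
	}
	close(work) // tell the workers there is nothing more to do
	wg.Wait()   // wait for them to drain and exit
	close(results)

	var out []int
	for r := range results {
		out = append(out, r)
	}
	sort.Ints(out) // workers finish in no particular order
	return out
}

func main() {
	fmt.Println(processAll([]int{5, 6, 7}, 2)) // prints [25 36 49]
}
```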
State data vs. the stack

Akka Actors inevitably become full state machines, because it is useful for an actor to store data.  Akka's FSM support provides a way to keep all of that data immutable, but I find managing state data that way pretty confusing.
Go simplifies this greatly:  There's just a regular no-frills function stack.  You want state data?  Create some variables at the top of your function.  Go is also okay with mutable data structures, which makes this much easier.
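A small sketch of what that looks like in practice: the goroutine's state is just a local variable. The names runningTotal and total are hypothetical, chosen only for this example.

```go
package main

import "fmt"

// runningTotal keeps its state in an ordinary local variable. No other
// goroutine can touch sum, so no locking is needed.
func runningTotal(in <-chan int, out chan<- int) {
	sum := 0 // the "state data"
	for n := range in {
		sum += n
	}
	out <- sum // in was closed: report the final state and return
}

// total feeds values to a runningTotal goroutine and returns its answer.
func total(values ...int) int {
	in, out := make(chan int), make(chan int)
	go runningTotal(in, out)
	for _, v := range values {
		in <- v
	}
	close(in)
	return <-out
}

func main() {
	fmt.Println(total(1, 2, 3)) // prints 6
}
```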
"let it crash" vs. return statement

Want an actor to exit?  Well, you could "let it crash"... you could also do context.stop(self) during the processing of a message.  But there's also a whole host of supervision strategies that kick in when something crashes, which you need to learn and configure.
Go's philosophy is that a goroutine is done when it returns.  Plain and simple.  Hit an error condition?  Log it, and return.  This is facilitated by the defer statement, which will always run when a function returns:
func f() {
  fmt.Println("START")
  defer fmt.Println("DONE") // runs when f returns, on either path

  err := g() // g is assumed to return an error
  if err != nil {
    fmt.Println("ERROR")
    return
  }

  fmt.Println("SUCCESS")
}
This function will always print START first and DONE last and either ERROR or SUCCESS in the middle.
This defer statement can be used to do clean-up, unlock locks, end transactions, notify wait groups, etc.
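For the "unlock locks" case, a minimal sketch might look like this. The account type and withdraw method are made up for illustration; the point is that the deferred Unlock runs on the early error return as well as on the normal one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// account is a hypothetical type guarded by a mutex.
type account struct {
	mu      sync.Mutex
	balance int
}

// withdraw takes n out of the account, or returns an error if it can't.
func (a *account) withdraw(n int) error {
	a.mu.Lock()
	defer a.mu.Unlock() // runs on every return path below

	if n > a.balance {
		return errors.New("insufficient funds")
	}
	a.balance -= n
	return nil
}

func main() {
	a := &account{balance: 10}
	fmt.Println(a.withdraw(20)) // prints insufficient funds
	fmt.Println(a.withdraw(5))  // prints <nil>
	fmt.Println(a.balance)      // prints 5
}
```

If the error path forgot to unlock, the second withdraw would deadlock; with defer there is nothing to forget.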
Await.result()

Don't even get me started on Await.result().  It's so tantalizing... you spin off a Future and you want to continue when that Future returns, so you're tempted to write:
val result = Await.result(future, timeout.duration).asInstanceOf[String]
But then things slow down and you wonder why, and then you figure out that Await.result actually BLOCKS THE THREAD AND IS DISCOURAGED!!

This will cause the current thread to block and wait for the Actor to 'complete' the Future with it's reply. Blocking is discouraged though as it will cause performance problems.

This is perhaps my biggest gripe with Akka.
Go does not have this problem.  Blocking on a channel read does not waste an OS thread.  In Go, you can do this and wait for a response without blocking the thread:
future := DoSomethingReturnChannel()
result := <-future
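DoSomethingReturnChannel isn't defined above, so here is one way such a function could be written. Everything in this sketch is an assumption: doSomething stands in for it, the upper-casing is placeholder work, and await adds a timeout with select and time.After to mirror the timeout argument of Await.result.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// doSomething is a hypothetical stand-in for DoSomethingReturnChannel: it
// starts the work in a goroutine and immediately returns a channel that will
// carry the result.
func doSomething(input string) <-chan string {
	result := make(chan string, 1) // buffer of 1 so the goroutine never blocks on send
	go func() {
		result <- strings.ToUpper(input) // placeholder for real work
	}()
	return result
}

// await waits for the result or gives up after timeout.
func await(future <-chan string, timeout time.Duration) (string, bool) {
	select {
	case r := <-future:
		return r, true
	case <-time.After(timeout):
		return "", false
	}
}

// demo runs one request with a generous one-second timeout.
func demo() (string, bool) {
	return await(doSomething("hello"), time.Second)
}

func main() {
	future := doSomething("hello")
	fmt.Println(<-future) // prints HELLO
	fmt.Println(demo())
}
```

Both the plain receive and the select park only the calling goroutine while they wait.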
ActorContext and ActorSystem vs. goroutines

This is one area where it really feels like concurrency was stapled on after the fact in Scala.  Akka makes you set up an ActorSystem which has an ExecutionContext.  Then you have to ask the ActorSystem or ActorContext to create an actor with certain parameters.  The Actors themselves are usually passed around as an ActorRef, which holds the REAL actor behind it.  The mailbox lives on the ActorRef?  I don't even remember...  Sometimes you end up with multiple actor systems without meaning to.  Or sometimes the ActorRef is alive and well but the Actor behind it isn't.  The ExecutionContext has its own set of parameters and interesting tidbits that you need to learn about when writing with Akka.  Actors have string paths so you can look up singletons and apparently you should NAME ALL OF YOUR ACTOR INSTANCES?!
Akka also has a tendency to want everything in your system to be an Actor, which I'm not really that fond of.
Go feels so much simpler.  None of these concepts even exist in Go!  You simply spawn off goroutines when you need them and everything just seems to work.  I imagine any modern computer can handle thousands, probably hundreds of thousands, of goroutines.  Any server computer will likely be able to handle millions.  I've personally only scaled up to ~100,000 goroutines on a single machine but I plan on pushing this.
Conclusion

Go is a breath of fresh air.  Go is what I always wanted from concurrent programming.  I feel like I've found a secret weapon, something that will give me an unfair advantage.
Sure, it's true that I've only written ~700 lines of Go total in my life and probably 10k lines of Scala, so I've probably seen the worst of Scala and haven't yet hit the worst of Go.
I'm optimistic, I'm full of hope, and I feel empowered by Go!