cleishm/effective_scala.md

## effective_scala.md

      
    Effective Scala by @cleishm


Introduction
Formatting

Spacing
Naming
Imports
Pattern Matching
Method Invocation


Types

General parameters, specific results
Prefer methods over functions


Implicits

Adapting or Extending objects ("enrich my library" pattern)
Lifting to a more general type


Functional over Procedural

Introduction

Over the last year I've had the dubious pleasure of learning more about the Scala programming language, and spending some time writing real code in it.
In doing so, I've formed a few thoughts and opinions about how to work effectively in the Scala language. Many of them are already very well described in the excellent "Twitter Effective Scala", after which this gist is cheekily named, and in the Scala Style Guide. I haven't gone anywhere near as far into Scala as the authors of these, so I strongly recommend reviewing theirs for a more in-depth discussion. I'm just going to mention a few things that stood out to me in my exploration.
Formatting

Follow the Scala Style Guide. It's pretty pragmatic.
Below I'll just describe some things that I might clarify further or vary slightly from it.
Spacing


Use : Type (one space) for type specifications. ie. def foo(name: String), val x: Foo.
Don't use braces around method/function bodies unless they must be procedural:

This is ok:
def foo(bar: String): String =
  bar.reverse

This is not ok:
def foo(bar: String): String = {
  bar.reverse
}


Vary vertical alignment to ensure diffs are visually minimized. This means occasionally using more than 2 spaces for an indent (this is actually suggested in the Scala Style Guide also, but worth clarification).

This is ok:
  one
| two
| three

This is not ok:
one
| two
| three


Indent expressions continued over multiple lines unless that continuation reflects a procedural flow.

This is ok:
doOneThing andThen
doAnotherThing andThen
doLastThing

This is not ok:
doOneThing andThen
  doAnotherThing andThen
  doLastThing

Naming

Scala is an "Object-Oriented Meets Functional" language. It is strictly object-oriented, but also has a strong focus on functions and functional programming. It is helpful, then, to have clear naming rules that differentiate which concept is most prevalent. A great guide for these is the Scala language naming conventions themselves. If you don't feel like reading that, here's a quick summary:

For classes, traits and types, use camelCase style with the very first letter of the name capitalized (ie. trait Parser, case class Statement(), type Rewriter = ...).
For objects, use the same convention as classes except when attempting to mimic a package or a function. ie.

object replaceUnderscores {
  def apply(s: String): String = s.replace('_', ' ')
}

Because Scala is strictly object-oriented, all functions are also objects. This style helps the developer differentiate how the object should be used: as an object (by invoking methods upon it) or as a function (by applying it).
(Caution though: perhaps consider if the function should be a method instead).

Start methods and functions with a lower case character, ie:

object Sample {
  def foo(): Boolean = true
}
case class AnotherClass() {
  def bar(n: String) = s"hello ${n}"
}

As an exception to the above, I've observed that a method that attempts to mimic a concrete concept in a DSL should also start with an upper case character. ie:
class Parser {
  def Space = rule { ... }
  def Match = rule { zeroOrMore(Space) ~ "MATCH" ~ Space ~ Pattern }
}

In this example, the Space method is mimicking a concrete concept - a parse rule - and using the camelCase style starting with a capital makes it more apparent and avoids confusing it for a more regular function (such as zeroOrMore(...), token, etc).
Imports


Order imports alphabetically
Except... put imports from the same project at the top
Use wildcards when many things from the same package are required, ie:

import org.neo4j.graphdb._


Use wildcards when it's reasonable that the class(es) in the file should work with anything from the imported package, ie:

import org.neo4j.cypher.internal.compiler.v2_1.commands._


Import the base of the local package and then use relative imports, ie:

import org.neo4j.cypher.internal.compiler.v2_1._
import commands._


Do not use relative imports from other packages
Import package names to qualify different 'types' of imports, ie:

import org.neo4j.cypher.internal.compiler.v2_1.ast
import scala.collection.mutable

Qualifying the names makes it obvious to the reader which 'type' of thing is being used (e.g. mutable.Map, ast.Statement).
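As a small illustration of package-qualified names, the sketch below imports the scala.collection.mutable package rather than its members, so the use site shows which kind of Map is meant. The names QualifiedImportSketch and countWords are hypothetical and only serve the example.

```scala
import scala.collection.mutable

object QualifiedImportSketch {
  // Hypothetical helper: counts how often each word occurs.
  def countWords(words: Seq[String]): Map[String, Int] = {
    // The `mutable.` prefix makes the local mutability explicit to the reader.
    val counts = mutable.Map.empty[String, Int]
    words.foreach(w => counts(w) = counts.getOrElse(w, 0) + 1)
    // Hand an immutable Map back to the caller.
    counts.toMap
  }
}
```

The unqualified Map in the return type is the default immutable one, while the qualified mutable.Map stays visibly distinct inside the method body.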
Pattern matching


Collapse matches wherever possible, ie:

seq.map {
  case Some(x) => 1
  case None => 0
}


Avoid vertically aligning the =>. When vertical alignment is used, changing any line in the pattern requires all other lines be reformatted. This makes it hard to see the significant change that occurred.
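To make "collapsing" concrete, here is a minimal sketch contrasting the two forms. The object name and sample data are made up for illustration.

```scala
object CollapsedMatchSketch {
  val seq: Seq[Option[String]] = Seq(Some("a"), None, Some("b"))

  // Uncollapsed: names an argument only to match on it immediately.
  val verbose: Seq[Int] = seq.map { maybe =>
    maybe match {
      case Some(_) => 1
      case None => 0
    }
  }

  // Collapsed: the block of cases is itself the function passed to map.
  val collapsed: Seq[Int] = seq.map {
    case Some(_) => 1
    case None => 0
  }
}
```

Both forms produce the same result; the collapsed one simply drops the intermediate name and one level of nesting.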

Method invocation

Scala allows methods to be invoked without the use of a period (.). I've found this useful only for infix methods (arity-1) that are side-effect free.
This is ok:
  val seenUser = myService hasSeenUser "sam"

This is not ok:
  myService deleteUser "joe"
  val userSet = myService handledUsers

See more examples in the Scala Style Guide - Method Invocation.
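The following sketch puts the three calls above into a compilable context. UserService and its implementation are hypothetical; only the method names come from the examples above.

```scala
import scala.collection.mutable

object InfixSketch {
  // Hypothetical service, used only to illustrate the call styles.
  class UserService(initial: Set[String]) {
    private val users = mutable.Set.empty[String] ++= initial

    def hasSeenUser(name: String): Boolean = users.contains(name) // pure, arity-1
    def deleteUser(name: String): Unit = users -= name            // side-effecting
    def handledUsers: Set[String] = users.toSet                   // arity-0
  }

  val myService = new UserService(Set("sam", "joe"))

  // Infix reads well here: one argument, no side effects.
  val seenUser = myService hasSeenUser "sam"

  // Dot notation keeps the side effect and the arity-0 call explicit.
  myService.deleteUser("joe")
  val userSet = myService.handledUsers
}
```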
Types

General parameters, specific results

Be general in what parameter types a function accepts and specific in what type it returns. ie:
def process(things: Seq[String]): Vector[String] = things.toVector.sorted

This allows the caller to behave the most appropriately with the value returned, given it knows the specific type information. Yet it also allows the caller to provide any type of sequence it likes (as the process method doesn't care).
However, for class methods, do this only when the method will not be overridden (i.e. in case classes). For methods of traits or abstract classes, be general in the parameters and in the return type.
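A short sketch of what this buys the caller, assuming the result type is Vector[String]. The object name and sample values are illustrative only.

```scala
object ProcessSketch {
  // General parameter (any Seq), specific result (Vector).
  def process(things: Seq[String]): Vector[String] = things.toVector.sorted

  // Callers may pass a List, a Vector, or any other Seq...
  val fromList: Vector[String] = process(List("pear", "apple"))
  val fromVector: Vector[String] = process(Vector("fig", "date"))

  // ...and can rely on Vector-specific properties of the result,
  // such as efficient indexed access.
  val first: String = fromList(0)
}
```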
Prefer methods over functions

Scala is an object-oriented language. So prefer methods over functions - whenever a function operates primarily on a specific type, make it a method of that type.
ie. avoid:
case class User(first: String, last: String)
case class Greetings() {
  def sayHi(u: User) = "Hello " + calculateFullName(u)

  private def calculateFullName(u: User) = u.first + " " + u.last
}

Instead, include the method (or val/lazy val) in the class itself:
case class User(first: String, last: String) {
  def fullName = first + " " + last
}
case class Greetings() {
  def sayHi(u: User) = "Hello " + u.fullName
}

If it's not possible to add the method directly to the class, or if the method only makes sense within a limited context (and shouldn't be available everywhere), then prefer to add it using the "enrich my library" pattern. For example, a module might want to calculate something specific to its domain:
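object Links {
  // A value class must declare its single parameter as a val.
  implicit class IdentifiableUser(val u: User) extends AnyVal {
    def notVeryOpaqueId = (u.first + u.last).reverse
  }
}

case class Links() {
  // Bring the companion's implicit class into scope.
  import Links._

  def userLink(u: User) = "http://my.service/" + u.notVeryOpaqueId
}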

Implicits

As the Twitter guide states succinctly: "Implicits are a powerful type system feature, but they should be used sparingly". They also point out some situations where it's definitely OK to use them:

Extending or adding a Scala-style collection
Adapting or extending an object (“pimp my library” pattern)
Use to enhance type safety by providing constraint evidence
To provide type evidence (typeclassing)
For Manifests

One thing I've found particularly confusing is the use of implicit parameters. Avoid them.
Adapting or extending objects ("enrich my library" pattern)

In my exploration of Scala, I've not encountered most of the situations described above, except for the “enrich my library” pattern (aka "pimp my library"). This is a commonly used approach to achieve the clarity of method invocations (vs functions), even where the original type did not provide the required method. When following this approach, I've found that it's ideal to use Value Classes to avoid unnecessary allocation overhead, i.e.:
implicit class NameableUser(val u: User) extends AnyVal {
  def fullName = u.first + " " + u.last
}
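For completeness, a self-contained sketch of how such an enrichment is brought into scope and used. The wrapper names EnrichSketch and UserSyntax are hypothetical, and the wrapped value is declared as a val because value classes require it.

```scala
object EnrichSketch {
  case class User(first: String, last: String)

  object UserSyntax {
    // Value class wrapper: in most cases no wrapper instance is allocated.
    implicit class NameableUser(val u: User) extends AnyVal {
      def fullName: String = u.first + " " + u.last
    }
  }

  // The extension method is only visible where it is imported.
  import UserSyntax._

  val greeting: String = "Hello " + User("Ada", "Lovelace").fullName
}
```

Keeping the implicit class inside a dedicated object means callers opt in with an import, which limits where the extra method appears.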

Lifting to a more general type

One other case (beyond the above) where I've found implicits useful is lifting to a more general type, rather than providing specializations. This helps to avoid duplicated code.
A couple of examples are helpful.
A(n overly) simplistic example

trait Service {
  def foo(msg: String)
  def foo(msg: Seq[String])
  def foo(msg: Int)
  def bar(msg: String)
  def bar(msg: Seq[String])
  def bar(msg: Int)
}

...
  service.foo("bar")
  service.foo(1)
  service.bar(Seq("foo", "bar"))
...

By using a general type for the msg, the numerous specializations can be eliminated:
trait Message
case class StringsMessage(ss: Seq[String]) extends Message
case class StringMessage(s: String) extends Message
case class IntMessage(int: Int) extends Message

trait Service {
  def foo(msg: Message)
  def bar(msg: Message)
}

But this requires explicitly constructing the general type, which is not always helpful due to the additional complexity it adds in the usage:
...
  service.foo(StringMessage("bar"))
  service.foo(IntMessage(1))
  service.bar(StringsMessage(Seq("foo", "bar")))
...

Instead, we can use implicit conversions to ensure the correct general type is provided:
object Message {
  implicit class StringsMessage(val ss: Seq[String]) extends AnyVal with Message
  implicit class StringMessage(val s: String) extends AnyVal with Message
  implicit class IntMessage(val int: Int) extends AnyVal with Message
}
trait Message extends Any

trait Service {
  def foo(msg: Message)
  def bar(msg: Message)
}

...
  service.foo("bar")
  service.foo(1)
  service.bar(Seq("foo", "bar"))
...

Whilst this is a good example to describe the concept, it's questionable how appropriate this pattern is in this case. A real example is more useful...
A real example

This example defines a general type Check, which is a specific function type and provides a composition method ifOkThen (this is similar to the andThen operation of Function1).
trait State extends Map[String, Any]

trait Check extends (State => Seq[String]) { self =>
  def ifOkThen(next: Check): Check = new Check {
    def apply(s: State) = {
      // Call the outer check via `self`; a bare apply(s) here would recurse.
      val result = self.apply(s)
      if (result.nonEmpty)
        result
      else
        next.apply(s)
    }
  }
}

Usage may be as follows:
object firstCheck extends Check {
  def apply(s: State) = ...
}
object alwaysError extends Check {
  def apply(s: State) = Seq("something went wrong!")
}

object allChecks {
  def apply(state: State) = {
    val composedChecks = firstCheck ifOkThen alwaysError
    composedChecks(state)
  }
}

In many cases, the user of this pattern may have checks that are one-offs, and wish to use inline. For example:
    val composedChecks = firstCheck ifOkThen Seq("This should have failed the check!")

or
    val composedChecks = firstCheck ifOkThen ((s: State) => {
      if (s.contains("x"))
        Seq("State shouldn't contain 'x'!")
      else
        Seq.empty
    })

Allowing this overloaded usage could be done by having specializations of the composition method, i.e. by adding def ifOkThen(err: Seq[String]) = ... and def ifOkThen(f: State => Seq[String]) = .... However, this can get unwieldy if there are multiple composition methods. Instead, this can be achieved by having the composition methods take a single general type and making careful use of implicit conversions:
object Check {
  implicit class ErrorReturningCheck(errors: Seq[String]) extends Check {
    def apply(s: State) = errors
  }
  implicit class LiftedCheck(f: State => Seq[String]) extends Check {
    def apply(s: State) = f(s)
  }
}

Be sure to only declare the implicit class within the scope of the object you wish to convert to.
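Putting the pieces together, here is a runnable sketch of the whole pattern. It makes two assumptions that are not in the text above: State is simplified to a type alias so that it is easy to construct, and hasId is a hypothetical check standing in for firstCheck.

```scala
object CheckSketch {
  // Assumption: State is a plain alias here rather than a trait.
  type State = Map[String, Any]

  trait Check extends (State => Seq[String]) { self =>
    // Runs `next` only if this check reported no errors.
    def ifOkThen(next: Check): Check = new Check {
      def apply(s: State): Seq[String] = {
        val result = self.apply(s)
        if (result.nonEmpty) result else next.apply(s)
      }
    }
  }

  object Check {
    // Lifts a fixed list of errors into a Check.
    implicit class ErrorReturningCheck(errors: Seq[String]) extends Check {
      def apply(s: State): Seq[String] = errors
    }
    // Lifts a plain function into a Check.
    implicit class LiftedCheck(f: State => Seq[String]) extends Check {
      def apply(s: State): Seq[String] = f(s)
    }
  }

  // Hypothetical check: the state must contain an "id" entry.
  object hasId extends Check {
    def apply(s: State): Seq[String] =
      if (s.contains("id")) Seq.empty else Seq("missing id")
  }

  // A function literal and a plain Seq are both accepted where a Check is expected.
  val composed: Check = hasId ifOkThen ((s: State) =>
    if (s.contains("x")) Seq("State shouldn't contain 'x'!") else Seq.empty)
  val alwaysFails: Check = hasId ifOkThen Seq("This should have failed the check!")
}
```

Because the implicit classes live in the companion object of Check, they are found wherever a Check is expected, without any import at the call site.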
Functional over Procedural

Scala encourages a functional style which, due to its focus on data flow, makes heavy use of collections and the operators over them. Prefer to use this approach. I've also discovered a few preferences to make this effective.

Prefer immutability almost always. Avoid var.
Prefer to use flatMap, map, collect, etc, over explicit procedural code and procedural checks such as if.
Use Option[] as a typesafe alternative to null whenever possible. Note that Option behaves like a collection of at most one element.
Prefer to use the collection operations on Option[].

Do this:
val maybeUser: Option[User]
...
val name = maybeUser.fold("unknown")(_.name)

instead of:
val name = maybeUser match {
  case Some(u) => u.name
  case None => "unknown"
}
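A runnable version of the comparison, assuming a hypothetical User with a name field (the User defined earlier in this document has first and last instead).

```scala
object OptionSketch {
  // Hypothetical User with a `name` field, as the snippet above assumes.
  case class User(name: String)

  // fold takes the default for None first, then the function for Some.
  def displayName(maybeUser: Option[User]): String =
    maybeUser.fold("unknown")(_.name)

  // The same result, written as map followed by getOrElse.
  def displayNameAlt(maybeUser: Option[User]): String =
    maybeUser.map(_.name).getOrElse("unknown")
}
```

Some readers find map followed by getOrElse easier to scan than fold; either avoids the explicit match.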