Skip to content

Instantly share code, notes, and snippets.

@djspiewak
Created November 28, 2016 16:21
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save djspiewak/9f6feadab02b16829c41484b394d16e4 to your computer and use it in GitHub Desktop.
Save djspiewak/9f6feadab02b16829c41484b394d16e4 to your computer and use it in GitHub Desktop.

Coherence Domains in Scala

Miles probably already thought of this idea and found some problem with it, but I'm going to go ahead and propose it anyway so that he has an easy way of telling me it's a bad idea. :-)

In order to solve the problem of typeclass coherence in a system which supports local instances (Scala), I propose that we introduce a tagging type – which I call a coherence domain – on all implicit values, regardless of their declaration scope. This tagging type could be encoded as a type member on the type of the implicit value itself, but that is awkward, exposes some details of the machinery in user-facing APIs, and also requires a macro to fully implement. This proposal requires a small, backwards-compatible change to the type system, an unambiguous and consistent change to the syntax (also backwards-compatible), and a minor revision to the implicit resolution rules.

Domains

Declaration

Currently, all implicit values are declared in the following form:

implicit val foo: Bar = ???

When multiple implicit values of type Bar are in scope, they are (rightly) considered ambiguous, since the implicit resolution rules cannot just arbitrarily pick one or the other (since they may have different semantics). I'm proposing that we enrich the above syntax to add a type parameter to the implicit modifier. In all of my examples, I'm going to use braces, because I think it feels syntactically consistent with private/protected modifiers, but parentheses would also be acceptable:

implicit[U] val foo: Bar = ???

This does not introduce a new type variable. Rather, U is assumed to be a type that is in scope. There is nothing special about these types, and I would expect that most of them would be defined in the following way:

object U
type U = U.type

What this declaration syntax achieves is declaring to the compiler that the foo implicit value has type Bar in domain U. One can view it as analogous to the following declaration:

implicit val foo: Bar { type Domain = U } = ???

The difference is two-fold:

  • We don't have to add a tagging type (Domain) to Bar
  • The compiler can be aware of this mechanism at a deeper level

Type varaibles may also be used to instantiate implicit domains:

def baz[U] = {
  implicit[U] val foo: Bar = ???
}

As I said: it's just a type (of kind *).

If the domain is left unspecified, then the implicit is considered to be in a fresh domain which is existential to that declaration:

// still valid!
implicit val foo: Bar = ???

The above is logically equivalent to the following:

implicit[<fresh>] val foo: Bar = ???

Where <fresh> denotes the synthesis of a fresh singleton type.

Reference

Whenever an implicit value is referenced in an implicit parameter block, we will now have the option of pulling out its domain:

object U
type U = U.type

def baz(implicit[U] foo: Bar) = ???

This declaration is analogous to the following and has equivalent semantics:

object U
type U = U.type

def baz(implicit foo: Bar { type Domain = U }) = ???

When multiple implicit parameters are declared, they all share the domain constraint:

def baz(implicit[U] foo: Bar, bip: Bip) = ???

This selects implicit values of type Bar and Bip, both of which must be declared in the domain U. If there is an implicit value of type Bar in scope, but it has a different domain (or no domain), then it is considered a distinct type and not selected.

It may be desirable to declare an implicit block as taking multiple parameters, all of which have the same domain, without actively constraining that domain to one or the other. This may be accomplished in one of two (consistent) ways. First, one may simply bind the domain to a type parameter:

def baz[C](implicit[C] foo: Bar, bip: Bip) = ???

Both foo and bip must have the same domain, which is bound to type C. As with any type parameter, this is a valid type and may be used in subsequent computation just as with any other type.

This produces a user-visible change in the baz type arity though, as now the function has an added type parameter. This may be avoided in cases where it is unnecessary to capture the domain type by using underscore (_):

def baz(implicit[_] foo: Bar, bip: Bip) = ???

As with every time the implicit[...] syntax is employed, both foo and bip must have the same coherence domain. However, that coherence domain is allowed to remain anonymous and is not captured by any type in scope. Thus, the only thing declared by this signature is the fact that the two coherence domains (of Bar and Bip) are consistent, but there is no constraint on how they are consistent.

As coherence domains are just types, it is naturally possible to compute a desired coherence domain in a prior parameter block:

def baz[C](c: C)(implicit[C] foo: Bar, bip: Bip) = ???

Additionally, all the standard rules for type constraints apply:

def baz[C <: Foo](implicit[C] foo: Bar, bip: Bip) = ???

def baz(implicit[_ <: Foo] foo: Bar, bip: Bip) = ???

I have no idea why that sort of feature would be useful, but there it is. In general, I wouldn't recommend utilizing subtyping with coherence domains, since it directly defeats the primary purpose of the mechanism.

As with declarations, the "bare" implicit modifier is still supported, and has a well-defined semantic:

def baz(implicit foo: Bar, bip: Bip) = ???

This declaration is logically equivalent to the following:

def baz(implicit[<fresh>] foo: Bar, bip: Bip) = ???

Where the <fresh> "syntax" indicates a new type variable for each parameter. In other words, the bare implicit modifier, when used in a def, indicates a set of implicit values which have individually-existential coherence domains. No constraint is defined. Thus, the bare implicit syntax in this proposal is semantically equivalent to the same syntax in current Scala.

Resolution

As mentioned, the implicit resolution algorithm must ensure that the domain(s) of automatically selected implicit values match the declared domain of the declaration site. Thus:

implicit[A] val foo1: Bar = ???
implicit[B] val foo2: Bar = ???

foo1 and foo2 are not ambiguous relative to each other, so long as the def-site declares a domain. Let's assume we have the following function:

def implicitC[C, A](implicit[C] a: A): A = a

If we summon the Bar implicit for a given domain, we will get that implicit, unambiguously:

implicitC[A, Bar]  // => foo1
implicitC[B, Bar]  // => foo2

However, if we summon without declaring a domain, the results are ambiguous:

implicitly[Bar]    // error!

Of course, implicitly is declared with the "bare" implicit modifier, but it is equivalent to the results we would see in any other unconstrained domain scenario:

def thing1[C](implicit[C] bar: Bar) = ???
def thing2(implicit[_] bar: Bar) = ???

// both of the following fail to compile (ambiguous implicits)
thing1
thing2

Thus, domain constraints are part of the implicit type, and while they can resolve ambiguities, they do not guarantee it.

Coherence

Thus far, nothing we have done actually produces a significant gain in expressiveness, nor does it solve the underlying problem: multiple implicit values of the same type computed through different paths which are mutually ambiguous. To solve this problem, we need to make one more change to the implicit resolver: implicits of the same type and the same domain in the same scope are considered to be unambiguous, and the precise value selected is unspecified:

implicit[U] val foo1: Bar = ???
implicit[U] val foo2: Bar = ???

implicitC[U, Bar]   // => foo1 (or foo2, it doesn't matter)

The implicitC summon is not ambiguous. Both foo1 and foo2 are declared to have the same coherence domain – in this case, U – and thus they are not mutually ambiguous.

In more semantic terms, this means that a coherence domain is effectively declaring that any types derived from this implicit type will have substitutable semantics as the same type derived from other implicit types in the same domain. Thus, a coherence domain is a declarative form of local coherence.

Objections

The primary objection I can see to this proposal is the fact that this is effectively adding information to a type, but only when the value of that type is declared implicit. As an example of where this is a little weird, consider explicitly passing implicit parameters:

def foo(implicit[_] bar: Bar, baz: Baz) = ???

val bar: Bar = ???
val baz: Baz = ???

foo(bar, baz)   // wut?

The only sane thing to do here is to reject such invocations: implicit parameters which have a non-fresh (i.e. non "bare implicit") domain cannot be passed explicitly. On the surface, this is a somewhat frustrating asymmetry with "normal" types and values, but I think you can make an argument that it is justified. Implicit values are weird from a domain design standpoint. They cannot and should not be treated as just another form of explicit value, and trying to do this is what creates incoherence in the first place.

Another objection is the fact that coherence is not enforced. By that I mean, a coherence domain is declaring that equivalent types derived within the same domain have substitutable semantics, but there is no enforcement of this fact. It is a declaration and an assertion, and nothing more. I don't see a way around this problem, since substitutability is simply not generally verifiable in calculii as powerful as Scala. This problem is literally equivalent to the fact that vacuous definitions are not forbidden:

def allImplyFoo[A](a: A): Foo = allImplyFoo(a)

Clearly, the above definition is a "lie", but Scala does not and cannot forbid it. For the same reason, Scala does not and cannot forbid invalid coherence domains. If you declare two equivalent types with non-substitutable semantics to be members of the same domain, your call-site semantics will be non-deterministic.

Random Benefits

As a neat sidebar, and perhaps as a sanity check to make sure this proposal does indeed solve the problems I claim it solves, this proposal provides a nifty solution to the "ISet problem". Specifically, the ISet problem is the fact that all of the operations on ISet take an Order constraint, but there is no guarantee that those constraints are coherent. If the user has different constraints on the same type in different scopes, they may "lose" values within the set, and may even entirely corrupt the data structure. For example:

val orig: Order[Int] = Order[Int]
val nums: ISet[Int] = ISet(1, 2, 3, 4)

{
  implicit val psyche_!: Order[Int] = orig.reverse

  nums(1)   // => false (probably?)
}

With coherence domains, ISet can reflect this need for cross-member coherence in its type signature. Specifically:

class ISet[C, A] {
  def +(a: A)(implicit[C] ord: Order[A]): ISet[C, A] = ???
  def apply(a: A)(implicit[C] ord: Order[A]): Boolean = ???
}

object ISet {
  def apply[C, A](as: A*)(implicit[C] ord: Order[A]): ISet[C, A] = ???
}

In our problematic example, the psyche_! value would not be considered for implicit resolution at the nums(1) call site, since it does not share the declared domain from the nums declaration. Thusly:

val orig: Order[Int] = Order[Int]
val nums: ISet[scalaz.U, Int] = ISet(1, 2, 3, 4)

{
  implicit val psyche_!: Order[Int] = orig.reverse

  nums(1)   // => true
}

The nums(1) call site would ignore psyche_!, since it does not have the scalaz.U coherence domain.

In practice, I would expect that structures like ISet would require a concrete coherence domain (e.g. scalaz.U), rather than inferring it, to avoid adding a phantom type to the public signature.

@Blaisorblade
Copy link

This looks cool. It's almost the beginning of a paper, though "Random benefits" should be evaluation, and I think you should also evaluate the modularity benefits.

However, it's yet incomplete. Even if you want all your instances to be coherent, you still want to deal with orphans. Otherwise, you end up with situations like the Haskell lens library depending on half of Hackage just to supply instances for them—which IMHO is, at best, a necessary evil if you care about modularity. At least, requiring Kmett maintainers to bless other packages as worth depending over isn't modular, and adding them as dependencies to users of lens is waste. I strongly expect Odersky cares about modularity.

There are also use cases for non-coherent implicits, and it would be good to show they'd still work well under this proposal. In fact, I think the proposal needs some tuning to preserve expressiveness—I have a strong intuition that if I want to override some instance from a library while keeping others, I need a hierarchy of domains. Probably using inheritance. Since domains promise coherence, they'd need to disable instances they want to override—which is problematic since domains are open.

The alternatives this competes with are Okasaki's scads, that is basically ML modules (https://github.com/chrisokasaki/scads/blob/master/design/heaps.md), which I still find more compelling, though it needs some education to use, and Modular Type Classes (+ applicative functors), which does provide modularity and local coherence but still requires lots of overhead.

@Blaisorblade
Copy link

@djspiewak Regarding orphans, I'm not saying this is not covered by your proposal—I'm asking you to discuss whether it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment