@nikomatsakis nikomatsakis/0.md Secret
Last active Aug 29, 2015

The downsides of negative reasoning

I recently realized that our current trait system contains a forward compatibility hazard concerning negative reasoning. The TL;DR is that negative reasoning (that is, the compiler saying that "doing X is legal if the trait T is NOT implemented for some type U") can easily make it impossible to add impls of any traits in a backwards compatible way. The most obvious example of negative reasoning is negative trait bounds, which have been proposed in a rather nicely written RFC (#586). However, even without negative bounds, the trait system as currently implemented already has some amount of negative reasoning, in the form of the coherence system.

There is a fair amount of background material here. Therefore I've broken this gist up into various sections, each in its own file:

  1. The problem: explains why negative reasoning is dangerous.
  2. A simple fix: a simple proposal that we can use to avoid the problem. This solution is somewhat limiting and we may wish to evolve it to more comprehensive solutions in the longer term.
  3. Alternatives: other avenues that I explored and which may be the kernel of a more permissive solution.

I plan on opening an RFC regarding the simple solution shortly.


A goal

Let me start out with an implicit premise. I think it's important that we be able to add impls of existing traits to existing types without breaking downstream code (that is, causing it to stop compiling, or causing it to do radically different things). Let me give you a concrete example. libstd defines the Range<T> type. Right now, this type is not Copy for various good reasons. However, we might like to make it Copy in the future. It feels like that should be legal. However, as I'll show you below, this could in fact cause existing code not to compile. I think this is a problem.

Negative reasoning in coherence today, the simple case

"Coherence" refers to a set of rules that Rust uses to enforce the idea that there is at most one impl of any trait for any given set of input types. Let me introduce an example crate hierarchy that I'm going to be coming back to throughout the gist:

libstd
  |
  +-> lib1 --+
  |          |
  +-> lib2 --+
             |
             v
            app

This diagram shows four crates: libstd, two libraries (creatively titled lib1 and lib2), and an application app. app uses both of the libraries (and, transitively, libstd). The libraries are otherwise defined independently from one another. We say that libstd is a parent of the other crates, and that lib1 and lib2 are cousins.

OK, so, imagine that lib1 defines a type Carton but doesn't implement any traits for it. This is a kind of smart pointer, like Box.

// In lib1
struct Carton<T> { ... }

Now imagine that the app crate defines a type AppType that uses the Debug trait.

// In app
struct AppType { ... }
impl Debug for AppType { ... }

At some point, app has a Carton<AppType> that it is passing around, and it tries to use the Debug trait on that:

// In app
fn foo(c: Carton<AppType>) {
    println!("foo({:?})", c); // Error
    ...
}

Uh oh, now we encounter a problem because there is no impl of Debug for Carton<AppType>. But app can solve this by adding such an impl:

// In app
impl Debug for Carton<AppType> { ... }

You might expect this to be illegal per the orphan rules, but in fact it is not, and this is no accident. We want people to be able to define impls on references and boxes to their types. That is, since Carton is a smart pointer, we want impls like the one above to work, just like you should be able to do an impl on &AppType or Box<AppType>.

OK, so, what's the problem? The problem is that now maybe lib1 notices that Carton should define Debug, and it adds a blanket impl for all types:

// In lib1
impl<T:Debug> Debug for Carton<T> { ... }

This seems like a harmless change, but now if app tries to recompile, it will encounter a coherence violation.

What went wrong? Well, if you think about it, even a simple impl like

impl Debug for Carton<AppType> { ... }

contains an implicit negative assertion that no ancestor crate defines an impl that could apply to Carton<AppType>. This is fine at any given moment in time, but as the ancestor crates evolve, they may add impls that violate this negative assertion.

Negative reasoning in coherence today, the more complex case

The previous example was relatively simple in that it only involved a single trait (Debug). But the current coherence rules also allow us to concoct examples that employ multiple traits. For example, suppose that app decided to work around the absence of Debug by defining its own debug protocol. This uses Debug when available, but allows app to add new impls if needed.

// In lib1 (note: no `Debug` impl yet)
struct Carton<T> { ... }

// In app, before `lib1` added an impl of `Debug` for `Carton`
trait AppDebug { ... }
impl<T:Debug> AppDebug for T { ... } // Impl A

struct AppType { ... }
impl Debug for AppType { ... }
impl AppDebug for Carton<AppType> { ... } // Impl B

This is all perfectly legal. In particular, implementing AppDebug for Carton<AppType> is legal because there is no impl of Debug for Carton, and hence impls A and B are not in conflict. But now if lib1 should add the impl of Debug for Carton<T> that it added before, we get a conflict again:

// Added to lib1
impl<T:Debug> Debug for Carton<T> { ... }

In this case though the conflict isn't that there are two impls of Debug. Instead, adding an impl of Debug caused there to be two impls of AppDebug that are applicable to Carton<AppType>, whereas before there was only one.
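The pre-conflict setup (before lib1 adds its Debug impl) can be reproduced in a single crate on today's compiler; it is accepted precisely because the overlap check concludes that Carton<AppType> does not implement Debug. (The method bodies here are minimal stand-ins of my own for the elided `...` parts.)

```rust
use std::fmt::Debug;

struct Carton<T>(T); // stand-in body for lib1's smart pointer

trait AppDebug {
    fn app_debug(&self) -> String;
}

// Impl A: blanket impl piggy-backing on `Debug`.
impl<T: Debug> AppDebug for T {
    fn app_debug(&self) -> String {
        format!("{:?}", self)
    }
}

#[derive(Debug)]
struct AppType;

// Impl B: accepted only because the overlap check can rely on the
// negative fact that `Carton<AppType>: Debug` does not hold.
impl AppDebug for Carton<AppType> {
    fn app_debug(&self) -> String {
        "Carton(AppType)".to_string()
    }
}

fn main() {
    assert_eq!(AppType.app_debug(), "AppType");
    assert_eq!(Carton(AppType).app_debug(), "Carton(AppType)");
}
```

Adding `impl<T: Debug> Debug for Carton<T>` to this crate makes impls A and B overlap for Carton<AppType>, which is exactly the conflict described above.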

Negative reasoning from OIBIT and RFC 586

The conflicts I showed before have one thing in common: when we add an impl in the supercrate, we cause there to be too many impls in downstream crates. This is an important observation, because it can potentially be solved by specialization or some other form of conflict resolution -- basically a way to decide between those duplicate impls (see below for details).

I don't believe it is possible today to have the problem where adding an impl in one crate causes there to be too few impls in downstream crates, at least not without enabling some feature-gates. However, you can achieve this easily with OIBIT and RFC 586. This suggests to me that we want to tweak the design of OIBIT -- which has been accepted, but is still feature-gated -- and we do not want to accept RFC 586 (at least not without further thought).

I'll start by showing what I mean using RFC 586, because it's more obvious. Consider this example of a trait Release that is implemented for all types that do not implement Debug:

// In app
trait Release { ... }
impl<T:!Debug> Release for T { ... }

Clearly, if lib1 adds an impl of Debug for Carton, we have a problem in app, because whereas before Carton<i32> implemented Release, it now does not.

Unfortunately, we can create this same scenario using OIBIT:

trait Release for .. { ... }
impl<T:Debug> !Release for T { ... }

In practice, these sorts of impls are both feature-gated and buggy (e.g. #23072), and there's a good reason for that. When I looked into fixing the bugs, I realized that this would entail implementing essentially the full version of negative bounds, which made me nervous. It turns out we don't need conditional negative impls for most of the uses of OIBIT that we have in mind, and I think that we should forbid them before we remove the feature-gate.

Still to come.

OK, so this file laid out the problem. The other files in this gist are concerned with exploring possible solutions that I see, starting with a simple fix and then looking at other future solutions I investigated.

A proto-RFC for a proposed solution

Summary

This RFC proposes two rule changes:

  1. Modify the orphan rules so that impls of remote traits require a local type that is either a struct/enum/trait defined in the current crate (LT = LocalTypeConstructor<...>) or a reference to a local type (LT = ... | &LT | &mut LT).
  2. Restrict negative reasoning so it too obeys the orphan rules.

Motivation

The current orphan rules are oriented around allowing as many remote traits as possible. As so often happens, giving power to one party (in this case, downstream crates) turns out to be taking power away from another (in this case, upstream crates). The problem is that due to coherence, the ability to define impls is a zero-sum game: every impl that is legal to add in a child crate is also an impl that a parent crate cannot add without fear of breaking downstream crates. A detailed look at these problems is presented here; this RFC doesn't go over the problems in detail, but will reproduce some of the examples found in that document.

This RFC proposes a shift that attempts to strike a balance between the needs of downstream and upstream crates. In particular, we wish to preserve the ability of upstream crates to add impls to traits that they define, while still allowing downstream crates to define the sorts of impls they need.

While exploring the problem, we found that in practice remote impls almost always are tied to a local type or a reference to a local type. For example, here are some impls from the definition of Vec:

// tied to Vec<T>
impl<T> Send for Vec<T>
    where T: Send

// tied to &Vec<T>
impl<'a,T> IntoIterator for &'a Vec<T>

On this basis, we propose that we limit remote impls to require that they include a type either defined in the current crate or a reference to a type defined in the current crate. This is more restrictive than the current definition, which merely requires a local type appear somewhere. So, for example, under this definition MyType and &MyType would be considered local, but Box<MyType>, Option<MyType>, and (MyType, i32) would not.
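To make the boundary concrete, here is a single-crate sketch (orphan rules only bite across crates, so this merely illustrates the shapes of the impl headers; MyType and the impl bodies are stand-ins of my own). IntoIterator is used for the reference case because std already blankets Display over every &T:

```rust
use std::fmt;

struct MyType;

// OK under the proposal: `MyType` itself is a local struct.
impl fmt::Display for MyType {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "MyType")
    }
}

// Also OK: a reference to a local type.
impl<'a> IntoIterator for &'a MyType {
    type Item = ();
    type IntoIter = std::iter::Once<()>;
    fn into_iter(self) -> Self::IntoIter {
        std::iter::once(())
    }
}

// NOT OK under the proposal (a newtype would be needed instead):
// impl fmt::Display for Option<MyType> { ... }
// impl fmt::Display for (MyType, i32) { ... }

fn main() {
    assert_eq!(format!("{}", MyType), "MyType");
    assert_eq!((&MyType).into_iter().count(), 1);
}
```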

Furthermore, we limit the use of negative reasoning to obey the orphan rules. That is, just as a crate cannot define an impl Type: Trait unless Type or Trait is local, it cannot rely that Type: !Trait holds unless Type or Trait is local.

Practical effect

Effect on parent crates

When you first define a trait, you must also decide whether that trait should have (a) a blanket impl for all T and (b) any blanket impls over references. These blanket impls cannot be added later without a major version bump, for fear of breaking downstream clients.

Here are some examples of the kinds of blanket impls that must be added right away:

impl<T:Foo> Bar for T { }
impl<'a,T:Bar> Bar for &'a T { }
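std's ToString is a real-world instance of this up-front commitment: the blanket impl over all Display types was baked in from the start, so downstream crates only ever implement Display and get to_string for free. (Celsius below is a made-up example type.)

```rust
use std::fmt;

// A downstream type that implements only `Display`; it picks up
// `to_string` through std's pre-existing blanket impl
// `impl<T: fmt::Display + ?Sized> ToString for T`.
struct Celsius(f64);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}°C", self.0)
    }
}

fn main() {
    assert_eq!(Celsius(21.5).to_string(), "21.5°C");
}
```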

Effect on child crates

A prototype of these rules has been implemented. In compiling libstd and librustc, exactly two impls were found to be illegal, both of which followed the same pattern:

struct LinkedListEntry<'a> {
    data: i32,
    next: Option<&'a LinkedListEntry<'a>>
}

impl<'a> Iterator for Option<&'a LinkedListEntry<'a>> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        if let Some(ptr) = *self {
            *self = ptr.next;
            Some(ptr.data)
        } else {
            None
        }
    }
}

The problem here is that Option<&LinkedListEntry> is no longer considered a local type. A similar restriction would be that one cannot define an impl over Box<LinkedListEntry>; but this was not observed in practice.

Both of these restrictions can be overcome by using a new type. For example, the code above could be changed so that instead of writing the impl for Option<&LinkedListEntry>, we define a type LinkedList that wraps the option and implement on that:

struct LinkedListEntry<'a> {
    data: i32,
    next: LinkedList<'a>
}

struct LinkedList<'a> {
    data: Option<&'a LinkedListEntry<'a>>
}

impl<'a> Iterator for LinkedList<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        if let Some(ptr) = self.data {
            self.data = ptr.next.data;
            Some(ptr.data)
        } else {
            None
        }
    }
}

Detailed Design

Proposed orphan rules

Given an impl impl<P1...Pn> Trait<T1...Tn> for T0, either Trait must be local to the current crate, or:

  1. At least one type must meet the LT pattern defined above. Let Ti be the first such type.
  2. No type parameters P1...Pn may appear in the type parameters that precede Ti (that is, Tj where j < i).
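As an illustration (not meant to compile; Remote and Local are hypothetical names for a remote trait and a local type), here is how the two rules classify some impl headers:

```
// impl<T> Remote<Local, T> for i32   // OK: Local is the first local type,
//                                    //     and only i32 precedes it
// impl<T> Remote<T, Local> for i32   // rejected: the parameter T precedes
//                                    //     the first local type
// impl Remote<i32, Local> for u32    // OK: no type parameter precedes Local
```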

Type locality and negative reasoning

Currently the overlap check employs negative reasoning to segregate blanket impls from other impls. For example, the following pair of impls would be legal only if MyType<U>: !Copy for all U (the notation Type: !Trait is borrowed from RFC 586):

impl<T:Copy> Clone for T {..}
impl<U> Clone for MyType<U> {..}
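The same shape can be reproduced on today's compiler with a local stand-in trait (MyClone here is hypothetical, since the Clone pair above is only schematic). rustc accepts the pair because MyType<U>: Copy is knowably false:

```rust
trait MyClone {
    fn my_clone(&self) -> Self;
}

// Blanket impl over all `Copy` types.
impl<T: Copy> MyClone for T {
    fn my_clone(&self) -> Self {
        *self
    }
}

struct MyType<U>(U);

// Accepted only because `MyType<U>` has no `Copy` impl, and no other
// crate can add one: the overlap check relies on `MyType<U>: !Copy`.
impl<U: MyClone> MyClone for MyType<U> {
    fn my_clone(&self) -> Self {
        MyType(self.0.my_clone())
    }
}

fn main() {
    assert_eq!(3_i32.my_clone(), 3);
    assert_eq!(MyType(7_i32).my_clone().0, 7);
}
```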

This proposal places limits on negative reasoning based on the orphan rules. Specifically, we cannot conclude that a proposition like T0: !Trait<T1..Tn> holds unless T0: Trait<T1..Tn> meets the orphan rules as defined in the previous section.

In practice this means that, by default, you can only assume negative things about traits and types defined in your current crate, since those are under your direct control. This permits parent crates to add any impls except for blanket impls over T, &T, or &mut T, as discussed before.

Effect on ABI compatibility and semver

We have not yet proposed a comprehensive semver RFC (it's coming). However, this RFC has some effect on what that RFC would say. As discussed above, it is a breaking change to add a blanket impl in which a type parameter is referenced before any type that meets the "local type" grammar. Note that this can only occur in the same crate where a trait is defined.

Examples:

Drawbacks

The primary drawback is that downstream crates cannot write an impl over types other than references, such as Option<LocalType>. This can be overcome by defining wrapper structs (new types), but that can be annoying.

Alternatives

  • Status quo. In the status quo, the balance of power is heavily tilted towards child crates. Parent crates basically cannot add any impl for an existing trait to an existing type without potentially breaking child crates.

  • Annotations on traits. Initial designs included various annotations intended to distinguish "extensible" traits (that is, traits that retain the freedom to add blanket impls) from "inextensible" traits. However, this design was both more complicated (one had to classify each trait as extensible/inextensible) and seemed to strike a less satisfactory balance. In particular, virtually every trait that had to be marked inextensible was so marked because of a desire to implement the trait for &T (with the exception of the linked list iterator example given above). Basically, the design proposed here seems simpler and equally effective.

  • Annotations on types. It might make sense to somehow annotate basic types like Option and Box to indicate that Option<LocalType> can also be considered local. This is a bit odd because any trait that is defined downstream of such a type must decide immediately whether or not to add a blanket impl; in this RFC, that decision is only needed for T, &T, and &mut T (and it's usually a fairly easy call: a blanket impl is appropriate for single-method traits and not much else). But as the set of types grows, it's increasingly likely that one will be overlooked. On the other hand, it would be nice for downstream crates to be able to consider Option<T> or Result<T> to be local.

  • Specializations, negative impls, and contracts. The gist referenced earlier includes a section covering various alternatives that I explored which came up short. These include specialization, explicit negative impls, and explicit contracts between the trait definer and the trait consumer.

Unresolved questions

None.

Alternatives

The scheme I sketched out in the previous file seems to work pretty well in practice. But I've also given a lot of thought to other, more general schemes, and I wanted to record those investigations and their outcomes here for reference. I see a couple of possible strategies:

  1. Specialization is a very powerful technique that almost helps with this problem, but in fact does not.
  2. Using explicit negative impls seemed like a good idea but was not.
  3. Contracts seem like a complicated way to solve the problem; perhaps they can be simplified or streamlined, however.

Specialization

The basic idea of specialization is to say that overlapping impls are OK, so long as one is more specific than the other. My original motivation in exploring specialization was for efficiency: sometimes we want to implement traits in a generic way most of the time, but want to have more efficient impls for particular cases. For example, the zip method on iterators might be more efficient if it knows that both iterators being zipped have a known length that can be checked ahead of time.

However, another use of specialization is for sidestepping coherence limitations. This is because specialization provides a way for the compiler to permit overlapping impls that would otherwise be illegal. In fact, for a time I thought that specialization could completely solve the forwards compatibility problem, but it turns out that this is not true due to associated types. For the moment, though, let's ignore associated types, and I'll show how specialization can help.

For now let's assume a simple, uniform specialization design in which impl I is more specific than some other impl J if the types in impl J can be made to match impl I but not vice versa (I don't think that this is the design I would necessarily advocate, but it is the kind of design we could use to silently, and backwards compatibly, upgrade existing Rust code to avoid forwards compatibility hazards). Let me give you an example:

impl<T> Foo for Box<T> // A
impl Foo for Box<i32>  // B

Here impl B is more specific because you can make Box<T> match Box<i32> by saying that T=i32, but there is no substitution you can apply to make Box<i32> match Box<T> (for some fresh T).

Note that this definition does not consider the where-clauses on the impls. In particular, neither of these two impls are more specific than the other:

impl<T:PartialOrd> Foo for T
impl<U:Ord> Foo for U

If we wished to support cases like this, we would want to have additional, finer-grained ordering mechanisms. We might do this with an explicit precedence, for example, or some other means. But the key point is that this would be a subordering, of lesser importance than the basic ordering I gave above. I'll ignore it for now and just assume that cases like this yield a coherence violation (as they do today).

Specialization is interesting because it gives us a means to resolve conflicts that occur between "blanket" impls (those that involve type parameters) and more specific ones. This means that specialization can help to sidestep "too many impls" style conflicts. This is great because, as we saw before, it is only with explicit negative bounds that we start to see conflicts due to having too few impls.

How specialization can help

Let's work through some of the examples we saw earlier to see what I mean and how it would work. In the first example, the lib1 and app crates defined the following types and impls:

// In lib1
struct Carton<T> { }
impl<T> Debug for Carton<T> { } // added later

// In app
struct AppType { }
impl Debug for AppType { }
impl Debug for Carton<AppType> { }

This caused a conflict because app already defined Debug for Carton<AppType>. Under this specialization-based scheme, however, the impl from app would take precedence because it is more specific than the blanket impl provided by lib1.

Similarly, in the second example, we had the following types and impls:

// In lib1
struct Carton<T> { }
impl<T:Debug> Debug for Carton<T> { } // added later

// In app
trait AppDebug { ... }
impl<T:Debug> AppDebug for T { ... } // Impl A

struct AppType { ... }
impl Debug for AppType { ... }
impl AppDebug for Carton<AppType> { ... } // Impl B

Here we had a conflict because both impl A and impl B could be used to implement AppDebug for Carton<AppType>. Under a specialization-based scheme, however, impl B would be considered more specific than impl A, and hence it would take precedence.

Specialization has another nice property as well. So far I've been focusing on the question of whether adding an impl of Debug in lib1 causes app to stop compiling. But it would also be nice if we could be sure that app would not only compile, but it would do the same thing as before (that is, it would not use the new impls we added to lib1). After all, we know that it doesn't need those impls, because it compiled already without them. Specialization in fact gives us this guarantee, because we know that the new impls that were added will have lower precedence.

If we continue to ignore associated types, we can argue that specialization actually solves all "too many impls"-style conflicts. The basic argument is that the only way to have too many impls is if one of them is more specific than the other, and hence specialization gives us a clear rule about which to pick. (I had a more elaborate proof, but since I realized that associated types cause this scheme not to work, I'll omit it.)

Limitations of specialization: associated types

Once we add associated types into the mix, however, the specialization design I sketched above starts to fall apart. In particular, the problems arise because that design made it possible for any blanket impl to be specialized. This makes it very hard to resolve associated types. To see what I mean, consider this impl:

impl<T> Iterator for Vec<T> {
    type Item = T;
    ...
}

Now, if all blanket impls can be implicitly specialized, there is no way for us to know that there doesn't exist some other, more specialized impl that uses a different binding for Item:

struct Foo;
impl Iterator for Vec<Foo> {
    type Item = Option<i32>;
    ...
}

This means that if the type checker is asked to resolve <Vec<T> as Iterator>::Item, where T is any type involving a type parameter, it can't do so. We'd really like that to resolve to T, but in fact we have to be conservative: after all, what if T=Foo? In that case, <Vec<T> as Iterator>::Item would not be Foo (T) but Option<Foo> (Option<T>).

This implies that we cannot just silently make all impls specializable without breaking a lot of existing code that relies on being able to resolve associated types.
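This reliance is pervasive in ordinary generic code. For instance, a function like the following (a made-up example, written against the real IntoIterator impl for Vec<T> rather than the hypothetical Iterator impl above) type-checks only because the compiler resolves <Vec<T> as IntoIterator>::Item to exactly T:

```rust
use std::ops::Add;

// Type-checks only because `<Vec<T> as IntoIterator>::Item` is known
// to be exactly `T`; a silently-specializable impl with a different
// `Item` binding would break this resolution.
fn double_all<T: Add<Output = T> + Copy>(v: Vec<T>) -> Vec<T> {
    v.into_iter().map(|x| x + x).collect()
}

fn main() {
    assert_eq!(double_all(vec![1, 2, 3]), vec![2, 4, 6]);
}
```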

I tried various experiments to see if we could resolve this problem. For example, I experimented with a "forward proofing" rule which required that, if one impl is more specialized than another according to this definition, then all of their associated type bindings must match. This was an experiment to see whether it was possible to write some kind of "rules" for how associated types should be based on their inputs. Unfortunately, this proved to be untenable. There are numerous trait patterns that we use all the time where this constraint does not hold.

It's also worth pointing out that adding a blanket impl for a trait with no associated types can still cause indirect associated type conflicts. To see what I mean, let's revisit the "indirect" example we saw before, but with an associated type added in:

// In lib1
struct Carton<T> { }
impl<T:Debug> Debug for Carton<T> { } // Impl C (added later)

// In app
trait AppDebug {
    type Output;
}
impl<T:Debug> AppDebug for T { // Impl A
    type Output = ();
}

struct AppType { ... }
impl AppDebug for AppType {
    type Output = i32;
    ...
}
impl AppDebug for Carton<AppType> { // Impl B
    type Output = i32;
    ...
}

As you may recall, the problem is that, initially, there is no Impl C. At that time, we can resolve that <Carton<AppType> as AppDebug>::Output = i32, and that, for some T:Debug, <T as AppDebug>::Output = (). But once we add Impl C, both impls A and B apply, and they have different values for Output. (In fact, so long as we continue to be able to resolve associated types from specializable impls with precision, specialization doesn't really wind up helping at all, because one can mechanically translate all the coherence conflicts I talked about before into associated type conflicts.)

Limitations of specialization: negativity

Another limit of specialization is that it cannot resolve the problem of having too few impls. This isn't a problem yet, but it would be if we adopted (some version of) RFC 586. To see why, consider the negative example based on RFC 586:

trait Release { }
impl<T:!Debug> Release for T { }

If we make some type T implement Debug when it didn't before, we still cause it to not implement Release, even with specialization.

Prognosis

My conclusion is that specialization is not capable of resolving forwards compatibility concerns in general, but it still seems like something that will be very useful in practice. Probably, though, if/when we want to introduce specialization, we will need to use some kind of explicit opt-in where you mark a trait or impl as specializable, so that associated type resolution knows to be careful. If we marked traits as implicitly specializable, perhaps that would enable us to create a system that was immune to forwards compatibility risk; but such a system would be more limited in terms of how you can reason about associated types.

Negative impls

As part of the OIBIT work we have added (feature-gated) negative impls. Briefly I thought that we could use these to resolve our dilemmas by saying that we would only consider a negative proposition like T: !Foo to hold if there was an explicit negative impl. The problem becomes apparent if you imagine trying to encode the rules I described before in this scheme.

For example, one thing we would like is that people can define impl<T> IntoIterator for &Vec<T>. But for this to work, we must know that there will be no blanket impl of IntoIterator for &T. We might try to write:

impl<'a,T> !IntoIterator for &'a T

But this actually doesn't say quite what we want to say. It says that there is no impl of IntoIterator for &'a T -- but that then contradicts the impl for &Vec<T> that the user was trying to add. So in fact the proposition language turns out to be more subtle than this. Which leads us to the next file.

Contracts to limit future expansion

The other approach we could take is to make a kind of contract between the trait author and the trait consumer about what kinds of impls may be added in the future. This relates to the idea that I sketched out above for "orphan rules for negative reasoning". Basically, we could permit looser orphan rules if we know that the trait will not expand in certain ways.

This would allow us to lift some of the restrictions from the current proposal. For example, we might declare that there will be no more blanket impls over Option<T>, etc. The question of who is allowed to specify these kinds of contracts also raises similar coherence concerns. Presumably the contracts would have to follow some sort of orphan rules of their own.

Overall I'm not wild about this direction. It feels complicated and requires people to learn new things which don't really add a lot of cool expressive power. Specialization seems a lot more promising, except that it has limitations around associated types. So it seems that more time and experimentation is needed to see how limiting the earlier proposal turns out to be, and what kinds of patterns we exactly want to enable.
