DanielKeep/0000-named-impls.md Secret

## 0000-named-impls.md

      
Start Date: 2015-06-14
RFC PR: (leave this empty)
Rust Issue: (leave this empty)


TODO: Clarify why we're doing this.
TODO: Why not a private impl?  Make sure to emphasise readability concerns.
TODO: impls can be anywhere in local crates; damn.

Summary

This RFC proposes introducing "named impls" for the purpose of allowing external traits to be implemented for external types without violating coherence.
Prior Discussion


rust-lang/rfcs#493
rust-lang/rfcs#1053

Motivation

Rust relies heavily on traits as a basic unit of abstraction.  However, the current coherence rules introduce an extremely painful problem: external types failing to implement external traits.
This can take a number of forms.  There are two major classes, though:


A type failing to implement a trait which could have reasonably been implemented, but was not.


A type not implementing a trait which it could not know about, but which the user of the type is powerless to implement themselves due to the trait also being external.


There are two workarounds for this, depending on the exact circumstances:


The user can create a new trait that forwards to the existing one, and implement it for all desired types.  This is laborious and brittle.


The user can create a wrapper type around the external type and implement the trait on that.  This requires constant wrapping/unwrapping of the value.
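Both workarounds can be sketched in today's Rust. In this minimal sketch, std::fmt::Display and Vec<i32> stand in for the external trait and the external type; the names Show and Shown are hypothetical and exist only for illustration.

```rust
use std::fmt;

// Workaround 1: a new local trait that mirrors the external one.
// `Show` is a hypothetical name; it has to be implemented again for
// every type that needs it.
trait Show {
    fn show(&self) -> String;
}

impl Show for Vec<i32> {
    fn show(&self) -> String {
        format!("{:?}", self)
    }
}

// Workaround 2: a local wrapper (newtype) around the external type.
// `Shown` is a hypothetical name; the external trait is implemented on it.
struct Shown(Vec<i32>);

impl fmt::Display for Shown {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.0)
    }
}

fn main() {
    let v: Vec<i32> = vec![1, 2, 3];
    // The local trait works on the value directly, but only via `Show`.
    println!("{}", v.show());
    // The wrapper satisfies `Display`, at the cost of wrapping the value.
    let wrapped = Shown(v);
    println!("{}", wrapped);
    // Getting the original value back requires unwrapping again.
    let v: Vec<i32> = wrapped.0;
    println!("{}", v.len());
}
```

Neither form lets code that is generic over Display accept a plain Vec<i32>, which is the gap this RFC is aimed at.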


The simple solution would be to allow users to override or disable coherence checks.  However, because the "external pair" problem is so easy to come across, this would likely result in a proliferation of incompatible crates; a condition that would be best avoided.
Therefore, this RFC proposes a solution that allows for the implementation of such "external pairs", whilst still allowing coherence to function.
Detailed design

Syntax Changes

The language grammar is adjusted as follows:
impl_item |: "impl" [ '<' generic-list '>' ] type "for" type
             "as" ident '{' impl '}';

view_item : extern_crate_decl | use_decl | use_impl_decl ';'

use_impl_decl : "use" "impl" path "as" path;
impl Changes

LASTEDIT
Example

Here is an example of implementing Clone on an external type.
// other/lib.rs
pub struct Int32(i32);

impl Int32 {
    pub fn new(v: i32) -> Self { Int32(v) }
    pub fn as_i32(&self) -> i32 { self.0 }
}
// prog/bin.rs
extern crate other;
use other::Int32;

impl Clone for Int32 as Int32Clone {
    fn clone(&self) -> Self {
        Int32::new(self.as_i32())
    }
}

use impl Int32Clone as Clone;

fn main() {
    let _ = Clone::clone(&Int32::new(42));
}
The first critical line is the following:
impl Clone for Int32 as Int32Clone { ... }
This defines a "named impl".  Specifically, it defines an implementation of the Clone trait for the type Int32, with the impl itself named Int32Clone.  This is treated as an item in the same way that a trait or struct would be; it is a named thing which can be public, private, used, etc.
This, however, does not immediately provide a Clone implementation for Int32.  For that, the following must be used:
use impl Int32Clone as Clone;
For the sake of discussion, this will be called "activating" the named impl; a use impl item will be called an "impl activation item".
This effectively says the following: "given the impl Int32Clone, use it as an implementation of the Clone trait."  With this, all other observable evidence should suggest that Int32 implements Clone.
The syntax form has been chosen to make as explicit as possible what is happening: that a named impl is being introduced and that it is expanding the set of types for which Clone is implemented.
This also serves to discourage large-scale usage of such named impls; each one must be used individually, introducing a not insignificant barrier to use.  It allows them to be used locally to work around missing impls in foreign code, without promoting the existence of "adaptor crates".
Named impls, like regular impls, may also be generic.
Interaction With Coherence

What if a named impl overlaps with an existing impl?
This is a coherence violation.  This RFC specifically does not address the possibility of, or introduce a mechanism for, overriding existing impls.
What if other adds an impl for Clone?
In this case, the use impl Int32Clone as Clone item violates coherence (there are now two implementations of Clone for a single type), and is thus an error.  The mere existence of the named impl, however, would not be a coherence violation.
What if two crates each provide a named impl for a single type?
Importing both with use at the same time would still be fine, but activating both would be a coherence violation.
Interaction With External Non-Generic Code

In cases where there are no potentially conflicting impls (named or not) in the external crate, there is no interaction.  External code cannot have been written to depend on trait implementations that did not exist when the code was compiled, thus introducing a new named impl for a type cannot possibly have any effect.
In cases where there is a conflicting unnamed impl, this would be a compilation error; it is not possible for one crate to see the unnamed impl, and another to not see it.
In cases where each crate has its own named impl, there is still no interaction.  The interface between the two crates cannot express the existence of a named impl without resorting to generics (covered below).  Thus, the existence of a named impl is a private implementation detail.
This does raise the spectre of "divergent behaviour": for example, two crates might have different implementations of Hash for a type.  However, this is no different to each crate having a private wrapper type that applies custom Hash behaviour, and thus is no more problematic than the current situation.
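The comparison with private wrapper types can be made concrete in today's Rust. The following is a minimal sketch with hypothetical names (CaseInsensitive, Exact, count_distinct): two modules wrap String privately and give it different Hash behaviour, and neither wrapper is visible in the other module's interface.

```rust
mod a {
    use std::collections::HashSet;
    use std::hash::{Hash, Hasher};

    // Private wrapper: compares and hashes a string ignoring ASCII case.
    struct CaseInsensitive(String);

    impl PartialEq for CaseInsensitive {
        fn eq(&self, other: &Self) -> bool {
            self.0.eq_ignore_ascii_case(&other.0)
        }
    }

    impl Eq for CaseInsensitive {}

    impl Hash for CaseInsensitive {
        fn hash<H: Hasher>(&self, state: &mut H) {
            // Hash the lowercased form so that equal values hash equally.
            self.0.to_ascii_lowercase().hash(state);
        }
    }

    pub fn count_distinct(words: &[&str]) -> usize {
        let set: HashSet<CaseInsensitive> = words
            .iter()
            .map(|w| CaseInsensitive(w.to_string()))
            .collect();
        set.len()
    }
}

mod b {
    use std::collections::HashSet;

    // Private wrapper: keeps the default, case-sensitive behaviour.
    #[derive(PartialEq, Eq, Hash)]
    struct Exact(String);

    pub fn count_distinct(words: &[&str]) -> usize {
        let set: HashSet<Exact> = words.iter().map(|w| Exact(w.to_string())).collect();
        set.len()
    }
}

fn main() {
    let words = ["Rust", "rust", "RUST"];
    // The same input is counted differently by each module.
    println!("a: {}", a::count_distinct(&words));
    println!("b: {}", b::count_distinct(&words));
}
```

The divergence already exists here; a private named impl would express the same thing without the wrapping.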
Interaction With Generics

This design would require all generics to be provided with information regarding what named impls are active for any given type which is being substituted at the instantiation site, and to use them as appropriate.
For example, consider what would happen if code was added to the above example which attempted to use the show function defined below, in another crate:
// show/lib.rs
extern crate other;
use other::Int32;

impl Clone for Int32 as OtherClone {
    fn clone(&self) -> Self {
        Int32::new(self.as_i32())
    }
}

use impl OtherClone as Clone;

pub fn show<T: Clone>(value: &T) {
    println!("{}", value.clone());
}
Although this code may be confusing to programmers, it should not be an issue for the compiler.  That OtherClone is active in the definition context of show is irrelevant: it is only the named impls active in the instantiation context which are important.
That is to say: just as the caller gets to decide what T is, it also gets to decide which named impls to associate with T.
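As a rough analogy in today's Rust (not the mechanism proposed here), the choice of implementation can be modelled as an extra type parameter that the caller supplies. Everything in this sketch is hypothetical: the Cloner trait, the duplicate function, and a local Int32 standing in for the external type. OtherClone deliberately returns a different value purely so that the choice is observable.

```rust
// A stand-in for "an implementation of Clone for T" that has a name and
// can be selected explicitly.
trait Cloner<T> {
    fn clone_value(value: &T) -> T;
}

// Stand-in for the external type, which has no Clone impl of its own.
struct Int32(i32);

// Two competing implementations, each with its own name.
struct Int32Clone;
struct OtherClone;

impl Cloner<Int32> for Int32Clone {
    fn clone_value(value: &Int32) -> Int32 {
        Int32(value.0)
    }
}

impl Cloner<Int32> for OtherClone {
    // Not a faithful clone: the +1 only makes the selected impl visible.
    fn clone_value(value: &Int32) -> Int32 {
        Int32(value.0 + 1)
    }
}

// The generic function does not pick an implementation; its caller does.
fn duplicate<T, C: Cloner<T>>(value: &T) -> T {
    C::clone_value(value)
}

fn main() {
    let x = Int32(42);
    let a = duplicate::<Int32, Int32Clone>(&x);
    let b = duplicate::<Int32, OtherClone>(&x);
    println!("{} {}", a.0, b.0);
}
```

Under this RFC the second parameter would be implicit: it would be determined by the named impls active at the call site.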
On the other hand, let us define an additional method in show/lib.rs:
pub fn show_int32<T: Into<Int32>>(value: T) {
    let v: Int32 = value.into();
    println!("{}", v.clone().as_i32());
}
In this case, nothing about T can influence the active named impls on Int32.  As such, show::OtherClone would be used, not prog::Int32Clone.
To expand on this: if OtherClone did not exist or was not active in the definition context of show_int32, then show_int32 would not compile, due to a missing implementation of Clone for Int32.
As such, this design mostly maintains the ability to reason about impls locally, with the exception being generics.
Another thing to consider is what happens when generic types dependent on active named impls are passed into a context where said named impls are not active.  Consider:
mod a {
    use misc::SortedVec;
    use impl misc::FloatTotalOrd as Ord;
    pub fn sorted_floats(fs: &[f32]) -> SortedVec<f32> {
        SortedVec::from_slice(fs)
    }
}
mod b {
    use misc::SortedVec;
    pub fn show_sorted_floats(fs: SortedVec<f32>) {
        for f in fs { println!("f: {:?}", f); }
    }
}
mod misc {
    pub struct SortedVec<T: Ord> { ... }
    impl<T: Ord> SortedVec<T> {
        pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
    }
    /// NaN sorts first.
    pub impl Ord for f32 as FloatTotalOrd { ... }
}
In this case, the code should not compile since SortedVec<f32> involves an invalid substitution within b (which does not have an active impl of Ord for f32).  Let us adjust the code as follows:
mod b {
    use misc::SortedVec;
    use impl misc::RevTotalOrd as Ord;
    pub fn show_sorted_floats(fs: SortedVec<f32>) {
        for f in fs { println!("f: {:?}", f); }
    }
}
mod misc {
    pub struct SortedVec<T: Ord> { ... }
    impl<T: Ord> SortedVec<T> {
        pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
    }
    /// NaN sorts first.
    pub impl Ord for f32 as FloatTotalOrd { ... }
    /// NaN sorts last.
    pub impl Ord for f32 as RevTotalOrd { ... }
}
Now consider the following test function in the root module:
fn test() {
    let sfs = a::sorted_floats(&[3.0, 1.0, 2.0]);
    b::show_sorted_floats(sfs);
}

Aside: Note that because the body of test uses type inference, we do not have to explicitly activate the named impl.  This is consistent with other inference cases.

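What should happen?  It would be unacceptable for this code to compile: the SortedVec was sorted under FloatTotalOrd in a, but b would read it under RevTotalOrd, so the implementation of Ord would change underneath an already-sorted value.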
In fact, we consider that there are actually two incompatible instances of SortedVec involved in this example.  We can distinguish these by introducing (normative) syntax and annotating the example:
mod a {
    use misc::SortedVec;
    use impl misc::FloatTotalOrd as Ord;
    pub fn sorted_floats(fs: &[f32])
    -> SortedVec<f32 + FloatTotalOrd as Ord> {
        SortedVec::from_slice(fs)
    }
}
mod b {
    use misc::SortedVec;
    use impl misc::RevTotalOrd as Ord;
    pub fn show_sorted_floats(fs: SortedVec<f32 + RevTotalOrd as Ord>) {
        for f in fs { println!("f: {:?}", f); }
    }
}
mod misc { ... }
fn test() {
    ...
}
The two instances of SortedVec are distinct types due to the difference in active named impls.  As such, the above code does not compile.
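The idea that the active impl is part of the type can be approximated in today's Rust by making the ordering an explicit marker parameter. This is only a sketch under that assumption: the Order trait and the second type parameter on SortedVec are hypothetical, while FloatTotalOrd and RevTotalOrd reuse the names from the example as plain marker types.

```rust
use std::cmp::Ordering;
use std::marker::PhantomData;

// Hypothetical stand-in for a named impl of Ord: a marker type that
// supplies the comparison.
trait Order<T> {
    fn compare(a: &T, b: &T) -> Ordering;
}

/// NaN sorts first.
struct FloatTotalOrd;
/// NaN sorts last.
struct RevTotalOrd;

impl Order<f32> for FloatTotalOrd {
    fn compare(a: &f32, b: &f32) -> Ordering {
        match (a.is_nan(), b.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Less,
            (false, true) => Ordering::Greater,
            // Neither value is NaN, so partial_cmp cannot fail.
            (false, false) => a.partial_cmp(b).unwrap(),
        }
    }
}

impl Order<f32> for RevTotalOrd {
    fn compare(a: &f32, b: &f32) -> Ordering {
        match (a.is_nan(), b.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Greater,
            (false, true) => Ordering::Less,
            (false, false) => a.partial_cmp(b).unwrap(),
        }
    }
}

// The ordering is part of the type, so two orderings give two types.
struct SortedVec<T, O: Order<T>> {
    items: Vec<T>,
    _order: PhantomData<O>,
}

impl<T: Clone, O: Order<T>> SortedVec<T, O> {
    fn from_slice(s: &[T]) -> Self {
        let mut items = s.to_vec();
        items.sort_by(|a, b| O::compare(a, b));
        SortedVec { items, _order: PhantomData }
    }
}

fn show_sorted_floats(fs: &SortedVec<f32, RevTotalOrd>) -> String {
    format!("{:?}", fs.items)
}

fn main() {
    let data = [3.0, f32::NAN, 1.0];
    let first: SortedVec<f32, FloatTotalOrd> = SortedVec::from_slice(&data);
    let last: SortedVec<f32, RevTotalOrd> = SortedVec::from_slice(&data);
    println!("{:?}", first.items);
    println!("{}", show_sorted_floats(&last));
    // show_sorted_floats(&first); // rejected by the compiler: mismatched types
}
```

The commented-out call corresponds to the test function above: the value was sorted under one ordering and the callee expects the other, so the types do not match.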
To prevent named impls from becoming prohibitively "viral", type parameters only track named impls that are needed to satisfy constraints.
A more complicated example is the following (added to the previous example):
mod c {
    use misc::SortedVec;
    pub fn extract<T: Ord>(sv: SortedVec<T>) -> Vec<T> {
        sv.into_vec()
    }
}
mod misc {
    ...
    impl<T: Ord> SortedVec<T> {
        pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
        pub fn into_vec(self) -> Vec<T> { ... }
    }
    ...
}
fn test_2() {
    let sfs = a::sorted_floats(&[3.0, 1.0, 2.0]);
    let sfs = c::extract(sfs);
    show_vec(sfs);

    fn show_vec(fs: Vec<f32>) {
        for f in fs { println!("f: {:?}", f); }
    }
}
Should this code compile?  We previously said that SortedVec<f32> was internally SortedVec<f32 + misc::FloatTotalOrd as Ord>; thus, shouldn't the result of SortedVec::into_vec be Vec<f32 + misc::FloatTotalOrd as Ord>, and thus incompatible with Vec<f32>?
This is why substitutions only track named impls that are relevant to the type parameter.  Vec<T> does not impose a T: Ord constraint and, as such, the named impl is discarded.

LASTEDIT: Wait, so how does this gel with writing fn from_vec(v: Vec<T>) -> SortedVec<T> in the impl of SortedVec?

Interaction With Trait Objects

Because the vtable is carried with the trait object pointer, named impls simply allow code to construct a valid trait object for types where, previously, such was not possible.  Coherence continues to ensure there is no confusion as to which vtable to choose: there is only one option.
However, this only holds at the point where the trait object pointer is created.  Because the language currently omits downcasting (specifically, from a trait object back to a concrete type), this should not be problematic (if the effect is even observable).
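That the vtable travels with the pointer can be seen in today's Rust using wrapper types in place of named impls. In this sketch the wrappers Decimal and Hex and the function render are hypothetical; the point is only that the callee never needs to know which implementation was selected when the trait object was created.

```rust
use std::fmt;

// Two hypothetical local wrappers around the same kind of value, each
// with its own Display implementation.
struct Decimal(i32);
struct Hex(i32);

impl fmt::Display for Decimal {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl fmt::Display for Hex {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:#x}", self.0)
    }
}

// The callee only sees a trait object; the vtable chosen where the
// reference was coerced decides which implementation runs.
fn render(value: &dyn fmt::Display) -> String {
    value.to_string()
}

fn main() {
    println!("{}", render(&Decimal(255)));
    println!("{}", render(&Hex(255)));
}
```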
Drawbacks

(Standard line about increasing complexity of the language.)
This makes it slightly harder to reason about what traits are implemented for a given type.  It is also unclear how rustdoc could be modified to cope with this in a reasonable fashion.  Named impls could be given their own entries, but to be truly useful, they would need to show up on the pages for the respective types and traits involved.
Alternatives

There are a number of minor syntax variations that could be used, including:

impl MyTrait(Trait) for Type { ... }
use MyTrait as impl Trait;
use MyTrait as impl Trait for Type;

Unresolved questions

What parts of the design are still TBD?