- Start Date: 2015-06-14
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
- TODO: Clarify why we're doing this.
- TODO: Why not a private impl? Make sure to emphasise readability concerns.
- TODO: impls can be anywhere in local crates; damn.
This RFC proposes introducing "named impls" for the purpose of allowing external traits to be implemented for external types without violating coherence.
Rust relies heavily on traits as a basic unit of abstraction. However, the current coherence rules introduce an extremely painful problem: external types failing to implement external traits.
This can take a number of forms. There are two major classes, though:
- A type failing to implement a trait which could have reasonably been implemented, but was not.
- A type not implementing a trait which it could not know about, but which the user of the type is powerless to implement themselves due to the trait also being external.
There are two workarounds for this, depending on the exact circumstances:
- The user can create a new trait that forwards to the existing one, and implement it for all desired types. This is laborious and brittle.
- The user can create a wrapper type around the external type and implement the trait on that. This requires constant wrapping/unwrapping of the value.
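Both workarounds can be sketched in today's Rust. The following is a minimal illustration rather than part of the proposal: the module other stands in for an external crate, and the names LocalClone, ClonableInt32 and duplicate are hypothetical.

```rust
// Stand-in for an external crate. In a real program this would be a separate
// crate, and `impl Clone for Int32` in our code would be rejected by coherence.
mod other {
    pub struct Int32(i32);

    impl Int32 {
        pub fn new(v: i32) -> Self { Int32(v) }
        pub fn as_i32(&self) -> i32 { self.0 }
    }
}

use other::Int32;

// Workaround 1: a local trait that mirrors the external one.
// Generic code that asks for `Clone` cannot use it; it must be rewritten
// to ask for `LocalClone` instead.
trait LocalClone {
    fn local_clone(&self) -> Self;
}

impl LocalClone for Int32 {
    fn local_clone(&self) -> Self {
        Int32::new(self.as_i32())
    }
}

// Workaround 2: a local wrapper type that can implement the external trait,
// at the cost of wrapping and unwrapping at every use.
struct ClonableInt32(Int32);

impl Clone for ClonableInt32 {
    fn clone(&self) -> Self {
        ClonableInt32(Int32::new(self.0.as_i32()))
    }
}

// Existing generic code that requires `Clone`.
fn duplicate<T: Clone>(value: &T) -> (T, T) {
    (value.clone(), value.clone())
}

fn main() {
    let a = Int32::new(42);
    let b = a.local_clone();
    println!("{}", b.as_i32());

    let wrapped = ClonableInt32(Int32::new(7));
    let (x, y) = duplicate(&wrapped);
    println!("{} {}", x.0.as_i32(), y.0.as_i32());
}
```

Note that duplicate only accepts the wrapper, never a bare Int32, which is the constant wrapping cost described above.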
The simple solution would be to allow users to override or disable coherence checks. However, because the "external pair" problem is so easy to come across, this would likely result in a proliferation of incompatible crates; a condition that would be best avoided.
Therefore, this RFC proposes a solution that allows for the implementation of such "external pairs", whilst still allowing coherence to function.
The language grammar is adjusted as follows:
impl_item |: "impl" [ '<' generic-list '>' ] type "for" type
as ident '{' impl '}';
view_item : extern_crate_decl | use_decl | use_impl_decl ';'
use_impl_decl : "use" "impl" path "as" path;
LASTEDIT
Here is an example of implementing Clone on an external type.
// other/lib.rs
pub struct Int32(i32);
impl Int32 {
pub fn new(v: i32) -> Self { Int32(v) }
pub fn as_i32(&self) -> i32 { self.0 }
}
// prog/bin.rs
extern crate other;
use other::Int32;
impl Clone for Int32 as Int32Clone {
fn clone(&self) -> Self {
Int32::new(self.as_i32())
}
}
use impl Int32Clone as Clone;
fn main() {
let _ = Clone::clone(&Int32::new(42));
}
The first critical line is the following:
impl Clone for Int32 as Int32Clone;
This defines a "named impl". Specifically, it defines an implementation of the Clone trait for the type Int32, with the impl itself named Int32Clone. This is treated as an item in the same way that a trait or struct would be; it is a named thing which can be public, private, used, etc.
This, however, does not immediately provide a Clone implementation for Int32. For that, the following must be used:
use impl Int32Clone as Clone;
For the sake of discussion, this will be called "activating" the named impl; a use impl item will be called an "impl activation item".
This effectively says the following: "given the impl Int32Clone, use it as an implementation for the Clone trait." With this, all other observable evidence should suggest that Int32 implements Clone.
The syntax form has been chosen to make as explicit as possible what is happening: that a named impl is being introduced and that it is expanding the set of types for which Clone is implemented.
This also serves to discourage large-scale usage of such named impls; each one must be used individually, introducing a not insignificant barrier to use. It allows them to be used locally to work around missing impls in foreign code, without promoting the existence of "adaptor crates".
Named impls, like regular impls, may also be generic.
What if a named impl overlaps with an existing impl?
This is a coherence violation. This RFC specifically does not address the possibility of, or introduce a mechanism for, overriding existing impls.
What if other adds an impl for Clone?
In this case, the use impl Int32Clone as Clone item is violating coherence (there are now two implementations of Clone for a single type), and is thus an error. The mere existence of the named impl, however, would not be a coherence violation.
What if two crates each provide a named impl for a single type?
Bringing both into scope with use at the same time would still be fine, but activating both would be a coherence violation.
In cases where there are no potentially conflicting impls (named or not) in the external crate, there is no interaction. External code cannot have been written to depend on trait implementations that did not exist when the code was compiled, thus introducing a new named impl for a type cannot possibly have any effect.
In cases where there is a conflicting unnamed impl, this would be a compilation error; it is not possible for one crate to see the unnamed impl, and another to not see it.
In cases where each crate has its own named impl, there is still no interaction. The interface between the two crates cannot express the existence of a named impl without resorting to generics (covered below). Thus, the existence of a named impl is a private implementation detail.
This does raise the spectre of "divergent behaviour": for example, two crates might have different implementations of Hash for a type. However, this is no different to each crate having a private wrapper type that applies custom Hash behaviour, and thus is no more problematic than the current situation.
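The comparison with private wrapper types can be made concrete in today's Rust. In this sketch the modules a and b stand in for two crates, and Name, Key and count_distinct are hypothetical names chosen for illustration; each module privately hashes the same type in its own way, and neither can observe the other's choice.

```rust
// Stand-in for an external type that implements neither Hash nor Eq.
pub struct Name(pub String);

mod a {
    use super::Name;
    use std::collections::HashSet;
    use std::hash::{Hash, Hasher};

    // Private wrapper: case-sensitive equality and hashing.
    struct Key(Name);

    impl PartialEq for Key {
        fn eq(&self, other: &Self) -> bool { (self.0).0 == (other.0).0 }
    }
    impl Eq for Key {}
    impl Hash for Key {
        fn hash<H: Hasher>(&self, state: &mut H) { (self.0).0.hash(state) }
    }

    pub fn count_distinct(names: Vec<Name>) -> usize {
        names.into_iter().map(Key).collect::<HashSet<_>>().len()
    }
}

mod b {
    use super::Name;
    use std::collections::HashSet;
    use std::hash::{Hash, Hasher};

    // Private wrapper: case-insensitive equality and hashing.
    struct Key(Name);

    impl PartialEq for Key {
        fn eq(&self, other: &Self) -> bool {
            (self.0).0.to_lowercase() == (other.0).0.to_lowercase()
        }
    }
    impl Eq for Key {}
    impl Hash for Key {
        fn hash<H: Hasher>(&self, state: &mut H) {
            (self.0).0.to_lowercase().hash(state)
        }
    }

    pub fn count_distinct(names: Vec<Name>) -> usize {
        names.into_iter().map(Key).collect::<HashSet<_>>().len()
    }
}

fn main() {
    let names = || vec![Name("Abc".to_string()), Name("abc".to_string())];
    println!("a sees {} distinct names", a::count_distinct(names()));
    println!("b sees {} distinct names", b::count_distinct(names()));
}
```

The two modules disagree about how many distinct names there are, yet nothing in their public interfaces exposes the wrapper, so the divergence stays a private implementation detail.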
This design would require all generics to be provided with information regarding what named impls are active for any given type which is being substituted at the instantiation site, and to use them as appropriate.
For example, consider what would happen if code was added to the above example which attempted to use the show function defined below, in another crate:
// show/lib.rs
extern crate other;
use other::Int32;
impl Clone for Int32 as OtherClone {
fn clone(&self) -> Self {
Int32::new(self.as_i32())
}
}
use impl OtherClone as Clone;
pub fn show<T: Clone>(value: &T) {
println!("{}", value.clone());
}
Although this code may be confusing to programmers, it should not be an issue for the compiler. That OtherClone is active in the definition context of show is irrelevant: it is only the named impls active in the instantiation context which are important.
That is to say: just as the caller gets to decide what T is, it also gets to decide which named impls to associate with T.
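One way to picture the caller choosing the implementation is explicit dictionary passing, which can be written in today's Rust. This is only an analogy under stated assumptions, not how the proposal would be implemented: CloneImpl, int32_clone and duplicate are hypothetical names, and under the proposal the compiler would supply the equivalent of the imp argument implicitly.

```rust
// An explicit "dictionary" holding the operations of Clone for T.
struct CloneImpl<T> {
    clone: fn(&T) -> T,
}

// Stand-in for the external type, which has no Clone impl of its own.
struct Int32(i32);

// Plays the role of the caller's named impl.
fn int32_clone(v: &Int32) -> Int32 {
    Int32(v.0)
}

// Generic code uses whichever implementation the caller hands in; nothing
// in the definition context influences the choice.
fn duplicate<T>(value: &T, imp: &CloneImpl<T>) -> (T, T) {
    ((imp.clone)(value), (imp.clone)(value))
}

fn main() {
    let imp = CloneImpl { clone: int32_clone };
    let (a, b) = duplicate(&Int32(42), &imp);
    println!("{} {}", a.0, b.0);
}
```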
On the other hand, let us define an additional method in show/lib.rs:
pub fn show_int32<T: Into<Int32>>(value: T) {
let v: Int32 = value.into();
println!("{}", v.clone());
}
In this case, nothing about T can influence the active named impls on Int32. As such, show::OtherClone would be used, not prog::Int32Clone.
To expand on this: if OtherClone did not exist or was not active in the definition context of show_int32, then show_int32 would not compile, due to a missing implementation of Clone for Int32.
As such, this design mostly maintains the ability to reason about impls locally, with the exception being generics.
Another thing to consider is what happens when generic types dependent on active named impls are passed into a context where said named impls are not active. Consider:
mod a {
use misc::SortedVec;
use impl misc::FloatTotalOrd as Ord;
pub fn sorted_floats(fs: &[f32]) -> SortedVec<f32> {
SortedVec::from_slice(fs)
}
}
mod b {
use misc::SortedVec;
pub fn show_sorted_floats(fs: SortedVec<f32>) {
for f in fs { println!("f: {:?}", f); }
}
}
mod misc {
pub struct SortedVec<T: Ord> { ... }
impl<T: Ord> SortedVec<T> {
pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
}
/// NaN sorts first.
pub impl Ord for f32 as FloatTotalOrd { ... }
}
In this case, the code should not compile since SortedVec<f32> involves an invalid substitution within b (which does not have an active impl of Ord for f32). Let us adjust the code as follows:
mod b {
use misc::SortedVec;
use impl misc::RevTotalOrd as Ord;
pub fn show_sorted_floats(fs: SortedVec<f32>) {
for f in fs { println!("f: {:?}", f); }
}
}
mod misc {
pub struct SortedVec<T: Ord> { ... }
impl<T: Ord> SortedVec<T> {
pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
}
/// NaN sorts first.
pub impl Ord for f32 as FloatTotalOrd { ... }
/// NaN sorts last.
pub impl Ord for f32 as RevTotalOrd { ... }
}
Now consider the following test function in the root module:
fn test() {
let sfs = a::sorted_floats(&[3.0, 1.0, 2.0]);
b::show_sorted_floats(sfs);
}
Aside: Note that because the body of test uses type inference, we do not have to explicitly activate the named impl. This is consistent with other inference cases.
What should happen? It would be unacceptable for this code to compile; it would involve the implementation of Ord changing.
In fact, we consider that there are actually two incompatible instances of SortedVec involved in this example. We can distinguish these by introducing (normative) syntax and annotating the example:
mod a {
use misc::SortedVec;
use impl misc::FloatTotalOrd as Ord;
pub fn sorted_floats(fs: &[f32])
-> SortedVec<f32 + FloatTotalOrd as Ord> {
SortedVec::from_slice(fs)
}
}
mod b {
use misc::SortedVec;
use impl misc::RevTotalOrd as Ord;
pub fn show_sorted_floats(fs: SortedVec<f32 + RevTotalOrd as Ord>) {
for f in fs { println!("f: {:?}", f); }
}
}
mod misc { ... }
fn test() {
...
}
The two instances of SortedVec are distinct types due to the difference in active named impls. As such, the above code does not compile.
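The idea that the active impl is part of the type can be modelled in today's Rust by making the ordering an explicit type parameter. This is a sketch under stated assumptions, not the proposed mechanism: OrdImpl, NanFirst, NanLast and as_slice are hypothetical names, with NanFirst and NanLast playing the roles of FloatTotalOrd and RevTotalOrd.

```rust
use std::cmp::Ordering;
use std::marker::PhantomData;

// Hypothetical stand-in for "an implementation of Ord for T".
pub trait OrdImpl<T> {
    fn cmp(a: &T, b: &T) -> Ordering;
}

// NaN sorts first.
pub struct NanFirst;
impl OrdImpl<f32> for NanFirst {
    fn cmp(a: &f32, b: &f32) -> Ordering {
        match (a.is_nan(), b.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Less,
            (false, true) => Ordering::Greater,
            // Neither value is NaN here, so partial_cmp cannot fail.
            (false, false) => a.partial_cmp(b).unwrap(),
        }
    }
}

// NaN sorts last.
pub struct NanLast;
impl OrdImpl<f32> for NanLast {
    fn cmp(a: &f32, b: &f32) -> Ordering {
        match (a.is_nan(), b.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Greater,
            (false, true) => Ordering::Less,
            (false, false) => a.partial_cmp(b).unwrap(),
        }
    }
}

// The ordering `O` is part of the type, so two SortedVecs built with
// different orderings are different types.
pub struct SortedVec<T, O: OrdImpl<T>> {
    items: Vec<T>,
    _ord: PhantomData<O>,
}

impl<T: Clone, O: OrdImpl<T>> SortedVec<T, O> {
    pub fn from_slice(s: &[T]) -> Self {
        let mut items = s.to_vec();
        items.sort_by(|a, b| O::cmp(a, b));
        SortedVec { items, _ord: PhantomData }
    }

    pub fn as_slice(&self) -> &[T] {
        &self.items
    }
}

pub fn sorted_floats(fs: &[f32]) -> SortedVec<f32, NanFirst> {
    SortedVec::from_slice(fs)
}

pub fn show_sorted_floats(fs: &SortedVec<f32, NanLast>) {
    for f in fs.as_slice() {
        println!("f: {:?}", f);
    }
}

fn main() {
    let sfs = sorted_floats(&[3.0, f32::NAN, 1.0]);
    println!("{:?}", sfs.as_slice());

    // show_sorted_floats(&sfs); // rejected: the two SortedVec types differ

    let other: SortedVec<f32, NanLast> = SortedVec::from_slice(&[3.0, f32::NAN, 1.0]);
    show_sorted_floats(&other);
}
```

The difference from the proposal is that here the ordering must be written out by hand, whereas the proposal would have the compiler track it from the active named impls.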
To prevent named impls from becoming prohibitively "viral", type parameters only track named impls that are needed to satisfy constraints.
A more complicated example is the following (added to the previous example):
mod c {
use misc::SortedVec;
pub fn extract<T: Ord>(sv: SortedVec<T>) -> Vec<T> {
sv.into_vec()
}
}
mod misc {
...
impl<T: Ord> SortedVec<T> {
pub fn from_slice(s: &[T]) -> SortedVec<T> { ... }
pub fn into_vec(self) -> Vec<T> { ... }
}
...
}
fn test_2() {
let sfs = a::sorted_floats(&[3.0, 1.0, 2.0]);
let sfs = c::extract(sfs);
show_vec(sfs);
fn show_vec(fs: Vec<f32>) {
for f in fs { println!("f: {:?}", f); }
}
}
Should this code compile? We previously said that SortedVec<f32> was internally SortedVec<f32 + misc::FloatTotalOrd as Ord>; thus, shouldn't the result of SortedVec::into_vec be Vec<f32 + misc::FloatTotalOrd as Ord>, and thus incompatible with Vec<f32>?
This is why substitutions only track named impls that are relevant to the type parameter. Vec<T> does not impose a T: Ord constraint and, as such, the named impl is discarded.
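Discarding the impl can be pictured in today's Rust with an explicit marker parameter: the marker exists on the sorted container but has no counterpart on Vec<T>, so it disappears on conversion. This is a rough model only. NanFirst is a hypothetical marker standing in for an active named impl, and from_presorted is a hypothetical constructor that trusts its input to be sorted already.

```rust
use std::marker::PhantomData;

// Hypothetical marker standing in for an active named impl of Ord for f32.
pub struct NanFirst;

// The container records which ordering it was built with.
pub struct SortedVec<T, O> {
    items: Vec<T>,
    _ord: PhantomData<O>,
}

impl<T, O> SortedVec<T, O> {
    // Assumes `items` is already sorted according to `O`.
    pub fn from_presorted(items: Vec<T>) -> Self {
        SortedVec { items, _ord: PhantomData }
    }

    // `Vec<T>` has no slot for the ordering, so `O` is dropped here.
    pub fn into_vec(self) -> Vec<T> {
        self.items
    }
}

pub fn extract<T, O>(sv: SortedVec<T, O>) -> Vec<T> {
    sv.into_vec()
}

pub fn show_vec(fs: Vec<f32>) {
    for f in fs {
        println!("f: {:?}", f);
    }
}

fn main() {
    let sfs: SortedVec<f32, NanFirst> = SortedVec::from_presorted(vec![1.0, 2.0, 3.0]);
    // The result is a plain Vec<f32>, usable where no ordering is known.
    let plain: Vec<f32> = extract(sfs);
    show_vec(plain);
}
```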
LASTEDIT: Wait, so how does this gel with writing fn from_vec(v: Vec<T>) -> SortedVec<T> in the impl of SortedVec?
Because the vtable is carried with the trait object pointer, named impls simply allow code to construct a valid trait object for types where, previously, such was not possible. Coherence continues to ensure there is no confusion as to which vtable to choose: there is only one option.
However, this only holds at the point where the trait object pointer is created. Because the language currently omits downcasting from a trait object back to a concrete type, this should not be problematic (if the effect is even observable).
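That the vtable is fixed where the trait object is created, and travels with the pointer from then on, can be seen in today's Rust. In this sketch a local wrapper stands in for a named impl; Int32, ShowInt32 and render are hypothetical names used only for illustration.

```rust
use std::fmt;

// Stand-in for the external type.
struct Int32(i32);

// Local wrapper standing in for a named impl of Display for Int32.
struct ShowInt32(Int32);

impl fmt::Display for ShowInt32 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Int32({})", (self.0).0)
    }
}

// This function only sees a pointer plus a vtable; it neither knows nor
// cares which impl produced the trait object.
fn render(value: &dyn fmt::Display) -> String {
    value.to_string()
}

fn main() {
    let v = ShowInt32(Int32(42));
    // The vtable is selected here, at the point of coercion.
    let obj: &dyn fmt::Display = &v;
    println!("{}", render(obj));
}
```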
(Standard line about increasing complexity of the language.)
This makes it slightly harder to reason about what traits are implemented for a given type. It is also unclear how rustdoc could be modified to cope with this in a reasonable fashion. Named impls could be given their own entries, but to be truly useful, they would need to show up on the pages for the respective types and traits involved.
There are a number of minor syntax variations that could be used, including:
impl MyTrait(Trait) for Type { ... }
use MyTrait as impl Trait;
use MyTrait as impl Trait for Type;
What parts of the design are still TBD?