stevedonovan/common-rust-traits.md

## common-rust-traits.md

      
    What is a Trait?

In Rust, types containing data - structs, enums and any other 'aggregate'
types like tuples and arrays - are dumb. They may have methods, but that
is just a convenience; methods are just functions. Types have no
inherent relationship with each other.
Traits are the abstract mechanism for adding functionality to types
and establishing relationships between them.
Printing Out: Display

For a value to be printed out using "{}", it must implement the Display trait.
If we're only interested in how a value displays itself, then there are
two ways to define functions taking such values. In this example, we
want to print out slices of references to displayable values.
The first is generic where the element
type of the slice is any type that implements Display:
use std::fmt::Display;

fn display_items_generic<T: Display>(items: &[&T]) {
    for item in items.iter() {
        println!("{}", item);
    }
}

display_items_generic(&[&10, &20]);
Here the trait Display is acting as a constraint on a generic type.
Separate code is generated for each distinct type T.  There is no
direct analogue in most mainstream languages - the closest would be C++
concepts, which address
the "compile-time duck-typing" problem of C++ templates.
The second is polymorphic, where the element type of the slice is a
reference to a Display trait object.
fn display_items_polymorphic(items: &[&dyn Display]) {
    for item in items.iter() {
        println!("{}", item);
    }
}

// &"hello" rather than "hello": the referenced type must be sized to become a trait object
display_items_polymorphic(&[&10, &"hello"]);
Code is only generated once for display_items_polymorphic, but we invoke
different code for each type dynamically.  Note that the slice can now contain
references to any value that implements Display.  Here Display is
acting very much like what is called an interface in Java.
The conversion involved is interesting: a reference to a concrete type
becomes a trait object.  It's non-trivial because the trait object
has two parts - the original reference and a 'virtual method table'
containing the methods of the trait (a so-called "fat pointer").
let d: &dyn Display = &10;
(Older code wrote this as &Display, which hid a little too much magic. The
explicit dyn notation is now standard, and the bare form is rejected in the 2021 edition.)
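The "fat pointer" description can be checked directly. The following is a minimal sketch, assuming a current compiler on a mainstream platform; pointer_sizes is a hypothetical helper that is not part of the original text.

```rust
use std::fmt::Display;
use std::mem::size_of;

// Hypothetical helper: byte sizes of a plain reference and of a
// trait-object reference.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&i32>(), size_of::<&dyn Display>())
}

fn main() {
    let (thin, fat) = pointer_sizes();
    // The trait object carries a data pointer plus a vtable pointer.
    assert_eq!(fat, 2 * thin);

    // The conversion itself: a &i32 becomes a &dyn Display.
    let d: &dyn Display = &10;
    println!("{} (thin = {} bytes, fat = {} bytes)", d, thin, fat);
}
```

The exact layout is an implementation detail, but two machine words is what you can expect to see in practice.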
How to decide between generic and polymorphic?  The second is more flexible,
but involves going through virtual methods which is slightly slower.
Generic functions/structs can implement 'zero overhead abstractions'
since the compiler can inline such functions.  The only honest answer is
"it depends". Bear in mind that the actual cost of using trait objects
might be negligible compared to the other work done by a program.  (It's hard
to make engineering decisions based on micro-benchmarks.)
Defining Display for your own types is straightforward but needs to be
explicit, since the compiler cannot reasonably guess what the
output format must be (unlike Debug, which can be derived):
use std::fmt;

struct MyType {
    x: u32,
    y: u32
}

impl fmt::Display for MyType {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "x={},y={}", self.x, self.y)
    }
}
Any type that implements Display automatically implements ToString, so
42.to_string(), "hello".to_string() all work as expected.
(Rust traits often operate in little groups like this.)
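To see the Display/ToString pairing in action, here is a small sketch. Celsius is a hypothetical type invented for this illustration; only the Display implementation is written by hand, and to_string() comes along for free.

```rust
use std::fmt;

// Hypothetical type, used only for illustration.
struct Celsius(f64);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // One decimal place, followed by the unit.
        write!(f, "{:.1} C", self.0)
    }
}

fn main() {
    let t = Celsius(21.5);
    // to_string() is available because Celsius implements Display.
    let s: String = t.to_string();
    assert_eq!(s, "21.5 C");
    // format! and println! use the same implementation.
    println!("{}", format!("now: {}", t));
}
```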
Conversion: From and Into

An important pair of traits is From/Into. The From trait expresses the conversion
of one value into another using the from method. So we have String::from("hello").
If From is implemented, then the Into trait is auto-implemented.
Since String implements From<&str>, then &str automatically implements Into<String>.
let s = String::from("hello");  // From
let s: String = "hello".into(); // Into
The json crate provides a nice example. A JSON object is indexed with strings,
and new fields can be created by inserting JsonValue values:
obj["surname"] = JsonValue::from("Smith"); // From
obj["name"] = "Joe".into(); // Into
obj["age"] = 35.into(); // Into
Note how convenient it is to use into here, instead of from! We are doing
a conversion which Rust will not do implicitly, but into() is a small word,
easy to type and read.
From expresses a conversion that always succeeds. It may be relatively expensive, though:
converting a string slice to a String will allocate a buffer and copy the bytes. The
conversion always takes place by value.
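The same pairing works for your own types. In this sketch, Seconds and Millis are hypothetical unit types; only From is implemented, and the matching into() comes from the standard library's blanket implementation.

```rust
// Hypothetical unit types, not from the original text.
#[derive(Debug, PartialEq)]
struct Seconds(u64);

#[derive(Debug, PartialEq)]
struct Millis(u64);

// Only From is written by hand...
impl From<Seconds> for Millis {
    fn from(s: Seconds) -> Millis {
        Millis(s.0 * 1000)
    }
}

fn main() {
    let a = Millis::from(Seconds(2)); // From
    // ...and Into<Millis> for Seconds is implemented automatically.
    let b: Millis = Seconds(2).into(); // Into
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

Note that both conversions consume the Seconds value, in line with the by-value rule above.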
From/Into has an intimate relationship with Rust error handling.
This statement:
let res = returns_some_result()?;
is basically sugar for this:
let res = match returns_some_result() {
    Ok(r) => r,
    Err(e) => return Err(e.into())
};
That is, any error type which can convert into the returned error type works.
A useful strategy for informal error handling is to make the function return
Result<T, Box<dyn Error>>.  Any type that implements Error can be converted
into the trait object Box<dyn Error>.
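A sketch of that informal strategy follows. NegativeInput and parse_positive are hypothetical names made up for the example; the point is that both a standard library error and a home-grown one end up in the same Box<dyn Error>.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical application error.
#[derive(Debug)]
struct NegativeInput(i64);

impl fmt::Display for NegativeInput {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "negative input: {}", self.0)
    }
}

// Error has no required methods once Debug and Display exist.
impl Error for NegativeInput {}

fn parse_positive(s: &str) -> Result<i64, Box<dyn Error>> {
    // `?` converts the ParseIntError into Box<dyn Error> for us.
    let n: i64 = s.trim().parse()?;
    if n < 0 {
        // Our own error converts the same way, here with an explicit into().
        return Err(NegativeInput(n).into());
    }
    Ok(n)
}

fn main() {
    println!("{:?}", parse_positive("42"));
    for bad in &["abc", "-7"] {
        if let Err(e) = parse_positive(bad) {
            println!("error: {}", e);
        }
    }
}
```

The cost of this convenience is that callers only see an opaque error; a library would usually define a proper error enum instead.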
Making Copies: Clone and Copy

From (and its mirror image Into) describe how distinct types are converted into
each other. Clone describes how a new value of the same type can be created.
Rust likes to make any potentially expensive operation obvious, so cloning is
always written out as val.clone().
For some types this simply involves copying some bits ("bitwise copy");
a number is just a bit pattern in memory.
But String is different, since as well as size and capacity fields,
it has dynamically-allocated string data. To clone a string involves
allocating that buffer and copying the original bytes into it.
Making your types cloneable is easy, as long as every type in a struct or enum
implements Clone:
#[derive(Debug,Clone)]
struct Person {
    first_name: String,
    last_name: String,
}
Copy is a marker trait (there are no methods to implement) which says that
a type may be copied by just moving bits. You can define it for your own
structs:
#[derive(Debug,Clone,Copy)]
struct Point {
    x: f32,
    y: f32,
    z: f32
}
Again, this is only possible if every field type implements Copy. You cannot sneak in a
non-Copy type like String here!
This trait interacts with a key Rust feature: moving. Moving a value is always
done by simply moving bits around.  If the value is Copy, then the original
location remains valid.
let n1 = 42;
let n2 = n1;
// n1 is still fine (i32 is Copy)
let s1 = "hello".to_string();
let s2 = s1;
// value moved into s2, s1 can no longer be used!
Bad things would happen if s1 was still valid - both s1 and s2 would
be dropped at the end of scope and their shared buffer would be deallocated twice!
C++ handles this situation by always copying; in Rust you
must say s1.clone().
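The three cases (copy, clone, move) can be put side by side. This sketch reuses the Person struct from above and a two-field variant of Point; renamed is a hypothetical helper added for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Person {
    first_name: String,
    last_name: String,
}

// Simplified two-field variant of the Point above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

// Hypothetical helper: returns an independent copy with a new first name.
fn renamed(original: &Person, first_name: &str) -> Person {
    let mut copy = original.clone(); // explicit; allocates fresh string buffers
    copy.first_name = first_name.to_string();
    copy
}

fn main() {
    let p1 = Point { x: 1.0, y: 2.0 };
    let p2 = p1; // bitwise copy; p1 stays usable because Point is Copy
    assert_eq!(p1, p2);

    let joe = Person { first_name: "Joe".into(), last_name: "Smith".into() };
    let jane = renamed(&joe, "Jane");
    println!("{:?} and {:?}", joe, jane);

    let moved = joe; // move: using `joe` after this line is a compile error
    println!("{:?}", moved);
}
```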
Fallible Conversions - FromStr

If I have the integer 42, then it is quite safe to convert this to an owned string,
which is expressed by ToString.  However, if I have the string "42" then in general
the conversion into i32 must be prepared to fail.
Implementing FromStr takes two things: an implementation of the from_str method,
and setting the associated type Err to the error type returned when the conversion fails.
Usually it's used implicitly through the string parse method. This is a method with
a generic output type, which needs to be tied down,
e.g. using the so-called turbofish operator:
let answer = match "42".parse::<i32>() {
    Ok(n) => n,
    Err(e) => panic!("'42' was not 42: {}", e),
};
Or (more elegantly) in a function where we can use ?:
let answer: i32 = "42".parse()?;
The Rust standard library defines FromStr for the numerical types and for network addresses, among others.
It is of course possible for external crates to define FromStr for their types and then
they will work with parse as well.  This is a cool thing about the standard traits - they
are all open for further extension.
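As an illustration of that openness, here is a sketch of FromStr for a user-defined type. GridPos, GridPosError and the "x,y" text format are all assumptions made up for this example.

```rust
use std::num::ParseIntError;
use std::str::FromStr;

// Hypothetical type parsed from text such as "3,4".
#[derive(Debug, PartialEq)]
struct GridPos {
    x: i32,
    y: i32,
}

// Hypothetical error type, used as the associated type Err.
#[derive(Debug, PartialEq)]
enum GridPosError {
    MissingComma,
    BadNumber(ParseIntError),
}

// Lets `?` turn integer parse failures into our error type.
impl From<ParseIntError> for GridPosError {
    fn from(e: ParseIntError) -> Self {
        GridPosError::BadNumber(e)
    }
}

impl FromStr for GridPos {
    type Err = GridPosError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (x, y) = s.split_once(',').ok_or(GridPosError::MissingComma)?;
        Ok(GridPos {
            x: x.trim().parse()?,
            y: y.trim().parse()?,
        })
    }
}

fn main() {
    // parse works for GridPos just as it does for i32.
    let pos: GridPos = "3, 4".parse().expect("valid position");
    println!("{:?}", pos);
    println!("{:?}", "nonsense".parse::<GridPos>());
}
```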
Reference Conversions - AsRef

AsRef expresses the situation where a cheap reference conversion is possible
between two types.
The most common place you will see it in action is with &Path. In an ideal world,
all file systems would enforce UTF-8 names and we could just use String to
store them. However, we have not yet arrived at Utopia and Rust has a dedicated
type PathBuf with specialized path handling methods, backed by OsString,
which represents untrusted text from the OS. &Path is the borrowed counterpart
to PathBuf. It is cheap to get a &Path reference from regular Rust strings
so AsRef is appropriate:
// asref.rs
use std::path::{Path, PathBuf};

fn exists(p: impl AsRef<Path>) -> bool {
    p.as_ref().exists()
}

assert!(exists("asref.rs"));
assert!(exists(Path::new("asref.rs")));
let ps = String::from("asref.rs");
assert!(exists(&ps));
assert!(exists(PathBuf::from("asref.rs")));
This allows any function or method working with file system paths to be conveniently
called with any type that implements AsRef<Path>.  From the documentation:
impl AsRef<Path> for Path
impl AsRef<Path> for OsStr
impl AsRef<Path> for OsString
impl AsRef<Path> for str
impl AsRef<Path> for String
impl AsRef<Path> for PathBuf
Follow this pattern when defining a public API, because people are accustomed to
this little convenience.
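Here is a sketch of the pattern applied to a function of your own. has_extension is a hypothetical helper; it never touches the file system, so it behaves the same wherever it runs.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: does the path end in the given extension?
fn has_extension(p: impl AsRef<Path>, ext: &str) -> bool {
    // as_ref() gives a &Path whatever the caller passed in.
    p.as_ref().extension().and_then(|e| e.to_str()) == Some(ext)
}

fn main() {
    // All of these argument types implement AsRef<Path>.
    assert!(has_extension("notes.txt", "txt"));
    assert!(has_extension(String::from("src/main.rs"), "rs"));
    assert!(has_extension(PathBuf::from("README.md"), "md"));
    assert!(!has_extension(Path::new("Makefile"), "txt"));
    println!("all path checks passed");
}
```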
AsRef<str> is implemented for String, so we can also say:
fn is_hello(s: impl AsRef<str>) {
    assert_eq!("hello", s.as_ref());
}

is_hello("hello");
is_hello(String::from("hello"));
This seems attractive, but using this is very much a matter of taste. Idiomatic Rust code
prefers to declare string arguments as &str and lean on deref coercion
for convenient passing of &String references.
Deref

Many string methods in Rust are not actually defined on String. The methods
explicitly defined typically mutate the string, like push and push_str.
But something like starts_with applies to string slices as well.
At one point in Rust's history, this had to be done explicitly, so if you
had a String called s, you would have to say s.as_str().starts_with("hello").
You will occasionally see as_str(), but mostly method resolution happens
through the magic of deref coercion.
The Deref trait is actually used to implement the "dereference" operator *.
This has the same meaning as in C - extract the value which the reference is
pointing to - although it doesn't appear explicitly as much. If r is a reference,
then you say r.foo(), but if you did want the value, you have to say *r.
(In this respect Rust references are more like C pointers than C++ references,
which try to be indistinguishable from C++ values.)
String implements Deref<Target = str>, so the type of &*s is &str.
Deref coercion means that &String will implicitly convert into &str:
let s: String = "hello".into();
let rs: &str = &s;
"Coercion" is a strong word, but this is one of the few places in Rust
where type conversion happens silently. &String is a very
different type to &str! I still remember my
confusion when the compiler insisted that these types were distinct,
especially with operators where the convenience of deref coercion
does not happen.  A match expression compares types exactly,
and this is where s.as_str() is still necessary - &s would not work:
let s = "hello".to_string();
...
match s.as_str() {
    "hello" => {},
    "dolly" => {},
    ....
}

It's idiomatic to use string slices in function arguments, knowing that
&String will convert to &str.
Deref coercion is also used to resolve methods - if the method isn't defined
on String, then we try &str.
A similar relationship holds between Vec<T> and &[T]. Likewise, it's
not idiomatic to have &Vec<T> as a function argument type, since &[T]
is more flexible and &Vec<T> will convert to &[T].
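To make the mechanism concrete, here is a sketch that implements Deref for a small wrapper. Username, shout and total are hypothetical names; the standard library documentation suggests reserving Deref for pointer-like types, so treat this as a demonstration rather than a recommendation.

```rust
use std::ops::Deref;

// Hypothetical newtype wrapping a String.
struct Username(String);

impl Deref for Username {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

// Idiomatic argument types: slices rather than &String or &Vec<T>.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let u = Username("alice".to_string());
    // Method resolution goes through Deref: starts_with is a str method.
    assert!(u.starts_with("al"));
    // Deref coercion: &Username becomes &str at the call site.
    assert_eq!(shout(&u), "ALICE");
    // The same mechanism turns &Vec<i32> into &[i32].
    let v = vec![1, 2, 3];
    assert_eq!(total(&v), 6);
    // match does no coercion, so the dereference is spelled out.
    match &*u {
        "alice" => println!("hi alice"),
        _ => println!("who?"),
    }
}
```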
Ownership: Borrow

Ownership is an important concept in Rust; we have types like String that
"own" their data, and types like &str that can "borrow" data from
an owned type.
The Borrow trait solves a sticky problem with associative maps and sets.
Typically we would keep owned strings in a HashSet to avoid borrowing blues.
But we really don't want to create a String to query set membership!
let mut set = HashSet::new();
set.insert("one".to_string());
// set is now HashSet<String>
if set.contains("two") {
    println!("got two!");
}
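The borrowed type &str can be used instead of &String here! This works because
String implements Borrow<str>, and contains accepts a reference to any borrowed
form of the stored type, provided it hashes and compares the same way.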
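The bound that makes this work can be written out in a function of our own. In this sketch, has is a hypothetical wrapper whose signature mirrors the shape of HashSet::contains.

```rust
use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical wrapper: the key only needs to be some borrowed form Q
// of the stored type T.
fn has<T, Q>(set: &HashSet<T>, key: &Q) -> bool
where
    T: Borrow<Q> + Hash + Eq,
    Q: Hash + Eq + ?Sized,
{
    set.contains(key)
}

fn main() {
    let mut set = HashSet::new();
    set.insert("one".to_string());

    // Q = str: no String is allocated for the query.
    assert!(has(&set, "one"));
    assert!(!has(&set, "two"));
    // Q = String still works, since every T implements Borrow<T>.
    assert!(has(&set, &"one".to_string()));
    println!("lookups done");
}
```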
Iteration: Iterator and IntoIterator

The Iterator trait is interesting. You are only required to implement
one method - next() - and all that method must do is return an
Option value each time it's called. When that value is None we
are finished.
However, there are a lot of provided methods which have default
implementations in Iterator. You get map, filter, etc. for free.
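A minimal sketch of a hand-written iterator shows how little is required. Countdown is a hypothetical type; only next() is implemented, and collect, filter and map are the provided methods.

```rust
// Hypothetical iterator that counts down from n to 1.
struct Countdown {
    n: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.n == 0 {
            None // finished
        } else {
            let current = self.n;
            self.n -= 1;
            Some(current)
        }
    }
}

fn main() {
    let all: Vec<u32> = Countdown { n: 5 }.collect();
    assert_eq!(all, vec![5, 4, 3, 2, 1]);

    // filter and map come for free from the Iterator trait.
    let even_squares: Vec<u32> = Countdown { n: 5 }
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .collect();
    assert_eq!(even_squares, vec![16, 4]);
    println!("{:?} {:?}", all, even_squares);
}
```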
This is the verbose way to use an iterator:
let mut iter = [10, 20, 30].iter();
while let Some(n) = iter.next() {
    println!("got {}", n);
}
The for statement provides a shortcut:
for n in [10, 20, 30].iter() {
    println!("got {}", n);
}
The expression here actually is anything that can convert into an iterator,
which is expressed by IntoIterator.  So for n in &[10, 20, 30] {...} works
as well - a slice is definitely not an iterator, but it implements
IntoIterator.  Similarly, for i in 0..10 {...} works because a range
is itself an iterator, and iterators implement IntoIterator
(trivially).
So the for statement in Rust is specifically tied to a single trait.
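Because for only cares about IntoIterator, your own collection types can opt in. This is a sketch with a hypothetical Playlist type; it implements the trait for &Playlist so that looping borrows the collection rather than consuming it.

```rust
// Hypothetical collection type.
struct Playlist {
    songs: Vec<String>,
}

// Implemented for a reference, so iteration only borrows the playlist.
impl<'a> IntoIterator for &'a Playlist {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.songs.iter()
    }
}

// Hypothetical helper that loops over a playlist with a plain for.
fn titles(p: &Playlist) -> Vec<String> {
    let mut out = Vec::new();
    for song in p {
        out.push(song.to_uppercase());
    }
    out
}

fn main() {
    let p = Playlist {
        songs: vec!["blue".to_string(), "green".to_string()],
    };
    println!("{:?}", titles(&p));
    // The playlist is still usable after the loop.
    println!("{} songs", p.songs.len());
}
```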
Iterators in Rust are a zero-overhead abstraction, which means that usually
you do not pay a run-time penalty for using them. In fact, a hand-written
loop that indexes into a slice can be slower, because each index is
range-checked at run time unless the compiler can prove the check unnecessary.
The most general way to pass a sequence of values to a function is
to use IntoIterator. Just using &[T] is too limited and requires the caller
to build up a buffer (which could be both awkward and expensive), while taking
Iterator<Item=T> itself forces the caller to call iter() and friends explicitly.
fn sum (ii: impl IntoIterator<Item=i32>) -> i32 {
    ii.into_iter().sum()
}

println!("{}", sum(0..9));
println!("{}", sum(vec![1,2,3]));
// cloned() here makes an iterator over i32 from an iterator over &i32
println!("{}", sum([1,2,3].iter().cloned()));
Conclusion: Why are there So Many Ways to Create a String?

let s = "hello".to_string();  // ToString
let s = String::from("hello"); // From
let s: String = "hello".into(); // Into
let s = "hello".to_owned();  // ToOwned
This is a common complaint at first - people like to have one idiomatic way of
doing common operations.  And curiously enough - none of these are actual
String methods!
But all these traits are needed, since they make truly generic programming possible;
when you create strings in code, just pick one way and use it consistently.
A consequence of Rust's dependence on traits is that it can take a while
to learn to read the documentation.
Knowing what methods can be called on a type depends on what traits are implemented for that type.
However, Rust traits are not sneaky. They have to be brought into scope before their
methods can be used. For instance, you need use std::error::Error before you can
call source() on a type implementing Error (older code calls description(), which is
now deprecated).  A lot of traits are brought in by default by the Rust prelude, however.
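The scoping rule is easy to see with a trait that is not in the prelude. This is a small sketch; render is a hypothetical function, and removing the use line makes the write! call fail to compile.

```rust
use std::fmt::Write; // not in the prelude: enables write! on a String

// Hypothetical function that formats into a String.
fn render(n: i32) -> String {
    let mut s = String::new();
    // write! expands to s.write_fmt(...), a method of std::fmt::Write,
    // which only resolves because the trait is imported above.
    write!(s, "n = {}", n).expect("writing to a String does not fail");
    s
}

fn main() {
    println!("{}", render(42));
}
```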