@naftulikay
Last active March 20, 2019 22:16
layout: post
title: Rust: The Hard Parts - Part One
tags: rust, rustlang

Rust has a perception of being a very difficult language to learn. I had a similar experience, but just as I was told, there is a point where things start to get a lot easier. This post aims to describe the hard parts that I had to get through in order to start being productive with Rust in the belief that this may help others get over the hill to that sweet spot of infinite bliss and productivity.

"Nothing's ever easy, is it?"

In this post, I'm going to cover references and borrowing.

Future posts will cover lifetimes, sized and unsized types, and thread safety with Send and Sync.

I'll be providing code examples using the Rust Playground.

References and Borrowing

One of the most notoriously difficult parts about Rust, and indeed one of its selling points, is enforcement of certain rules at compile time relating to how references work and what is allowed and what isn't. I'm explicitly disambiguating this section from lifetimes, which will be covered in a future post. Additionally, I'm going to take some liberty in using words that may have a different definition than what Rust and the community use; this is done with the intention to make concepts easier to understand.

Owned Values

Broadly, Rust provides four kinds of variable bindings: two that own their value, described here, and two that reference a value owned elsewhere, described in the next section. The owned bindings are:

  • let a = MyType{}; - the a variable binding owns the value assigned to it immutably, which means that the value cannot be mutated through a.
  • let mut a = MyType{}; - the a variable binding owns the value mutably, meaning that code with access to a can change/mutate the value assigned to it.

These bindings are owned, meaning that they are not references in the way that will be described next. a in these examples exclusively owns what it is bound to, which is a new instance of the MyType struct. Any value that is owned can be moved as opposed to being borrowed.

With the exception of certain pointer types not described here, there can only be one true owner of a value, whereas there can be many references to that value. It's slightly more complicated than this, and we'll deal with the complexity of this shortly.

Unowned, Borrowed Values, AKA References

As opposed to what was defined above as owned values, there are also references to values, which have a different set of rules than owned values above:

  • let b = &a; - the b variable is a non-owned and immutable reference to a.
  • let b = &mut a; - the b variable is a non-owned and mutable reference to a. The a variable binding must be a mut binding in order to borrow a mutably.

References have an additional rule that must be obeyed:

While an owned value pointed to by a exists, there can either be:

  • Many immutable references to a.
  • One mutable reference to a.

If you have active immutable references to a, you cannot have a mutable reference to a.

If you have an active mutable reference to a, you cannot have immutable references to a.

Rust often calls these references "borrowed" values, because ownership does not change when using references: the values are borrowed and returned when they're done being used.

This can be confusing, so let's write some actual code. I'll label each example according to whether it compiles or not. Having described owned values and borrowed references to values, we'll start with references and then move on to owned values.

Immutable References

Let's start with immutable references to values:

Example 00: COMPILES

struct MyType;

fn main() {
    // a _owns_ the value
    let a = MyType{};
    // _b and _c are immutable _references_ to what `a` contains
    let _b = &a;
    let _c = &a;
}

Simply put: b and c are references to the value that a owns.

Example 01: COMPILES

struct MyType;

fn main() {
    // a _owns_ the value
    let a = MyType{};
    // b is an immutable _reference_ to what `a` contains
    let b = &a;
    // c is an immutable _reference_ to what `b` contains, which is a reference to a
    let _c = &b;
}

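Note that c is a reference to a via b: its actual type is &&MyType. Rust's automatic dereferencing (in method calls and field access) and deref coercion (when it is passed where a &MyType is expected) let it be used like a regular &MyType in most situations.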
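To make that a little more concrete, here's a minimal sketch of a &&MyType being used as if it were a &MyType. The value field, the get method, and the read function are my own additions for illustration; they aren't part of the examples above.

```rust
struct MyType {
    value: u32, // hypothetical field, added so there is something to read
}

impl MyType {
    // takes a plain `&MyType` receiver
    fn get(&self) -> u32 {
        self.value
    }
}

// accepts a plain `&MyType`
fn read(m: &MyType) -> u32 {
    m.value
}

fn main() {
    let a = MyType { value: 7 };
    let b: &MyType = &a;
    let c: &&MyType = &b;

    // method call: the compiler dereferences `c` as needed to find `get`
    println!("{}", c.get());
    // function argument: `&&MyType` is coerced to `&MyType`
    println!("{}", read(c));
    // the same thing with the dereference written out by hand
    println!("{}", read(*c));
}
```

All three calls read the same value owned by a; the only difference is whether the extra layer of reference is peeled off by the compiler or by hand.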

Mutable References

Next, we'll create a mutable reference to a value:

Example 02: COMPILES

struct MyType;

fn main() {
    // a is a mutable, owned value
    let mut a = MyType{};
    // b is a mutable _reference_ to what `a` contains
    let _b = &mut a;
}

Therefore, b is a mutable reference to what a contains.

Just as we described above, there can either be one mutable reference to a value or many immutable references to a value.

Mutable and Immutable References: Fire and Gasoline

Next, let's break things.

Example 03: DOES NOT COMPILE

struct MyType;

fn e(_: &mut MyType) {}
fn f(_: &MyType) {}

fn main() {
    let mut a = MyType{};
    let b = &mut a;
    let c = &a;
    // rustc ends a borrow at the reference's last use, so both references must be used below for the conflict to be reported
    // call `e` with a mutable reference to `a`, i.e. `b`
    e(b);
    // call `f` with an immutable reference to `a`, i.e. `c`
    f(c);
}

Can you spot why this doesn't compile? A mutable reference to a, which is b, and an immutable reference to a, which is c, are both in use at the same time. Try removing one or the other to see it compile. There are two obvious ways to get it to compile:

  • Remove the definition and use of b.
  • Remove the definition and use of c.
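There's also a third option that keeps both references: make sure their uses don't overlap. The sketch below assumes a reasonably recent rustc (one where a borrow ends at the reference's last use rather than at the end of the scope). The value field and the bodies of e and f are assumptions I've added so the result is observable.

```rust
struct MyType {
    value: u32, // hypothetical field, not in the example above
}

fn e(m: &mut MyType) {
    m.value += 1;
}

fn f(m: &MyType) -> u32 {
    m.value
}

fn bump_then_read() -> u32 {
    let mut a = MyType { value: 1 };
    let b = &mut a;
    // last use of `b`: the mutable borrow ends after this call
    e(b);
    // the mutable borrow is over, so an immutable borrow is allowed now
    let c = &a;
    f(c)
}

fn main() {
    println!("{}", bump_then_read());
}
```

The only change from the failing example is the order: b is completely finished before c is created, so the two borrows never exist at the same time.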

NOTE: Rust has very good reasons for ownership/borrowing rules, namely preventing a number of memory safety and concurrency bugs. I'm not going to describe why Rust does this, as there are plenty of articles out there describing the why and the what.

Okay, now we've covered the rules and seen some code, so let's talk about moving and copying values.

Moving Owned Values

References are easy to acquire and easy to pass around, provided that you follow the rules set out above, but we now need to revisit what owned values are. We used a simplistic definition above which is nevertheless still valid:

There can be only one true owner of a value¹, and zero or more references to that value.²

  • ¹: Usually.
  • ²: Adherent to the rules spelled out above.

So far, we've only seen passing references around, but we haven't yet passed actual values around. Let's do that now.

Example 04: COMPILES

struct MyType;

fn main() {
    let a = MyType{};
    // move `a` into `b`; henceforth, only `b` owns the value and `a` is "destroyed"
    let _b = a;
}

This code moves the value that a owned into b. There are no references involved here. If we try to use a after moving it into b, compilation will fail:

Example 05: DOES NOT COMPILE

struct MyType;

fn f(_: &MyType) {}

fn main() {
    let a = MyType{};
    // move `a`'s value into `b`
    let _b = a;
    // pass a reference to `a` into the `f` function
    f(&a);
}

This does not compile because a, as it were, was "destroyed" by moving its value into b.
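If you do need both a and b to be usable, two common ways out are to clone the value explicitly or to borrow instead of moving. Here's a rough sketch of both; the value field, the Clone derive, and the function names are assumptions of mine rather than part of the example above.

```rust
#[derive(Clone)]
struct MyType {
    value: u32, // hypothetical field, added so `f` can return something
}

fn f(m: &MyType) -> u32 {
    m.value
}

fn clone_then_use() -> u32 {
    let a = MyType { value: 5 };
    // `_b` owns an independent clone; `a` still owns the original
    let _b = a.clone();
    f(&a)
}

fn borrow_instead_of_move() -> u32 {
    let a = MyType { value: 5 };
    // `b` only borrows the value; ownership stays with `a`
    let b = &a;
    f(b) + f(&a)
}

fn main() {
    println!("{}", clone_then_use());
    println!("{}", borrow_instead_of_move());
}
```

Cloning is explicit and may be expensive depending on the type, which is why Rust makes you ask for it with .clone().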

Copying Values

There is, however, an exception to the move behavior described above. If a type implements std::marker::Copy, the value is copied where a move would normally occur, and the original binding remains usable:

Example 06: COMPILES

fn f(_: &u64) {}

fn main() {
    // u64 implements Copy
    let a = 0u64;
    // since u64 implements Copy, it isn't moved here, instead its value is copied into the new binding
    let _b = a;
    // what failed to compile above compiles without issue here because of copying
    f(&a);
}

As the code above illustrates, the previous example that did not compile for MyType does in fact compile for u64 because u64 implements std::marker::Copy. Most primitive types in Rust, like integers and floats, are Copy because the cost of copying such values is extremely low or nonexistent.

The docs describe Copy types as:

Types whose values can be duplicated simply by copying bits.
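Your own types can opt in to this behavior too, as long as every field is itself Copy. A small sketch, using a hypothetical Point struct that doesn't appear elsewhere in this post:

```rust
// `Copy` requires `Clone`, and every field must itself be `Copy`
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn sum(p: &Point) -> i32 {
    p.x + p.y
}

fn copy_then_use() -> i32 {
    let a = Point { x: 1, y: 2 };
    // `Point` is `Copy`, so this duplicates the bits instead of moving
    let mut b = a;
    b.x = 10;
    // `a` is still valid and unchanged; `b` is a separate value
    sum(&a) + sum(&b)
}

fn main() {
    println!("{}", copy_then_use());
}
```

Note that changing b has no effect on a: they are two separate values after the copy.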

Modifying Fields and Values

Mutability in Rust generally means that you can:

  1. Modify fields of a struct.
  2. Change a variable binding to point to something else.

Let's demonstrate both in the next example:

Example 07: COMPILES

struct MyType {
    value: u32
}

fn modify_field(i: &mut MyType) {
    i.value = 2;
}

fn modify_ref(i: &mut MyType) {
    *i = MyType { value: 1337 };
}

fn main() {
    // create a mutably owned struct
    let mut a = MyType { value: 1 };
    // pass a mutable reference to modify_field; can only create &mut from a mutable value
    modify_field(&mut a);

    // create another owned, mutable struct
    let mut b = MyType { value: 2 };
    // replace the entire value of `b` through a mutable reference
    modify_ref(&mut b);
}

First, we use modify_field, which takes a mutable reference to a MyType struct, to modify the value field of that struct. Next, we use modify_ref, which again takes a mutable reference to a MyType struct, to change the entire value that b is bound to.
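The two meanings of mutability listed above are easy to blur together, so here's a small sketch that separates them: writing through a mutable reference versus pointing a mutable binding at something else. The function and variable names are made up for this illustration.

```rust
fn write_then_repoint() -> (u8, u8, u8) {
    let mut x = 1u8;
    let y = 10u8;

    // `r` is an immutable binding holding a mutable reference:
    // we can write through it, but not make `r` point elsewhere
    let r = &mut x;
    *r = 2;

    // `p` is a mutable binding holding an immutable reference:
    // we can make `p` point elsewhere, but not write through it
    let mut p = &x;
    let before = *p;
    p = &y;

    (x, before, *p)
}

fn main() {
    println!("{:?}", write_then_repoint());
}
```

In the first case the value of x changes; in the second, x and y are untouched and only p itself changes.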

Supercut 😎

We have now covered everything that we can without getting into lifetimes:

  1. Owned Immutable Values
  2. Owned Mutable Values
  3. Immutable References
  4. Mutable References
  5. Moving Values
  6. Copying Values
  7. Modifying Fields and Values via Mutable References

Let's now just run through two examples demonstrating all of the above.

Example 08: COMPILES

struct MyType;

fn mv_immutable(_a: MyType) {
    // within here, `_a` is an immutable value by default
}

fn mv_mutable(mut _a: MyType) {
    // within here, `_a` is a mutable value
}

fn ref_immutable(_: &MyType) {}
fn ref_mutable(_: &mut MyType) {}

fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into the `mv_immutable` function
    mv_immutable(a);

    // create owned value and store in `b`
    let b = MyType{};
    // destroy `b` by moving its value into the `mv_mutable` function
    mv_mutable(b);

    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to `c` to the `ref_immutable` function, allowing it to borrow `c`
    ref_immutable(&c);

    // create owned value and store in `d`
    let mut d = MyType{};
    // pass a mutable reference to `d` to the `ref_mutable` function, allowing it to mutably borrow `d`
    ref_mutable(&mut d);
}

You can break this by attempting to access a or b after calling mv_immutable or mv_mutable, as these functions move the values into the functions, destroying the previous variable binding.

It's important to keep our definitions of owned values and references to owned values separate, as seen in this example:

struct MyType;

fn main() {
    let a = MyType;
    let mut a = a;
    let _b = &mut a;
}

At first, a is an immutable owned value; it is then moved into a new mutable binding that shadows the old name, and finally _b is created as a mutable reference to a. We see above that an immutable owned value can be upgraded to a mutable owned value by moving it into a mut binding, since the owner has exclusive control over the value. With references and borrows, however, it is not possible to upgrade an immutable reference into a mutable reference, at least not safely.
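The same idea applies to function parameters: whether a parameter is declared mut only affects the binding inside the function, not how the value is moved in. Here's a sketch, with a value field and function names that I've made up for illustration:

```rust
struct MyType {
    value: u32, // hypothetical field, added so mutation is visible
}

// `a` arrives as an immutable binding...
fn takes_immutable(a: MyType) -> MyType {
    // ...but this function now owns the value, so it may rebind it as mutable
    let mut a = a;
    a.value += 1;
    a
}

// declaring the parameter `mut` skips the rebinding step above;
// callers pass the value in exactly the same way
fn takes_mutable(mut a: MyType) -> MyType {
    a.value += 1;
    a
}

fn main() {
    // the caller's binding is not `mut` in either case
    let a = MyType { value: 1 };
    let a = takes_immutable(a);
    let a = takes_mutable(a);
    println!("{}", a.value);
}
```

In both functions the value is simply moved in; the only difference is whether the function declared its own binding as mutable.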

Let's finish up this section by seeing the same thing with struct methods:

Example 09: COMPILES

struct MyType;

impl MyType {
    fn mv_immutable(self) {}
    fn mv_mutable(mut self) {}
    fn ref_immutable(&self) {}
    fn ref_mutable(&mut self) {}
}

fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into its `mv_immutable` function
    a.mv_immutable();

    // create owned value and store it in `b`
    let b = MyType{};
    // destroy `b` by moving its value into the `mv_mutable` method; the move is the same as for `mv_immutable`
    // note that `b` isn't `mut`, it doesn't need to be; since there can only be one owner of a value,
    // the method may declare its own `self` binding as mutable once it owns the value
    b.mv_mutable();

    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to its `ref_immutable` function
    c.ref_immutable();

    // create owned value and store it in `d`
    let mut d = MyType{};
    // pass a mutable reference to its `ref_mutable` function
    d.ref_mutable();
}

As above, try violating some of the rules:

  • Using a or b after calling mv_immutable or mv_mutable will break compilation, as they've been destroyed.
  • If d is not declared mut, it will be impossible to call d.ref_mutable().

Alright! We've seen ownership, references, mutability, moving, and copying. Play around with the examples on the playground to see how and why compilation breaks, and especially take note of how awesome rustc is in telling you exactly why things don't work.

Conclusion

I was originally planning on cramming a lot more into this post, but I decided to split things up because there's a LOT of ground to cover.

In summary, here's what you need to know:

  • A value can (usually) only have one owner, not zero, and not more than one.
  • Variable bindings are either mutable or immutable.
  • Moving a value to another owner invalidates the original binding; using it afterwards is a compile error.
  • A value cannot be moved while references to it are still in use, so a move never leaves a dangling reference behind.
  • If a type is std::marker::Copy, its value is copied in what would normally be a move.
  • A reference, or a "borrow," of a value can be mutable or immutable.
  • There can either be exactly one mutable reference to a value or many immutable references. Never both at the same time.

Next in the series, I will cover lifetimes, sized and unsized types, and thread safety with Send and Sync.

@naftulikay
Author

naftulikay commented Mar 20, 2019

TODO

  • Add binding modification *u8_ref = 2; and *struct_ref = MyType{};
  • Update copy: copy types are types whose values can be duplicated simply by copying bits.
  • Example 7: mut means different things.

I will most likely add a couple things:

  • *u8_ref = 2;
  • *struct_ref = MyType{};

@naftulikay
Author

Perhaps mention buffer overflows and use-after-free in addition to or in replacement of data races in example 3.

@naftulikay
Author

example 07 is a bit tough to describe what the problem is: perhaps the easiest way to say it is that the mut keyword has different meanings depending on context
e.g. aside from the obvious type mismatch, mut a: MyType and a: &mut MyType aren't exactly the same
the first mut is modifying the identifier itself, declaring that it will be mutable in the scope that the variable lives, whereas the second mut is part of the binding's type
so saying that b is destroyed by "moving its value mutably into the mv_mutable function" doesn't sound quite correct; there shouldn't be a distinction between an immutable move and a mutable move
in both cases (immutable and mutable), you are still moving values into their respective functions - the only difference is whether their respective functions declared them to be mutable parameters
perhaps an example of what i mean would make it clearer: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=62a2ec4c326b94eb3767b97ad1c3b298
the key thing to note here is that you can still assign a to a mutable binding even if it was passed in "immutably"
