@dphilipson
Last active March 8, 2022 02:26
How to Read Rust

How to Read Rust without Knowing Rust

So you need to read a Rust program and you've never seen Rust before?

Rust has a well-earned reputation for being a difficult language because of its strict ownership rules, complex type system, and general assumption that anyone learning it has unlimited time and patience. But there's good news! Just because Rust is hard to write doesn't mean it has to be hard to read.

In fact, once you learn a few basics, you might find that Rust is a fairly easy language to read. Sure, that Rc<RefCell<Box<dyn Iterator<Item = String>>>> might have taken someone several hours to write, but as a reader you'll be able to glance at it and figure out enough of what it means to move on in a second or two.

The goal of this document is not to teach you Rust. If you want to learn Rust, you should go read the Rust book, which is really good! I'm going to stay short on details and try to hit the main things you need to know in order to figure out what a Rust program is doing without really knowing the language at all.

Return values

In Rust, the final expression of a function, written without a trailing semicolon, is its return value, so there is no need to explicitly write return. You can write out return if you want, but that's typically only needed for early returns.

fn load_age(user_id: i32) -> i32 {
    let user = load_user(user_id);
    // No trailing semicolon: this expression is the return value.
    user.age
}
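
To see an implicit return next to an early return, here is a small sketch. The function classify_age and its thresholds are made up for illustration and are not part of any real program.

```rust
// Hypothetical example: `classify_age` is invented for illustration.
fn classify_age(age: i32) -> &'static str {
    // An explicit `return` is used for the early exit.
    if age < 0 {
        return "invalid";
    }
    // The final expression has no trailing semicolon, so it is the return value.
    if age < 18 { "minor" } else { "adult" }
}
```

Notice that the if/else on the last line is itself an expression, which is why it can serve as the return value.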

Lambdas

Lambda expressions look like this:

[1, 2, 3].map(|x| 2 * x)  // -> [2, 4, 6]

or, spread over multiple lines:

[1, 2, 3].map(|x| {
    let doubled = 2 * x;
    doubled * doubled
})
// -> [4, 16, 36]

Again notice the lack of an explicit return.

Although there's nothing scary about this syntax, one place that can confuse newcomers is when it's used with no arguments:

run_twice(|| {
    println!("Hello!")
})

This is the same syntax as before, but it can be confusing because the || looks like a boolean-or. It's actually an empty argument list.
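
For a version you can compile, here is a rough sketch of what a run_twice helper might look like. Its definition is my assumption (the original only shows the call), and you can skip over the F: FnMut() part of the signature, which just says "a closure that takes no arguments".

```rust
// Hypothetical helper: calls the given zero-argument closure two times.
fn run_twice<F: FnMut()>(mut f: F) {
    f();
    f();
}

// Hypothetical caller: counts how often the closure ran.
fn count_calls() -> i32 {
    let mut calls = 0;
    // `||` is the empty argument list; the block after it is the closure body.
    run_twice(|| {
        calls += 1;
    });
    calls
}
```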

Structs

Structs are pretty straightforward:

struct Point {
    x: f64,
    y: f64,
}

The syntax for adding methods to a struct might be a little unfamiliar, but it's not too bad. It looks like this, separate from the struct declaration above:

impl Point {
    fn magnitude(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }

    fn negate(&self) -> Self {
        Point { x: -self.x, y: -self.y }
    }
}
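
Calling these methods looks the way you'd expect from other languages. The sketch below repeats the struct and its impl block so that it compiles on its own; the function magnitude_of_negated is a made-up caller.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn magnitude(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }

    fn negate(&self) -> Self {
        Point { x: -self.x, y: -self.y }
    }
}

// Hypothetical caller: builds a Point with struct-literal syntax,
// then calls its methods with ordinary dot syntax.
fn magnitude_of_negated() -> f64 {
    let p = Point { x: 3.0, y: 4.0 };
    let q = p.negate();
    q.magnitude()
}
```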

Enums

Enums might surprise you a little bit. Rust enums are really what you would think of as union types or sum types in other languages:

enum WebEvent {
    PageLoad,
    KeyPress(char),
    Paste(String),
    ClickAt(i32, i32),
}

Note how the variants can have fields, and different variants can have different numbers and types of fields.

There are two enums in particular that are ubiquitous. The first is Option<T>, which represents either a value of type T or nothing, analogous to Java's Optional<T>:

enum Option<T> {
    Some(T),
    None,
}

The other is Result<T, E>, which is either a success value of type T or an error of type E:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

Rust doesn't have exceptions, so functions that can fail will declare their return type as a Result. More on this in a bit.
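
Here is a sketch of what functions returning these two enums tend to look like. Both function names are invented for illustration.

```rust
// Hypothetical function: there may or may not be a first character.
fn first_char(s: &str) -> Option<char> {
    // `chars().next()` already returns an Option<char>.
    s.chars().next()
}

// Hypothetical function: division can fail, so the return type is a Result.
fn safe_divide(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err("division by zero".to_string())
    } else {
        Ok(a / b)
    }
}
```

When skimming, the return type alone tells you a lot: Option means "might be absent" and Result means "might fail".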

Pattern matching

This may be a new concept for you depending on what languages you've seen before. It may also be recognizable to you as a much less painful version of the visitor pattern.

Pattern matching is used when you want logic that branches on which variant of an enum you have. You can recognize it by the match keyword followed by a series of branches, typically one for each of the enum variants. For example:

let possible_user: Option<User> = maybe_get_user();
let first_name = match possible_user {
    Some(user) => user.first_name,
    None => "<unknown>".to_string(),
};

In this example, that first branch, Some(user) => user.first_name, means:

"If possible_user, which is of type Option<User>, is in the Some variant, then assign the value contained inside it to the variable named user and then evaluate user.first_name."

Similarly, the second branch None => "<unknown>".to_string() means:

"If possible_user is in the None variant, then evaluate "<unknown>".to_string()." (Assuming first_name is a String, the .to_string() call is there so that both branches produce the same type.)

In either case, the result is then assigned to first_name.

Pattern matching is awesome. It's really too bad that not all languages have it.
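
Matching gets more interesting when the variants carry fields. Here is a sketch using the WebEvent enum from earlier; the describe function is made up for illustration.

```rust
enum WebEvent {
    PageLoad,
    KeyPress(char),
    Paste(String),
    ClickAt(i32, i32),
}

// Hypothetical function: one branch per variant, each binding that variant's fields.
fn describe(event: WebEvent) -> String {
    match event {
        WebEvent::PageLoad => "page loaded".to_string(),
        WebEvent::KeyPress(c) => format!("pressed '{}'", c),
        WebEvent::Paste(text) => format!("pasted \"{}\"", text),
        WebEvent::ClickAt(x, y) => format!("clicked at ({}, {})", x, y),
    }
}
```

If a variant were missing from the match, the compiler would reject the code, so as a reader you can generally trust that a match covers every case.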

Error handling with ?

We mentioned above that Result<T, E>, representing either a success of type T or a failure of type E, is used as a return type for functions that might fail. This means we might find ourselves writing code that looks like this:

// This is overly verbose. Better version later.
fn load_nfts_owned_by(user_id: i32) -> Result<Vec<Nft>, NftError> {
    let maybe_user = load_user(user_id);
    let user = match maybe_user {
        Ok(u) => u,
        Err(err) => return Err(err),
    };
    let maybe_nft_ids = load_nft_ids(user);
    let nft_ids = match maybe_nft_ids {
        Ok(ids) => ids,
        Err(err) => return Err(err),
    };
    load_nfts(nft_ids)
}

The idea here is that each of the intermediate steps load_user and load_nft_ids can fail and so is returning an intermediate Result. For each of these intermediate results, if it was a failure, we want to return that failure immediately, while if it was a success we want to unwrap the success value and continue. This might feel vaguely familiar to Go users.

This pattern is so common that there is a shorthand to write the same thing much more briefly:

fn load_nfts_owned_by(user_id: i32) -> Result<Vec<Nft>, NftError> {
    let user = load_user(user_id)?;
    let nft_ids = load_nft_ids(user)?;
    load_nfts(nft_ids)
}

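This version behaves the same as the one above. (Strictly speaking, ? also converts the error into the function's declared error type when the two differ, but as a reader you can ignore that.) Note the ? syntax. In general, if result is a Result<T, E>, then result? has type T and means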

"If this Result is an error, then return it immediately. Otherwise, this evaluates to the success value and we continue."

This means that when skimming code, you can look for ? to spot the operations that can fail, which tend to be the "important" ones.
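
The NFT example relies on functions that aren't shown, so here is a self-contained sketch of ? using only the standard library. The function add_strs is invented for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical function: parses two numbers and adds them.
// Each `?` marks a step that can fail and return early with the error.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}
```

Note that the success value on the last line has to be wrapped in Ok, since the function's return type is a Result rather than a plain i32.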

References

One of the things that makes Rust hard to learn is understanding memory safety and references. You'll see lots of different types that represent different flavors of references, like:

  • String
  • mut String
  • &String
  • &mut String
  • Box<String>
  • RefCell<String>
  • Rc<String>
  • Arc<String>
  • Cow<str>
  • impl AsRef<str>

My advice for a reader is to just not worry about these. Just think of all of these as a string and move on. In languages like Java or TypeScript, most of these would just be String.
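
To back up the "it's all just a string" advice, here is a sketch in which one function accepts several of these flavors unchanged. The names shout and shout_all are made up.

```rust
use std::rc::Rc;

// Hypothetical function: takes a borrowed string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

// Hypothetical caller: three different wrappers around the same kind of data.
fn shout_all() -> Vec<String> {
    let owned: String = String::from("a");
    let boxed: Box<String> = Box::new(String::from("b"));
    let shared: Rc<String> = Rc::new(String::from("c"));
    // The compiler automatically dereferences each wrapper down to the string inside.
    vec![shout(&owned), shout(&boxed), shout(&shared)]
}
```

The wrappers differ in who owns the string and how it is shared, but the code that uses the string reads the same in every case.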

Traits

Traits are Rust's version of interfaces. The syntax to declare a trait looks like this:

trait Named {
    fn get_name(&self) -> String;
}

and to declare a struct that implements it:

struct User {
    name: String,
    age: i32,
}

impl Named for User {
    fn get_name(&self) -> String {
        // Clone, because a method taking &self cannot move the name out of the struct.
        self.name.clone()
    }
}

Other than the syntax, there shouldn't be anything too surprising there. If you just think of traits as being the same as Java's interfaces, you'll be fine, although traits are actually more powerful in a couple of ways. Here's an example of one:

trait JsonSerializable {
    fn from_json(json: String) -> Self;
    fn to_json(&self) -> String;
}

Notice how the first function on this trait does not take self as an input and thus does not require that you already have an instance to call it. This allows this trait to specify that implementing types have functions to convert the type to JSON and back, which cannot be expressed with Java interfaces. In Java terms, we can think of this as an interface that enforces that implementing classes provide certain static methods.
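
Here is a sketch of a type implementing this trait. The Flag struct is hypothetical, chosen because its JSON form ("true" or "false") can be handled without a parsing library; a real implementation would probably use one.

```rust
trait JsonSerializable {
    fn from_json(json: String) -> Self;
    fn to_json(&self) -> String;
}

// Hypothetical implementing type: a wrapper around a single boolean.
struct Flag {
    on: bool,
}

impl JsonSerializable for Flag {
    fn from_json(json: String) -> Self {
        // Assumption for this sketch: anything other than "true" is treated as false.
        Flag { on: json.trim() == "true" }
    }

    fn to_json(&self) -> String {
        self.on.to_string()
    }
}

// Hypothetical caller: converts JSON text to a Flag and back.
fn round_trip(json: &str) -> String {
    // No instance is needed to call `from_json`; it is called on the type itself.
    let flag = Flag::from_json(json.to_string());
    flag.to_json()
}
```

The Type::function(...) syntax with a double colon is how you spot a call to one of these "static" functions when reading.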

Traits and generics

Once traits start showing up, you'll start to see generics show up too. Unlike interfaces in other languages, we can't just use the name of a trait in the same places we would use a concrete type. So for example if we have a trait called Yodeler, then we can't write this:

// Not allowed
fn perform_yodel_show(yodeler: Yodeler) {
    // ...
}

Instead, we need to declare a generic type parameter and give it a trait bound. We can do so in any of the following ways, which are all equivalent:

fn perform_yodel_show<T: Yodeler>(yodeler: T) {
    // ...
}

fn perform_yodel_show<T>(yodeler: T)
where
    T: Yodeler,
{
    // ...
}

// This form is a shorthand that can't be used for more complex examples.
fn perform_yodel_show(yodeler: impl Yodeler) {
    // ...
}

For your purposes while reading, you can mentally replace any of these with the "not allowed" example from just above to match the way it would appear in other languages. The reason for this syntax has to do with Rust attempting to squeeze out every last drop of performance by avoiding dynamic dispatch unless explicitly requested. But again, as a reader you do not need to worry about that.
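
If you're curious what "explicitly requested" dynamic dispatch looks like, it shows up as the dyn keyword. The sketch below is illustrative only: the yodel method, the Goatherd struct, and perform_dyn_yodel_show are all my assumptions.

```rust
// Hypothetical trait body and implementing type, invented for this sketch.
trait Yodeler {
    fn yodel(&self) -> String;
}

struct Goatherd {
    name: String,
}

impl Yodeler for Goatherd {
    fn yodel(&self) -> String {
        format!("{} yodels", self.name)
    }
}

// Generic form: the concrete type is known at compile time (static dispatch).
fn perform_yodel_show(yodeler: impl Yodeler) -> String {
    yodeler.yodel()
}

// `dyn` form: the method is looked up at runtime (dynamic dispatch).
fn perform_dyn_yodel_show(yodeler: &dyn Yodeler) -> String {
    yodeler.yodel()
}
```

Either way, the reading is the same: the function takes "something that can yodel".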

Macros

Rust has macros, which can programmatically generate code during compilation. You can tell that something is a macro because its name will end with !. A common one is the println formatter:

println!("Hello, {}. Your score is {}", name, score);

Because println! is a macro, it can take variadic arguments and validate at compile time that the number of arguments matches the number of format placeholders {}.

You don't need to know much about macros other than that they exist, and that if you see something that looks like a function but whose name ends with !, then it's allowed to break the normal Rust rules with regard to syntax.
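
Two other standard-library macros you're likely to run into are format! and vec!. Here is a small sketch; the wrapper function macro_examples and its values are made up.

```rust
// Hypothetical function showing two common standard-library macros.
fn macro_examples() -> (String, Vec<i32>) {
    let name = "Ada";
    let score = 42;
    // `format!` takes the same arguments as `println!` but returns the String
    // instead of printing it.
    let message = format!("Hello, {}. Your score is {}", name, score);
    // `vec!` is called with square brackets, which a normal function could not be.
    let numbers = vec![1, 2, 3];
    (message, numbers)
}
```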

Lifetimes

Sometimes you'll see things that look like type parameters, but lowercase and prefixed with a '. For example:

fn find_substr<'a, 'b>(haystack: &'a str, needle: &'b str) -> &'a str {
    // ...
}

If you see these, just pretend they don't exist. These are lifetime annotations, which inform the compiler how long it can assume these references point to valid memory. Nothing that concerns you as a reader, so act like you saw nothing.

Well okay. Let's say you wanted to learn something from the lifetime annotations. In the example above, we see that the returned string has the same lifetime as the first argument but not the second. This implies that the returned string is a reference into some part of the first argument. For a function with a name like find_substr, this probably means that the returned value is some particular substring inside the first argument. You can actually get a lot of information from lifetime annotations, but to do so you need a bit of knowledge of Rust's ownership model.
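
To make that concrete, here is one way find_substr might be written. The body is my guess, and returning an empty string when nothing is found is an assumption made for this sketch.

```rust
// A sketch of a possible implementation; only the signature comes from the text above.
fn find_substr<'a, 'b>(haystack: &'a str, needle: &'b str) -> &'a str {
    match haystack.find(needle) {
        // The returned slice points into `haystack`, matching the `'a` annotation.
        Some(start) => &haystack[start..start + needle.len()],
        // Assumption: return an empty string when there is no match.
        None => "",
    }
}
```

Because the result is tied only to 'a, the caller is free to throw away the needle while still holding on to the returned substring.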
