So you need to read a Rust program and you've never seen Rust before?
Rust has a well-earned reputation for being a difficult language because of its strict ownership rules, complex type system, and general assumption that anyone learning it has unlimited time and patience. But there's good news! Just because Rust is hard to write doesn't mean it has to be hard to read.
In fact, once you learn a few basics, you might find that Rust is a fairly easy language to read. Sure, that Rc<RefCell<Box<dyn Iter<String>>>> might have taken someone several hours to write, but as a reader you'll be able to glance at that and figure out enough of what it means to move on in a second or two.
The goal of this document is not to teach you Rust. If you want to learn Rust, you should go read the Rust book, which is really good! I'm going to be short on details and try to hit the main things you need to know in order to figure out what a Rust program is doing without really knowing the language at all.
In Rust, the last expression of a function, written without a trailing semicolon, is its return value, with no need to explicitly write return. You can write out return if you want, but that's typically only needed for early returns.
fn load_age(user_id: i32) -> i32 {
    let user = load_user(user_id);
    user.age
}
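To make the contrast concrete, here is a minimal sketch with one early return and one implicit return. The function safe_divide is hypothetical and invented for this illustration.

```rust
// Hypothetical helper, invented for illustration only.
fn safe_divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        // Early exit: this is where an explicit `return` is needed.
        return 0;
    }
    // No semicolon: this expression is the function's return value.
    a / b
}

fn main() {
    println!("{}", safe_divide(10, 2));
    println!("{}", safe_divide(1, 0));
}
```

When skimming, a line without a trailing semicolon at the end of a function body is a good hint that you are looking at the returned value.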
Lambda expressions look like this:
[1, 2, 3].map(|x| 2 * x) // -> [2, 4, 6]
or over multiple lines:
[1, 2, 3].map(|x| {
    let doubled = 2 * x;
    doubled * doubled
})
// -> [4, 16, 36]
Again, notice the lack of an explicit return.
Although there's nothing scary about this syntax, one place that can confuse newcomers is when it's used with no arguments:
run_twice(|| {
    println!("Hello!")
})
This is the same syntax as before, but can be confusing because the || looks like a boolean-or, but it's actually an empty argument list.
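For a sense of what sits on the other side of such a call, here is one possible sketch of run_twice. The definition is an assumption made for illustration; the version in a real program could look different.

```rust
// Hypothetical definition of `run_twice`: it accepts any closure that takes
// no arguments and calls it two times.
fn run_twice(mut action: impl FnMut()) {
    action();
    action();
}

fn main() {
    let mut count = 0;
    // `||` is the empty argument list; the braces hold the closure body.
    run_twice(|| {
        count += 1;
    });
    println!("the closure ran {} times", count);
}
```

Note that the closure can read and modify the count variable from the surrounding function, which is a large part of why closures are used so often.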
Structs are pretty straightforward:
struct Point {
    x: f64,
    y: f64,
}
The syntax for adding methods to a struct might be a little unfamiliar, but it's not too bad. It looks like this, separate from the struct declaration above:
impl Point {
    fn magnitude(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
    fn negate(&self) -> Self {
        Point { x: -self.x, y: -self.y }
    }
}
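Putting the two pieces together, the following sketch shows how a Point is created and how its methods are called. The struct and impl blocks are repeated so that the example is self-contained; the values in main are made up.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn magnitude(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
    fn negate(&self) -> Self {
        Point { x: -self.x, y: -self.y }
    }
}

fn main() {
    // Struct literal syntax: the type name followed by named fields.
    let p = Point { x: 3.0, y: 4.0 };
    // Methods are called with a dot, as in most languages.
    let q = p.negate();
    println!("{} {} {}", p.magnitude(), q.x, q.y);
}
```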
Enums might surprise you a little bit. Rust enums are really what you would think of as union types or sum types in other languages:
enum WebEvent {
    PageLoad,
    KeyPress(char),
    Paste(String),
    ClickAt(i32, i32),
}
Note how the variants can have fields, and different variants can have different numbers and types of fields.
There are two enums in particular that are ubiquitous. First, Option<T>, which represents either a value of type T or nothing, analogous to Java's Optional<T>:
enum Option<T> {
    Some(T),
    None,
}
The other is Result<T, E>, which is either a success value of type T or an error of type E:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
Rust doesn't have exceptions, so functions that can fail will declare their return type as a Result. More on this in a bit.
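As a rough sketch of how these two types show up in function signatures, here are two small functions built on standard library calls. The names first_char and parse_age are hypothetical and exist only for this example.

```rust
use std::num::ParseIntError;

// Returns Some(first character), or None when the string is empty.
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

// Returns Ok(number) on success, or Err(reason) when the text is not a
// valid number between 0 and 255.
fn parse_age(text: &str) -> Result<u8, ParseIntError> {
    text.trim().parse::<u8>()
}

fn main() {
    println!("{:?}", first_char("rust"));
    println!("{:?}", first_char(""));
    println!("{:?}", parse_age("42"));
    println!("{:?}", parse_age("not a number"));
}
```

When reading a signature, an Option return type tells you the function may have nothing to give back, and a Result return type tells you it can fail.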
Pattern matching may be a new concept for you depending on what languages you've seen before. It may also be recognizable to you as a much less painful version of the visitor pattern.
Pattern matching is used when you want logic that branches on which variant of an enum you have. You can recognize it by the match keyword followed by a branch for each of the enum variants. For example,
let possible_user: Option<User> = maybe_get_user();
let first_name = match possible_user {
    Some(user) => user.first_name,
    None => "<unknown>".to_string(),
};
In this example, the first branch, Some(user) => user.first_name, means:
"If possible_user, which is of type Option<User>, is in the Some variant, then assign the value contained inside it to the variable named user and then evaluate user.first_name."
Similarly, the second branch, None => "<unknown>".to_string(), means:
"If possible_user is in the None variant, then evaluate "<unknown>".to_string()."
In either case, the result is then assigned to first_name. (The .to_string() call turns the string literal into an owned String so that both branches produce the same type.)
Pattern matching is awesome. It's really too bad that not all languages have it.
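The same idea extends to enums with more than two variants. As a sketch, here is a match over the WebEvent enum from earlier; the describe function and its messages are invented for this example.

```rust
enum WebEvent {
    PageLoad,
    KeyPress(char),
    Paste(String),
    ClickAt(i32, i32),
}

// Hypothetical function: turns each kind of event into a readable message.
fn describe(event: WebEvent) -> String {
    match event {
        // A variant with no fields has nothing to bind.
        WebEvent::PageLoad => "page loaded".to_string(),
        // Each field of a variant is bound to a new variable name.
        WebEvent::KeyPress(key) => format!("pressed '{}'", key),
        WebEvent::Paste(text) => format!("pasted {} bytes", text.len()),
        WebEvent::ClickAt(x, y) => format!("clicked at ({}, {})", x, y),
    }
}

fn main() {
    println!("{}", describe(WebEvent::PageLoad));
    println!("{}", describe(WebEvent::KeyPress('q')));
    println!("{}", describe(WebEvent::Paste("hello".to_string())));
    println!("{}", describe(WebEvent::ClickAt(10, 20)));
}
```

The compiler rejects a match that leaves out a variant, so as a reader you can trust that the branches you see cover every case.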
We mentioned above that Result<T, E>, representing either a success of type T or a failure of type E, is used as a return type for functions that might fail. This means we might find ourselves writing code that looks like this:
// This is overly verbose. Better version later.
fn load_nfts_owned_by(user_id: i32) -> Result<Vec<Nft>, NftError> {
    let maybe_user = load_user(user_id);
    let user = match maybe_user {
        Ok(u) => u,
        Err(err) => return Err(err),
    };
    let maybe_nft_ids = load_nft_ids(user);
    let nft_ids = match maybe_nft_ids {
        Ok(ids) => ids,
        Err(err) => return Err(err),
    };
    load_nfts(nft_ids)
}
The idea here is that each of the intermediate steps, load_user and load_nft_ids, can fail, and so each returns an intermediate Result. For each of these intermediate results, if it was a failure, we want to return that failure immediately, while if it was a success we want to unwrap the success value and continue. This might feel vaguely familiar to Go users.
This pattern is so common that there is a shorthand to write the same thing much more briefly:
fn load_nfts_owned_by(user_id: i32) -> Result<Vec<Nft>, NftError> {
    let user = load_user(user_id)?;
    let nft_ids = load_nft_ids(user)?;
    load_nfts(nft_ids)
}
This version is exactly equivalent to the one above. Note the ? syntax. In general, if result is a Result<T, E>, then result? has type T and means:
"If this Result is an error, then return it immediately. Otherwise, this evaluates to the success value and we continue."
This means that when skimming code, you can look for ? to spot the operations that can fail, which tend to be the "important" ones.
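The NFT functions above depend on types that are not shown, so here is a smaller self-contained sketch of the same shorthand using the standard library's string parsing. The function sum_of_two is hypothetical.

```rust
use std::num::ParseIntError;

// Hypothetical function: parses two strings as integers and adds them.
fn sum_of_two(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // Each `?` returns early with the error if parsing fails.
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    // The success value has to be wrapped in Ok on the way out.
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_of_two("2", "3"));
    println!("{:?}", sum_of_two("2", "three"));
}
```

Scanning for the two ? marks tells you immediately that parsing is the part of this function that can go wrong.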
One of the things that makes Rust hard to learn is understanding memory safety and references. You'll see lots of different types that represent different flavors of references, like:
String
mut String
&String
&mut String
Box<String>
RefCell<String>
Rc<String>
Arc<String>
Cow<String>
impl AsRef<String>
My advice for a reader is to just not worry about these. Just think of all of these as a string and move on. In languages like Java or TypeScript, most of these would just be String.
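To back up that advice, the sketch below passes several of these wrappers to one function that only cares about the text inside. The function shout and all the variable names are invented for this example.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical function that only needs to read some text.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let owned: String = String::from("hello");
    let borrowed: &String = &owned;
    let boxed: Box<String> = Box::new(String::from("hello"));
    let counted: Rc<String> = Rc::new(String::from("hello"));
    let shared: Arc<String> = Arc::new(String::from("hello"));

    // Every flavor can be handed to the same function. From the reader's
    // point of view, each of these is just a string.
    println!("{}", shout(&owned));
    println!("{}", shout(borrowed));
    println!("{}", shout(&boxed));
    println!("{}", shout(&counted));
    println!("{}", shout(&shared));
}
```

The differences between these types are about who owns the memory and how it is shared, which matters a great deal to the author and very little to someone following the logic.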
Traits are Rust's version of interfaces. The syntax to declare a trait looks like this:
trait Named {
    fn get_name(&self) -> String;
}
and to declare a struct that implements it:
struct User {
    name: String,
    age: i32,
}
impl Named for User {
    fn get_name(&self) -> String {
        // Return a copy: a String field cannot be moved out of a borrowed self.
        self.name.clone()
    }
}
Other than the syntax, there shouldn't be anything too surprising there. If you just think of traits as being the same as Java's interfaces, you'll be fine, although traits are actually more powerful in a couple of ways. Here's an example of one:
trait JsonSerializable {
    fn from_json(json: String) -> Self;
    fn to_json(&self) -> String;
}
Notice how the first function on this trait does not take self as an input and thus does not require that you already have an instance to call it. This allows this trait to specify that implementing types have functions to convert the type to JSON and back, which cannot be expressed with Java interfaces. In Java terms, we can think of this as an interface that enforces that implementing classes provide certain static methods.
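Here is a deliberately tiny sketch of a type implementing this trait. The Flag struct is hypothetical, and its parsing is a simplification that only recognizes the JSON literal true; a real program would most likely rely on a JSON library instead.

```rust
trait JsonSerializable {
    fn from_json(json: String) -> Self;
    fn to_json(&self) -> String;
}

// Hypothetical type, invented for illustration.
struct Flag {
    enabled: bool,
}

impl JsonSerializable for Flag {
    // No `self` parameter: called on the type, like a static method.
    fn from_json(json: String) -> Self {
        // Simplification: anything other than `true` counts as false.
        Flag { enabled: json.trim() == "true" }
    }

    // Takes `&self`: called on an existing value.
    fn to_json(&self) -> String {
        self.enabled.to_string()
    }
}

fn main() {
    // `Type::function(...)` is how functions without `self` are called.
    let flag = Flag::from_json("true".to_string());
    println!("{}", flag.to_json());
}
```

The double colon in Flag::from_json is the visual cue that a function is being called on a type rather than on a value.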
Once traits start showing up, you'll start to see generics show up too. Unlike interfaces in other languages, we can't just use the name of a trait in the same places we would use a concrete type. So for example if we have a trait called Yodeler, then we can't write this:
// Not allowed
fn perform_yodel_show(yodeler: Yodeler) {
    // ...
}
Instead, we need to declare a generic type parameter and give it a trait bound. We can do so in any of the following ways, which are all equivalent:
fn perform_yodel_show<T: Yodeler>(yodeler: T) {
    // ...
}
fn perform_yodel_show<T>(yodeler: T)
where T: Yodeler
{
    // ...
}
// This form is a shorthand that can't be used for more complex examples.
fn perform_yodel_show(yodeler: impl Yodeler) {
    // ...
}
For your purposes while reading, you can mentally replace any of these with the "not allowed" example from just above to match the way it would appear in other languages. The reason for this syntax has to do with Rust attempting to squeeze out every last drop of performance by avoiding dynamic dispatch unless explicitly requested. But again, as a reader you do not need to worry about that.
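For completeness, here is a sketch that fills in the pieces around one of those signatures. The yodel method, the Alpinist type, and the String return values are all assumptions added so the example can run. The second function shows the dyn keyword, which is how dynamic dispatch is requested explicitly.

```rust
// Hypothetical trait body: the `yodel` method is an assumption.
trait Yodeler {
    fn yodel(&self) -> String;
}

// Hypothetical implementing type.
struct Alpinist;

impl Yodeler for Alpinist {
    fn yodel(&self) -> String {
        "Yodel-ay-hee-hoo!".to_string()
    }
}

// Generic version: the compiler generates a copy for each concrete type.
fn perform_yodel_show<T: Yodeler>(yodeler: T) -> String {
    format!("Tonight: {}", yodeler.yodel())
}

// Dynamic dispatch version: `dyn` marks a trait used directly as a type.
fn perform_dyn_show(yodeler: &dyn Yodeler) -> String {
    format!("Tonight: {}", yodeler.yodel())
}

fn main() {
    println!("{}", perform_dyn_show(&Alpinist));
    println!("{}", perform_yodel_show(Alpinist));
}
```

Both functions read the same way: each takes something that can yodel.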
Rust has macros, which can programmatically generate code during compilation. You can tell that something is a macro because its name will end with !. A common one is the println! formatter:
println!("Hello, {}. Your score is {}", name, score);
Because println! is a macro, it can take variadic arguments and validate at compile time that the number of arguments matches the number of format placeholders {}.
You don't need to know much about macros other than that they exist, and that if you see something that looks like a function but whose name ends with !, then it's allowed to break the normal Rust rules with regard to syntax.
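A few other standard library macros come up constantly. The sketch below uses format!, which builds a String instead of printing, and vec!, which is called with square brackets. The greeting function and the values in main are made up for this example.

```rust
// Hypothetical function wrapping the `format!` macro.
fn greeting(name: &str, score: i32) -> String {
    // Same placeholder syntax as `println!`, but the text is returned.
    format!("Hello, {}. Your score is {}", name, score)
}

fn main() {
    // `vec!` uses square brackets, which no ordinary function call could.
    let scores = vec![42, 7, 9];
    println!("{}", greeting("Ferris", scores[0]));
    // `{:?}` asks for a debug-style rendering of the whole vector.
    println!("{:?}", scores);
}
```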
Sometimes you'll see things that look like type parameters, but lowercase and prefixed with a '. For example:
fn find_substr<'a, 'b>(haystack: &'a str, needle: &'b str) -> &'a str {
    // ...
}
If you see these, just pretend they don't exist. These are lifetime annotations, which inform the compiler how long it can assume these references point to valid memory. Nothing that concerns you as a reader, so act like you saw nothing.
Well okay. Let's say you wanted to learn something from the lifetime annotations. In the example above, we see that the returned string has the same lifetime as the first argument but not the second. This implies that the returned string is a reference into some part of the first argument. For a function with a name like find_substr, this probably means that the returned value is some particular substring inside the first argument. You can actually get a lot of information from lifetime annotations, but to do so you need a bit of knowledge of Rust's ownership model.
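As an illustration of that reading, here is one possible body for find_substr. The original leaves the body out, so this implementation is a guess, including the choice to return an empty string when nothing is found.

```rust
// One possible implementation; the behavior on a miss is an assumption.
fn find_substr<'a, 'b>(haystack: &'a str, needle: &'b str) -> &'a str {
    match haystack.find(needle) {
        // A slice of `haystack`, so it shares the lifetime 'a.
        Some(start) => &haystack[start..start + needle.len()],
        // Not found: an empty slice, still borrowed from `haystack`.
        None => &haystack[0..0],
    }
}

fn main() {
    let haystack = String::from("hello world");
    let found;
    {
        let needle = String::from("world");
        found = find_substr(&haystack, &needle);
    } // `needle` is gone here, but `found` only depends on `haystack`.
    println!("{}", found);
}
```

The inner block in main only compiles because the return type mentions 'a and not 'b: the result is allowed to outlive the needle, which matches the conclusion drawn from the signature alone.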