@DanielKeep
Last active November 28, 2023 17:51
Rust's Modules are Weird

Note: this was written for rustdoc in order to take advantage of its ability to test code blocks. Unfortunately, it doesn't provide a way to test that a code block fails to compile (so all the "invalid" code blocks are marked ignore).

As a result, the code blocks were tested manually in external files. I've tried to make sure everything is correct, but cannot eliminate the possibility that the code will get out of sync with the prose. If you find any discrepancies, please let me know!

The fn main() that appears in the tested code blocks is also there to keep rustdoc happy.

Well, they are. Whilst there isn't anything really new about Rust's module system, it still managed to leave me confused and lost in the woods (metaphorically). This is a record of my current understanding in the hopes that it will help someone stuck in a similar position.

The way I think of modules in most cases is that each file is a neat little box all on its lonesome. If you want to talk about something, you have to bring it into that box, usually by importing it from another box. Mass imports aside, if you read a module's source file from top to bottom, you know about everything that's contained in the module.

The major problem with something like C's #include is that, aside from increasing compile times, header files are hideously leaky. Not only do you have zero control over what you export from a header file, not only are they impossible to contain in general, you can't even read them to see what they themselves can access; they can access anything that's previously been included by anything else. This makes attempting to bind, say, the Windows API a complete and utter nightmare, where one header file depends on the contents of another which it doesn't include, but which happens to be included by some of its users and not by others. Yeesh.

In a bizarre way, Rust's module system feels closer to #include than other module systems I've encountered. Modules in Rust are both neat little boxes and header files pretending to be neat little boxes. Allow me to elucidate.

The first point in my mental model of Rust's module system is this:

#1: A crate stems from a single root module.

(As a quick aside on "crate": if you haven't already internalised this, a crate is simply a Rust module which can be linked to by other crates. For example, there's the libstd library which is a crate that is the std module.)

This is different from, say, D where you can combine arbitrarily many modules in addition to the one containing your main function, with no requirement for them to even be visible from the "main module". With Rust, the root module is contained in whatever source file you hand to rustc; in the case of Cargo packages, src/lib.rs and each file in src/bin are root modules.

Since there is only one root module, every other module has to be a descendant of it. This brings in the second bit of slight skew from expectations. In other module systems, you specify which modules you want to access by asking the system to find them for you. For example, in Python:

# I want to access the module "stuff" which exists... somewhere...
import stuff

# The contents are... also somewhere...

This is not how Rust works. In Rust, you don't do this; you don't import a module, you define it.

// The "stuff" module exists **here** and **nowhere else**.
mod stuff {
	// And here's what it contains...
	pub fn blah() {}
}

fn main() {}

It's an incredibly important distinction. mod isn't a statement, it's a declaration. It's like struct or enum or fn: you aren't just introducing the name, you're also specifying the contents. Now, for the sake of sanity, you can also do this:

// src/lib.rs
// The "stuff" module exists **here** and **nowhere else**.
mod stuff;

// src/stuff.rs
// And here's what it contains...
pub fn blah() {}

These two are not just similar, they are (modulo byte-level representation) semantically identical. Thus, the second point is this:

#2: A crate's source files are just an organisational convenience for dividing up the root module's source file; nothing more.
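
One way to see that a module is just a branch of the root module's tree is the standard `module_path!` macro, which expands to the path of the module it is written in. The sketch below is only an illustration: the names `stuff`, `inner`, and `where_am_i` are hypothetical, and the leading crate name in the output depends on how the file is compiled.

```rust
// Hypothetical names throughout; only `module_path!` comes from the standard library.
mod stuff {
    // Expands to "<crate name>::stuff".
    pub fn where_am_i() -> &'static str {
        module_path!()
    }

    pub mod inner {
        // Expands to "<crate name>::stuff::inner".
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }
}

fn main() {
    println!("{}", stuff::where_am_i());
    println!("{}", stuff::inner::where_am_i());
}
```

Moving the body of `stuff` into its own file and writing `mod stuff;` instead should print exactly the same paths, since the file layout is not part of the module's identity.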

This also leads to:

#2a: A module does not exist alone, it exists in the context of the root module.

Which is where this example comes in:

fn f1() {}

mod a {
	fn f2() { f1() }
}

fn main() {}

Normally, I would expect this to work because f1 and f2 are in the same file, and f1 is at a wider scope than f2 is. But it doesn't. It fails with:

<anon>:8:19: 8:21 error: unresolved name `f1`.
<anon>:8         fn f2() { f1() }
                           ^~

This doesn't really make any sense until you realise that the above example is exactly equivalent to this:

// src/lib.rs
fn f1() {}

mod a;

// src/a.rs
fn f2() { f1() }

This looks like it should fail because there's nothing in a to make f1 visible. And that leads us to:

#2b: Each module gets its own, independent symbol table, even if they're in the same file.
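
A minimal sketch of what "independent symbol table" means in practice, using hypothetical names: the root module and `a` can each define an item called `name` without clashing, and an unqualified call inside `a` only ever sees `a`'s own.

```rust
// Hypothetical names; the root and `a` each own a separate `name`.
fn name() -> &'static str { "root" }

mod a {
    // A different item from the root's `name`, despite sharing a file.
    pub fn name() -> &'static str { "a" }

    // An unqualified call resolves inside `a`.
    pub fn local() -> &'static str { name() }

    // Reaching the root's `name` takes an explicit path.
    pub fn parent() -> &'static str { super::name() }
}

fn main() {
    // prints "root a root"
    println!("{} {} {}", name(), a::local(), a::parent());
}
```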

So although multi-file crates are really just a single, giant file in disguise (like with #includes), the scoping rules work as you would expect from a normal "neat little boxes" model. The solution to this is:

pub fn f1() {}

mod a {
	use f1;
	fn f2() { f1() }
}

fn main() {}

Or, alternately, you can do this:

pub fn f1() {}

mod a {
	fn f2() { ::f1() }
}

fn main() {}

But you cannot do this:

pub fn f1() {}

mod a {
	use ::f1;
	fn f2() { f1() }
	//        ^~
	// error: unresolved name `f1`.
}

fn main() {}

Wait, what? This brings me to the next point:

#3: There are two kinds of paths in Rust: use paths and non-use paths.

  • use paths are relative to the root module.
  • non-use paths are relative to the containing module.

(Though use-less paths is funnier.)

As a consequence of this, how you refer to a given item changes depending on whether you're use-ing it or not. As a quick summary:

  • For use paths:

    • By default, paths are relative to the root module of the crate.
    • Paths cannot begin with ::.
  • For non-use paths:

    • By default, paths are relative to the current module.
    • Paths that begin with :: are relative to the root module of the crate.
  • For both types:

    • Paths that begin with self:: are relative to the current module.
    • Paths that begin with super:: are relative to the parent of the current module.

This explains why, for example, you can have a use libc::c_void; statement work in any module if you have extern crate libc; in the root, but specifying a return type as libc::c_void only works in modules that contain an actual extern crate libc; statement.
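
The "for both types" rules in the summary can be sketched as follows. The module `inner` and the functions `f3`, `sum`, `sum_explicit`, and `via_grandparent` are hypothetical names added for illustration; the example sticks to `self::` and `super::` because those prefixes mean the same thing in `use` and non-`use` paths.

```rust
pub fn f1() -> u32 { 1 }

mod a {
    // `super::` in a `use` path: the parent of `a` is the root module.
    use super::f1;
    // `self::` in a `use` path: relative to `a` itself.
    use self::inner::f3;

    pub mod inner {
        pub fn f3() -> u32 { 3 }
        // `super::super::` climbs two levels, from `inner` up to the root.
        pub fn via_grandparent() -> u32 { super::super::f1() }
    }

    // Both names were brought into `a` by the `use` lines above.
    pub fn sum() -> u32 { f1() + f3() }

    // The same prefixes, written directly in non-`use` paths.
    pub fn sum_explicit() -> u32 { super::f1() + self::inner::f3() }
}

fn main() {
    println!("{} {}", a::sum(), a::sum_explicit());
    println!("{}", a::inner::via_grandparent());
}
```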

TL;DR

  1. A crate stems from a single root module.

  2. A crate's source files are just an organisational convenience for dividing up the root module's source file; nothing more.

    a. A module does not exist alone, it exists in the context of the root module.

    b. Each module gets its own, independent symbol table, even if they're in the same file.

  3. There are two kinds of paths in Rust: use paths and non-use paths. The former are relative to the root module, the latter to the current module.

All of the following paths (save those commented out) work:

pub fn f1() {}

mod a {
	use f1;
	use b::f4;
	pub fn f2() { f1() }
	fn f3() { f4() }
}

mod b {
	pub fn f4() { ::f1() }
	fn f5() { super::f1() }
	fn f6() { ::a::f2() }
	fn f7() { super::a::f2() }
	fn f8() { self::f4() }
	fn f9() { self::f5() }
}

fn f10() {
	a::f2();
	self::a::f2();
	// a::f3(); // error: function `f3` is private
	b::f4();
	// b::f5(); // error: function `f5` is private
}

fn main() {}
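
For readers on a recent toolchain, here is a hedged sketch of a variant that uses the `crate::` prefix, which is not covered in the text above. `crate::` names the crate's root module explicitly and can be written in both `use` and non-`use` paths. From the 2018 edition onward, `use` paths are no longer rooted at the crate root by default, so the bare `use f1;` and `::f1()` forms shown earlier are specific to the 2015 edition. The function names mirror the example above but return strings so the result can be checked.

```rust
pub fn f1() -> &'static str { "f1" }

mod a {
    // `crate::` in a `use` path names the root module explicitly.
    use crate::f1;
    pub fn f2() -> &'static str { f1() }
}

mod b {
    // `crate::` in non-`use` paths does the same.
    pub fn f4() -> &'static str { crate::f1() }
    pub fn f6() -> &'static str { crate::a::f2() }
}

fn main() {
    println!("{} {} {}", a::f2(), b::f4(), b::f6());
}
```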
@melston

melston commented Feb 29, 2016

I know this is older now but I just ran across it as I am just starting to look at Rust. I found a description of the module system that left me completely baffled and started searching for a better explanation and came across this. It has helped a lot.

There is one thing I would like to point out, however, that I think is a bit misleading in your description under #1. Specifically you point to the differences between something like 'import' in Python and 'mod' in Rust. This, I think, is comparing apples and oranges.

The problem is that import doesn't define anything. It makes something that has already been defined available in the current environment. It is much closer to 'extern crate' than it is to 'mod'. In this sense a Python package (which can be imported) is very close to a Rust crate.

So, either I have entirely missed your point (which is possible) or your example doesn't really show what you would like it to show.

Mark

@turnage

turnage commented Nov 17, 2016

Thank you for making this post!

@pballandras

This might be the best thing I saw in 2020

@HarrisonHemstreet

great job

@rebekah

rebekah commented Nov 28, 2023

I ran across this early in my Rust adventure, and glad I did... I had no idea regarding these things. I expect it will make my journey much easier.
