Note: this was written for rustdoc in order to take advantage of its ability to test code blocks. Unfortunately, it doesn't provide a way to test that a code block fails to compile (so all the "invalid" code blocks are marked ignore). As a result, the code blocks were tested manually in external files. I've tried to make sure everything is correct, but cannot eliminate the possibility that the code will get out of sync with the prose. If you find any discrepancies, please let me know!
The fn main() that appears in tests is also there to keep rustdoc happy.
Well, they are. Whilst there isn't anything really new about Rust's module system, it still managed to leave me confused and lost in the woods (metaphorically). This is a record of my current understanding in the hopes that it will help someone stuck in a similar position.
The way I think of modules in most cases is that each file is a neat little box all on its lonesome. If you want to talk about something, you have to bring it into that box, usually by importing it from another box. Mass imports aside, if you read a module's source file from top to bottom, you know about everything that's contained in the module.
The major problem with something like C's #include is that, aside from increasing compile times, header files are hideously leaky. Not only do you have zero control over what you export from a header file, not only are they impossible to contain in general, you can't read them to see what they themselves can access; they can access anything that's previously been included by anything else. This makes attempting to bind, say, the Windows API a complete and utter nightmare, where one header file depends on the contents of another which it doesn't include, but which happens to be included by one of its users, but not others. Yeesh.
In a bizarre way, Rust's module system feels closer to #include than other module systems I've encountered. Modules in Rust are both neat little boxes and header files pretending to be neat little boxes. Allow me to elucidate.
The first point in my mental model of Rust's module system is this:
#1: A crate stems from a single root module.
(As a quick aside on "crate": if you haven't already internalised this, a crate is simply a Rust module which can be linked to by other crates. For example, there's the libstd library, which is a crate that is the std module.)
This is different from, say, D, where you can combine arbitrarily many modules in addition to the one containing your main function, with no requirement for them to even be visible from the "main module". With Rust, the root module is contained in whatever source file you hand to rustc; in the case of Cargo packages, src/lib.rs and each file in src/bin are root modules.
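To make the "single root module" idea a bit more concrete, here is a minimal sketch. The names outer, inner, where_am_i and root_path are made up for illustration; the only real facility used is the standard module_path!() macro, which expands to the path of the enclosing module starting from the crate root.

```rust
// Hypothetical module names; everything hangs off the one root module.
mod outer {
    pub mod inner {
        // Expands to the full path of this module, starting at the crate root.
        pub fn where_am_i() -> &'static str {
            module_path!()
        }
    }

    pub fn where_am_i() -> &'static str {
        module_path!()
    }
}

// Defined directly in the root module.
fn root_path() -> &'static str {
    module_path!()
}

fn main() {
    println!("{}", root_path());
    println!("{}", outer::where_am_i());
    println!("{}", outer::inner::where_am_i());
}
```

The exact strings depend on the crate name, so no expected output is listed, but each nested module's path should begin with whatever root_path() reports.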
Since there is only one root module, every other module has to be a descendant of it. This brings in the second bit of slight skew from expectations. In other module systems, you specify which modules you want to access by asking the system to find them for you. For example, in Python:
# I want to access the module "stuff" which exists... somewhere...
import stuff
# The contents are... also somewhere...
This is not how Rust works. In Rust, you don't do this; you don't import a module, you define it.
// The "stuff" module exists **here** and **nowhere else**.
mod stuff {
    // And here's what it contains...
    pub fn blah() {}
}
fn main() {}
It's an incredibly important distinction. mod isn't a statement, it's a declaration. It's like struct or enum or fn: you aren't just introducing the name, you're also specifying the contents. Now, for the sake of sanity, you can also do this:
// src/lib.rs
// The "stuff" module exists **here** and **nowhere else**.
mod stuff;
// src/stuff.rs
// And here's what it contains...
pub fn blah() {}
These two are not just similar, they are (modulo byte-level representation) semantically identical. Thus, the second point is this:
#2: A crate's source files are just an organisational convenience for dividing up the root module's source file; nothing more.
This also leads to:
#2a: A module does not exist alone, it exists in the context of the root module.
Which is where this example comes in:
fn f1() {}
mod a {
    fn f2() { f1() }
}
fn main() {}
Normally, I would expect this to work because f1 and f2 are in the same file, and f1 is at a wider scope than f2 is. But it doesn't. It fails with:
<anon>:8:19: 8:21 error: unresolved name `f1`.
<anon>:8         fn f2() { f1() }
                           ^~
This doesn't really make any sense until you realise that the above example is exactly equivalent to this:
// src/lib.rs
fn f1() {}
mod a;
// src/a.rs
fn f2() { f1() }
This looks like it should fail because there's nothing in module a to make f1 visible. And that leads us to:
#2b: Each module gets its own, independent symbol table, even if they're in the same file.
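As a small sketch of what "independent symbol table" means in practice (all names here are hypothetical), a nested module can reuse a name from the root without any conflict, and an unqualified name inside it is looked up only in that module. Reaching the root's item takes an explicit path such as super::.

```rust
// Hypothetical names throughout; this only illustrates name lookup.
fn describe() -> &'static str {
    "root"
}

mod a {
    // Same name as the root function, yet no conflict: `a` has its own table.
    pub fn describe() -> &'static str {
        "a"
    }

    // An unqualified name is resolved inside `a`, so this finds `a::describe`.
    pub fn local() -> &'static str {
        describe()
    }

    // The root's item has to be named explicitly via the parent module.
    pub fn from_parent() -> &'static str {
        super::describe()
    }
}

fn main() {
    // prints "root a root"
    println!("{} {} {}", describe(), a::local(), a::from_parent());
}
```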
So although multi-file crates are really just a single, giant file in disguise (like with #includes), the scoping rules work as you would expect from a normal "neat little boxes" model. The solution to this is:
pub fn f1() {}
mod a {
    use f1;
    fn f2() { f1() }
}
fn main() {}
Or, alternately, you can do this:
pub fn f1() {}
mod a {
    fn f2() { ::f1() }
}
fn main() {}
But you cannot do this:
pub fn f1() {}
mod a {
    use ::f1;
    fn f2() { f1() }
    //        ^~
    //        error: unresolved name `f1`.
}
fn main() {}
Wait, what? This brings me to the next point:
#3: There are two kinds of paths in Rust: use paths and non-use paths.
- use paths are relative to the root module.
- non-use paths are relative to the containing module.
(Though use-less paths is funnier.)
As a consequence of this, how you refer to a given item changes depending on whether you're use-ing it or not. As a quick summary:
- For use paths:
  - By default, paths are relative to the root module of the crate.
  - Paths cannot begin with ::.
- For non-use paths:
  - By default, paths are relative to the current module.
  - Paths that begin with :: are relative to the root module of the crate.
- For both types:
  - Paths that begin with self:: are relative to the current module.
  - Paths that begin with super:: are relative to the parent of the current module.
This explains why, for example, you can have a use libc::c_void; statement work in any module if you have extern crate libc; in the root, but specifying a return type as libc::c_void only works in modules that contain an actual extern crate libc; statement.
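The prefix rules above can be sketched with explicit prefixes only. This is a hedged illustration, not a restatement of the examples in this post: it assumes a toolchain that supports the crate:: prefix (the 2018 edition and later do), which names the root module explicitly and plays the role that a leading :: plays in the snippets here. The function and module names are invented.

```rust
// Lives in the root module. All names below are hypothetical.
pub fn greeting() -> &'static str {
    "hello from the root"
}

mod a {
    // `crate::` starts the path at the root module.
    use crate::greeting;

    pub fn via_use() -> &'static str {
        greeting()
    }

    pub mod nested {
        // `super::` starts the path at the parent module, here `a`.
        pub fn via_super() -> &'static str {
            super::via_use()
        }

        fn helper() -> &'static str {
            crate::greeting()
        }

        // `self::` starts the path at the current module, here `a::nested`.
        pub fn via_self() -> &'static str {
            self::helper()
        }
    }
}

fn main() {
    println!("{}", a::via_use());
    println!("{}", a::nested::via_super());
    println!("{}", a::nested::via_self());
}
```

All three calls end up at the same root function; only the starting point of the path differs.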
1. A crate stems from a single root module.
2. A crate's source files are just an organisational convenience for dividing up the root module's source file; nothing more.
   a. A module does not exist alone, it exists in the context of the root module.
   b. Each module gets its own, independent symbol table, even if they're in the same file.
3. There are two kinds of paths in Rust: use paths and non-use paths. The former are relative to the root module, the latter to the current module.
All of the following paths (save those commented out) work:
pub fn f1() {}
mod a {
    use f1;
    use b::f4;
    pub fn f2() { f1() }
    fn f3() { f4() }
}
mod b {
    pub fn f4() { ::f1() }
    fn f5() { super::f1() }
    fn f6() { ::a::f2() }
    fn f7() { super::a::f2() }
    fn f8() { self::f4() }
    fn f9() { self::f5() }
}
fn f10() {
    a::f2();
    self::a::f2();
    // a::f3(); // error: function `f3` is private
    b::f4();
    // b::f5(); // error: function `f5` is private
}
fn main() {}
I know this is older now but I just ran across it as I am just starting to look at Rust. I found a description of the module system that left me completely baffled and started searching for a better explanation and came across this. It has helped a lot.
There is one thing I would like to point out, however, that I think is a bit misleading in your description under #1. Specifically you point to the differences between something like 'import' in Python and 'mod' in Rust. This, I think, is comparing apples and oranges.
The problem is that import doesn't define anything. It makes something that has already been defined available in the current environment. It is much closer to 'extern crate' than it is to 'mod'. In this sense a Python package (which can be imported) is very close to a Rust crate.
So, either I have entirely missed your point (which is possible) or your example doesn't really show what you would like it to show.
Mark