withoutboats/modules.md

## modules.md

      
This document is the result of conversations that came out of this tweet: https://twitter.com/withoutboats/status/814201265575981056
Rust's module system is too confusing

Empirically, many new users of Rust are confused by Rust's module system. This is unfortunate: Rust's module system is not particularly innovative or powerful; it is intended only to provide fairly standard privacy and namespacing support. Too much of new users' attention is being pulled by this system as it exists today.
This document presents a hypothesis about the cause of the confusion, and an attempt to mitigate that confusion by instituting a practice that is more similar to mainstream languages. We believe this problem is caused by the confluence of several well-motivated design decisions that have created a very unusual system, and the solution is to require fewer declarations by leveraging ambient information in a manner more similar to how other languages' module systems work.
Rust requires users to build an explicit module graph

Every language which supports some namespacing necessarily defines a graph of namespaces which are traversed as names are resolved to their canonical origin. Rust takes the uncommon approach of having users explicitly construct the canonical module tree with mod declarations before allowing them to create additional 'facade' paths with use and pub use.
It is more common for languages to implicitly take the canonical tree from the file system hierarchy, and only allow users to import names from other points in the tree into the local namespace with a statement equivalent to use.
Rust allows users to make much more complex statements about the module graph than many other languages. By default, canonical paths are private outside of their immediate module, and must be explicitly made public. It also allows creating totally artificial paths to names with re-exports. Any change should preserve this ability.
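
To make the difference between a canonical path and a facade path concrete, here is a minimal sketch in today's Rust. The module and function names (geometry, shapes, unit_square_area) are invented for illustration, and inline modules stand in for separate files so that the example fits in one file.

```rust
// Hypothetical crate layout, written with inline modules so it fits in one file.
mod geometry {
    // Canonical path of the function below:
    // crate::geometry::shapes::unit_square_area
    pub mod shapes {
        pub fn unit_square_area() -> u32 {
            1
        }

        // No `pub`: private, so only code inside `shapes` (and any modules
        // nested within it) can call this function.
        #[allow(dead_code)]
        fn internal_helper() -> u32 {
            0
        }
    }

    // Facade path: crate::geometry::unit_square_area now names the same function.
    pub use self::shapes::unit_square_area;
}

fn main() {
    let canonical = geometry::shapes::unit_square_area();
    let facade = geometry::unit_square_area();
    // Both paths resolve to the same item.
    assert_eq!(canonical, facade);
    println!("both paths give {}", facade);
}
```

Calling geometry::shapes::internal_helper() from main would be rejected by the compiler, which is the privacy half of what any change has to preserve.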
Some Rust source files are branch nodes in the module graph

In languages with an implicit canonical path tree, it is not uncommon to treat each file as a 'leaf' node of the graph; no other namespaces are within the namespace of any file without explicitly importing them. However, Rust does treat submodules as inherently in scope to their parent module.
Combined with the requirement to explicitly declare modules, this causes a mod declaration like mod foo; to behave similarly to how import foo; would work in many languages. This makes it very difficult for many users to understand what mod even does that is different from use, and it can feel arbitrary when mod is appropriate and when use is.
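
A small sketch of why the two keywords are easy to confuse. The names foo and greet are placeholders, and the inline module stands in for a file foo.rs that would normally be declared with mod foo;.

```rust
// An inline module standing in for a separate file `foo.rs`
// that the parent would declare with `mod foo;`.
mod foo {
    pub fn greet() -> &'static str {
        "hello from foo"
    }
}

// `use` creates no module; it only adds a local name for an item
// that already exists somewhere in the graph.
use foo::greet;

fn main() {
    // Works purely because of the `mod` declaration: the submodule `foo`
    // is already in scope in its parent, with no import needed.
    assert_eq!(foo::greet(), "hello from foo");

    // Works because of the `use` declaration above.
    assert_eq!(greet(), "hello from foo");
}
```

From the point of view of a newcomer, both declarations appear to do the same job: each one makes a name usable in the current file.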
Rust simply has too much syntax for modules

Rust has essentially four syntactic forms for dealing with the module graph:

- mod for declaring a submodule.
- extern crate for declaring an external node of the module graph.
- use for bringing names into the namespace of the local module.
- pub use for creating facade paths to names in the graph.

The distinctions between these forms are often subtle and difficult for new users to grasp. It would be advantageous for Rust to simply have less syntax.
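
For reference, the four forms can be seen side by side in one small file. This is only an illustrative sketch: the module names are invented, and core is used as the external crate because it ships with the compiler and needs no entry in Cargo.toml.

```rust
// 1. `extern crate`: attach an external crate to the module graph.
extern crate core;

// 2. `mod`: declare a submodule (inline here; `mod numbers;` would load a file).
mod numbers {
    pub mod limits {
        pub fn largest(a: u32, b: u32) -> u32 {
            // 3. `use`: bring a name into this scope only.
            use core::cmp::max;
            max(a, b)
        }
    }

    // 4. `pub use`: create a shorter facade path to the same function.
    pub use self::limits::largest;
}

fn main() {
    // The facade path and the canonical path reach the same function.
    assert_eq!(numbers::largest(3, 7), 7);
    assert_eq!(numbers::limits::largest(3, 7), 7);
}
```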
Steps to mitigate

Infer mod graph from source directory tree

Even after you understand Rust's module system, the requirement to make mod declarations can feel quite redundant, since the location of the file will be inferred from the filesystem. It makes a lot of sense to allow users to skip this step, which feels like bookkeeping at best and a totemic 'explicitness token' at worst.
When building the canonical paths, walk the entire directory

Instead of only following explicit mod directives while building canonical paths, the compiler will construct the tree from every file in the source directory that matches the existing Rust module tree naming scheme. Every file found will be added to the canonical module tree as if it had been declared with mod foo. All modules are parsed and walked by macros the same as ever (e.g. by the test harness).
However, if there is no explicit mod foo declaration, the foo module will be treated as an "implicit submodule," not an "explicit submodule."
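
As a rough illustration of the directory walk, the sketch below maps a file path to the canonical module path it would receive. It is not compiler code: the function name module_path is hypothetical, paths are assumed to be relative to the source root with / separators, and the naming scheme is assumed to be the existing one (foo.rs and foo/mod.rs both name module foo, and lib.rs or main.rs at the top level is the crate root). It does not check that each segment is a valid identifier.

```rust
/// Map a source file path (relative to the source root) to the canonical
/// module path it would occupy. Returns `None` for files that are not
/// Rust sources. An empty vector stands for the crate root.
fn module_path(relative: &str) -> Option<Vec<String>> {
    // Only `.rs` files take part in the module tree.
    let stem = relative.strip_suffix(".rs")?;
    let mut segments: Vec<String> = stem.split('/').map(String::from).collect();

    // `lib.rs` / `main.rs` at the top level are the root, not modules named
    // `lib` or `main`.
    let is_root_file =
        segments.len() == 1 && (segments[0] == "lib" || segments[0] == "main");

    // `foo/mod.rs` names the module `foo`, so the trailing `mod` is dropped.
    let is_mod_file = segments.last().map(String::as_str) == Some("mod");

    if is_root_file || is_mod_file {
        segments.pop();
    }
    Some(segments)
}

fn main() {
    let files = ["lib.rs", "foo.rs", "foo/bar.rs", "net/mod.rs", "README.md"];
    for file in files.iter() {
        println!("{:?} -> {:?}", file, module_path(file));
    }
}
```

Under the proposal, every path this kind of walk discovers would become an implicit submodule unless a matching mod declaration exists.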
Treat implicit submodules as not in their parent's namespace

Implicit submodules are not implicitly present in their parent's namespace. This way, there are no names that are in the parent module which are not explicitly declared.
Instead, if a user wishes to access names from implicit submodules, or make them public, they should use a use declaration:
// Instead of:
// mod foo;
// pub mod bar;

use self::foo;
pub use self::bar;
Discourage the use of bodiless mod.

Having done this, bodiless mod declarations will have no purpose. Attributes on modules can be attached at the top of the module file with the #! form.
Inline modules with bodies (as are commonly used for test modules, for example) will continue to exist, and mod will continue to be used for that purpose.
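
The #! form already works today. The sketch below uses an inline module with a body (the kind that would keep using mod) to show an inner attribute; the module name legacy and its functions are placeholders. In a multi-file crate the same attribute line would sit at the top of legacy.rs.

```rust
mod legacy {
    // Inner attribute: applies to the enclosing module, i.e. `legacy` itself.
    // It has the same effect as writing `#[allow(dead_code)]` on the
    // `mod legacy` declaration in the parent.
    #![allow(dead_code)]

    // Never called; the attribute above silences the dead-code warning.
    fn unused_helper() {}

    pub fn answer() -> u32 {
        42
    }
}

fn main() {
    assert_eq!(legacy::answer(), 42);
}
```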
Make extern crate unnecessary with cargo

When adding external dependencies, users using cargo need to declare them in two places: in their Cargo.toml file and in the root module of their crate. This is redundant work. Once again, this is either bookkeeping or an 'explicitness token.'
rustc infers extern crate from --extern flags

Instead, when rustc receives an argument of the form --extern NAME=PATH, NAME will implicitly be attached at the canonical root of the module graph, directed to that path. This name has the lowest precedence; it will be shadowed by any declared name in the root of the graph (including by an explicit extern crate declaration).
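
The precedence rule can be modelled with a simple two-step lookup. This is a toy model, not rustc's resolver: the type RootName, the function resolve_root, and the string values are all invented for illustration.

```rust
use std::collections::HashMap;

/// Where a name at the root of the module graph came from (toy model).
#[derive(Debug, PartialEq, Clone)]
enum RootName {
    /// Declared in the crate root: a module, an item, or an explicit `extern crate`.
    Declared(String),
    /// Supplied only by an `--extern NAME=PATH` flag.
    ImplicitExtern(String),
}

/// Resolve `name` at the root. Declared names are consulted first, so an
/// implicit extern name is used only when nothing shadows it.
fn resolve_root(
    name: &str,
    declared: &HashMap<String, String>,
    extern_flags: &HashMap<String, String>,
) -> Option<RootName> {
    if let Some(target) = declared.get(name) {
        return Some(RootName::Declared(target.clone()));
    }
    extern_flags
        .get(name)
        .map(|path| RootName::ImplicitExtern(path.clone()))
}

fn main() {
    let mut declared = HashMap::new();
    declared.insert("log".to_string(), "local module log".to_string());

    let mut extern_flags = HashMap::new();
    extern_flags.insert("log".to_string(), "liblog.rlib".to_string());
    extern_flags.insert("rand".to_string(), "librand.rlib".to_string());

    // `log` is declared locally, so the declaration shadows the flag.
    println!("{:?}", resolve_root("log", &declared, &extern_flags));
    // `rand` comes only from the flag, so the implicit name is used.
    println!("{:?}", resolve_root("rand", &declared, &extern_flags));
    // Unknown names resolve to nothing.
    println!("{:?}", resolve_root("serde", &declared, &extern_flags));
}
```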
Cargo supports alias attribute on dependency

Add an alias attribute to the dependency object in a Cargo.toml file. This attribute takes a string, and instead of passing the crate (with the --extern flag) by its actual name, cargo will present the crate to rustc using this alias. This makes extern crate .. as ..; unnecessary.
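
A sketch of how the proposed attribute might look in a manifest. The alias key is the proposal made here and should be read as hypothetical syntax; the crate name and version are placeholders.

```toml
[dependencies]
# Hypothetical: `alias` is the attribute proposed in this document.
# cargo would pass this crate to rustc as `--extern short=PATH`.
some-long-crate-name = { version = "1.0", alias = "short" }
```

Code in the crate would then refer to the dependency as short, with no extern crate some_long_crate_name as short; in the root.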
Result

The result of these changes is that mod and extern crate declarations would become unnecessary, leaving only use and its cousin pub use as namespace management syntax. The only exception to this would be inline submodules, for which the mod keyword is much more intuitive than with multiple files.