ia0/main.md

## main.md

      
    Raw
  

              main.md
            
          
    Writing down my mental model of unsafe

DEPRECATION WARNING: This document is now deprecated. The new version is available
here.
TLDR: All language constructs regarding unsafe Rust (including those that do not exist yet) can
be explained by a single concept: the proof type (see first section). Understanding this concept is
enough to understand unsafe.
I always assumed the content of this document to be obvious, but came to realize while discussing
with others that this was a wrong assumption. This document is thus for me to refer others to it
when I need to explain my mental model of unsafe.
The proof type

Rust doesn't have a formal proof system. Proofs and properties are written in natural language
(usually English). In my mental model, Rust provides a way to make those informal proofs and
properties more explicit in Rust code and types. This is done using the following type:
/// Mathematical proofs for a given set of properties.
///
/// Proofs have no impact on execution. They only matter for type-checking. In particular, proofs
/// are `Copy`, zero-sized, and do not participate in function ABI.
///
/// The properties are never explicitly written. They are always inferred from the safety section of
/// the documentation of the item in which the type is mentioned.
#[lang = "proof"]
pub struct Proof<Properties> {
    /// A proof of the set of properties.
    ///
    /// This field is never explicitly written. It is always inferred from the safety comment of the
    /// statement in which the struct is mentioned.
    pub proof: Properties,
}
In this example, a proof is taken as argument and a new proof is returned:
/// Combines small numbers into a bigger number.
///
/// # Safety
///
/// The proof parameter properties are:
/// - The absolute value of `x` is smaller than or equal to 100.
/// - The absolute value of `y` is smaller than or equal to 100.
///
/// The proof result property is:
/// - The absolute value of the result is smaller than or equal to 1000.
fn combine(x: i32, y: i32, p: Proof) -> (i32, Proof) {
    let r = 3 * x + 2 * y;
    // SAFETY: By `p`, we know that `x` and `y` are between -100 and 100. By arithmetic, we know
    // that `r` is between -500 and 500. By arithmetic, we conclude that `r` is between -1000 and
    // 1000.
    let q = Proof;
    (r, q)
}
Digression (feel free to skip)

Using documentation and comments instead of types and code is mostly for practical reasons.
It is impractical to write types like Proof<"# Safety\n\nThe absolute value of the result is smaller than or equal to 1000."> and code like Proof { proof: "// SAFETY: We know that arguments are small, so some small linear combination is only slightly bigger." }.
However, a consequence of this choice is that it looks like we get some form of type erasure, but
this is only in appearance. Even though the properties are erased from the types, the programmer
must know at runtime the properties of all proofs (both parameters and results). For example, having
a vector of function pointers like Vec<fn(Proof)> would need the programmer to know for each
element, what are the properties expected to hold to call the function, because they may differ from
one element to the other.
Vocabulary

I'll call pre-conditions (or requirements), proofs taken as parameter. And I'll call
post-conditions (or guarantees), proofs that are returned. In particular, when calling a
function, one has to prove the requirements and get back guarantees that can be used to prove future
requirements. Note that pre-conditions may refer to code that happens after the function. For
example, Pin::get_unchecked_mut() has a pre-condition that the result must not be moved, which is
a property about what happens after the function is called. Similarly, a post-condition may refer
to data that used to exist before the function was called. For example, slice::is_sorted() has a
post-condition that the input is sorted if the result is true (assuming the standard library wishes
to expose this post-condition for safety concerns).
I'll call invariants, proofs in any other position (neither parameter nor result of a function
type), for example struct fields and associated constants. Invariants on mutable data must be
updated at the same time the mutable data they refer to is updated. For proof fields, this is
similar to self-referential structs because the proof field references the other fields.
Link with dependent types (feel free to skip)

This mental model comes from dependent type programming languages like Agda and Coq. If the proof
type were formal, it would introduce a dependency from types on runtime values. Dependent types are
rather inconvenient to use with the current state of research, so avoiding them is a reasonable
decision for programming languages that wish to be widely usable. The proof type being informal, in
particular its properties being just informal text, it doesn't formally refer to runtime values thus
breaking this problematic dependency. But this informality is the reason why manual proofs and
manually checking them is required for the type system to be sound. This can be seen as a usability
trade-off: it's easier to write programs but it's harder to check their soundness.
How it relates to unsafe

You might see it coming, but I'll go through the steps for clarity and to avoid misunderstandings
due to subtle aspects. This section can be read in parallel with the documentation of unsafe keyword
in the standard library and the reference, to make
sure they describe the same concepts but through different approaches.
Unsafe functions or methods

Unsafe functions (or methods) are a syntactic sugar for functions taking a proof as their last
parameter. For example, the following function (omitting the documentation):
unsafe fn unchecked_add(x: i32, y: i32) -> i32;
desugars to the following function (the documentation with its safety section is preserved without
change):
fn unchecked_add(x: i32, y: i32, p: Proof) -> i32;
This is a function type translation, so it applies to all unsafe function types (i.e. within
function declarations, function definitions, function pointers, and so even when nested within
another type like Vec<unsafe fn()>).
Other unsafe superpowers

The book also describes 4 other unsafe superpowers:

Dereference a raw pointer
Access or modify a mutable static variable
Implement an unsafe trait
Access fields of unions

We'll assume those to also take a proof as part of the operation, where the properties are defined
by the operation. For example, dereferencing a raw pointer would require a proof that the pointer is
valid for that operation.
Unsafe blocks

Unsafe blocks are a syntactic sugar adding proof objects anywhere an unsafe superpower is used
within the block. For example, the following block (omitting the safety comment):
unsafe { ptr.add(start).read() }
desugars to the following expression (the safety comment is preserved without change and shared
between both proofs):
ptr.add(start, Proof).read(Proof)
Unsafe traits

Unsafe traits are a syntactic sugar for traits with an associated proof. For example, the following
unsafe trait (omitting the documentation):
unsafe trait TrustedLen: Iterator {}
desugars to the following trait (the documentation with its safety section is preserved without
change):
trait TrustedLen: Iterator {
    const P: Proof;
}
Note that the properties of associated proofs are not only about the type implementing the trait,
but also about the implementation of the (possibly inherited) methods, like in this example.
Unsafe impls

Unsafe impls are a syntactic sugar for impls with an associated proof. For example, the following
unsafe impl (omitting the safety comment):
unsafe impl TrustedLen for StepBy<Range<u32>> {}
desugars to the following impl (the safety comment is preserved without change):
impl TrustedLen for StepBy<Range<u32>> {
    const P: Proof = Proof;
}
What unsafe is missing from the model

In the previous section, we translated from unsafe constructs to the mental model. The reverse
translation is not always possible because the mental model is more expressive. In this section, we
will look at a few options to improve this reverse translation.
Why bother improving the reverse translation

Currently, entities that care about soundness wish to review all unsafe code. For those reviews to
be sustainable on code change, it is important to not trigger a review too often while still
triggering each time it is needed. The current approach is to trigger a review when a file
containing the unsafe keyword is modified. This is a good approximation in the sense that if files
with the unsafe keyword are kept small then reviews won't trigger too often. However, it may also
miss some reviews when unsafe code relies on properties outside files with the unsafe keyword.
This can happen for at least the following reasons (each described in their own sub-section below):

Unsafe code relies on the post-condition of a safe function.
Unsafe code relies on an invariant modified (and preserved) by safe functions.

Improving post-conditions

The following example would probably pass an unsafe review:
fn launder(money: Box<Money>) -> NonNull<Money> {
    let washed = Box::into_raw(money);
    // SAFETY: We know that `washed` is non-null because `Box::into_raw()` returns an aligned and
    // non-null pointer by its post-condition.
    unsafe { NonNull::new_unchecked(washed) }
}
However, if the fact that Box::into_raw() returns a non-null pointer was never meant to be a
safety post-condition, then it would be safe to change its behavior to return a null pointer thus
breaking unsafe code. If it was possible to correctly document which post-conditions are about
safety by writing them in the safety section of the documentation, then it would be clear which
properties may be relied on in unsafe code.
This is a contrived example because this function is in the standard library, and one may assume
that the standard library documentation can always be relied upon in unsafe code. However, some
popular crates have almost a "standard library" feel and may be similarly relied upon in unsafe
code. Some of those crates are meant to move to the standard library eventually, but this shows that
the frontier is not always clear and having a clear way to document and track safety post-conditions
could be beneficial.
That said, it is currently possible to simulate safety post-conditions using unsafe traits, but this
is a bit convoluted. For example, the following function with the mental model:
/// Writes a value to a reference and returns the previous value.
///
/// # Safety
///
/// The proof result will contain the following properties:
/// - The result value is the value within the reference before the call.
/// - The value within the reference after the call is the value parameter.
fn replace(ptr: &mut i32, val: i32) -> (i32, Proof) {
    let old = *ptr;
    *ptr = val;
    // SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
    // `val`.
    (old, Proof)
}
could be encoded like this:
/// Helper trait for replace.
///
/// # Safety
///
/// The method must uphold its contract.
unsafe trait ReplaceTrait {
    /// Writes a value to a reference and returns the previous value.
    fn replace(ptr: &mut i32, val: i32) -> i32;
}

/// Helper struct for replace.
///
/// Call using `<ReplaceImpl as ReplaceTrait>::replace(ptr, val)`.
struct ReplaceImpl;

// SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
// `val`.
unsafe impl ReplaceTrait for ReplaceImpl {
    fn replace(ptr: &mut i32, val: i32) -> i32 {
        let old = *ptr;
        *ptr = val;
        old
    }
}
One could instead imagine the following syntax:
/// Writes a value to a reference and returns the previous value.
///
/// # Safety
///
/// After calling this function:
/// - The result value is the value within the reference before the call.
/// - The value within the reference after the call is the value parameter.
fn replace(ptr: &mut i32, val: i32) -> unsafe i32 {
    let old = *ptr;
    *ptr = val;
    // SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
    // `val`.
    old
}
Note how the function type contains the unsafe keyword in an unusual position (it is not part of the
result type but part of the function arrow, see the last paragraph of this sub-section for a longer
discussion). This means that all exit points (the last expression, the return statements including
the ? operator, and the panic points) of the function must be preceded by a safety comment proving
the post-condition. How this could be enforced syntactically is not obvious, in particular for panic
points.
Unsafe code may thus rely on the post-conditions of such functions. And if the behavior of such
functions changes, tooling would detect that an unsafe-related function is being modified and
trigger an unsafe review. The unsafe reviewer would be responsible to make sure the safety
post-condition still holds.
The unusual position of the unsafe keyword in the function type could actually be understood as part
of the result type when it only refers to the function's result. Usually in those cases, it is
possible to create a new type wrapping the result type. This new type would have an invariant
related to the post-condition. A typical example would be Box::into_raw() which among others has a
post-condition that the result is non-null. This function could also return NonNull to encode that
part of the post-condition in the type system.
Improving invariants

Some data structures like Vec rely on an invariant to hold to prove unsafe code. The usual way to
handle this is to define a safety abstraction (e.g. a struct, a module, a file), define the
invariants that must hold within that abstraction, prove them correct by looking at everything under
that abstraction, and finally refer to those invariants when proving unsafe code. A simplified
example could look like this:
/// Vectors of at most 1024 bytes.
// Invariants:
// - The pointer points to an allocation of 1024 bytes and is owned.
// - The length is always smaller than or equal to 1024.
// - The pointed prefix of size length is initialized.
pub struct Vec1024 {
    ptr: *mut u8,
    len: usize,
}

impl Vec1024 {
    pub fn push(&mut self, x: u8) {
        assert!(self.len < 1024);
        // SAFETY: By the struct invariants and check above.
        unsafe { self.ptr.add(self.len).write(x) };
        // [ NOTE: This is a safe operation but needs a safety comment because it modifies and must
        //   preserve the struct invariants. ]
        // SAFETY: By the check above.
        self.len += 1;
    }
}
One could instead imagine a notion of unsafe structs like this:
/// Vectors of at most 1024 bytes.
//
// # Safety
//
// - The pointer points to an allocation of 1024 bytes and is owned.
// - The length is always smaller than or equal to 1024.
// - The pointed prefix of size length is initialized.
pub unsafe struct Vec1024 {
    ptr: *mut u8,
    len: usize,
}

impl Vec1024 {
    pub fn push(&mut self, x: u8) {
        assert!(self.len < 1024);
        // SAFETY: By the struct invariants and check above.
        unsafe { self.ptr.add(self.len).write(x) };
        // SAFETY: By the check above.
        unsafe { self.len += 1 };
    }
}
Notice that updating the length is now an unsafe operation because it modifies (the field of) an
unsafe struct. As such it comes with a safety comment proving the invariant is preserved. An unsafe
struct would desugar to the same struct without unsafe but with a proof field. In this case it would
be:
pub struct Vec1024 {
    ptr: *mut u8,
    len: usize,
    inv: Proof,
}
Unsafe blocks that modify such structs desugar by also modifying the proof. In this case, the length
update would be:
{ self.len += 1; self.inv = Proof };
Another way to see this is that modifying a struct with a proof field must be atomic: the whole
struct content must be updated simultaneously, thus including the proof field. This is how it would
work in dependent type programming languages.
This syntax could be extended to other data types like enums. Additionally, a notion of safe fields
could be added to avoid unsafe blocks to modify them. A safe field must not be referred to in the
safety section of the unsafe struct. This guarantees that their value do not alter the properties
and thus do not invalidate proofs of those properties. Another way to see this is that a safe field
comes with a safe function to set the field:
fn set_foo(&mut self, value: Foo) {
    // SAFETY: The proof is simply copied (and the invariant preserved) because the properties do
    // not mention foo, so the same proof proves them.
    *self = Self { foo: value, ..*self };
}
Recommendations

Lints

It is almost obligatory to enable unsafe-op-in-unsafe-fn which is allowed-by-default up to edition
2021 and warn-by-default starting from edition 2024. Not using this lint will:

give you a wrong mental model conflating unsafe in types (properties) and unsafe in code (proofs)
as described in the RFC,
conflict with undocumented-unsafe-blocks described below,
conflict with multiple-unsafe-ops-per-block described below.

The following lints will help unsafe review further:

undocumented-unsafe-blocks is the most important one. Without it, unsafe reviewers have to reverse
the invariants by reading all the code. Anything non-local should be avoided during reviews.
multiple-unsafe-ops-per-block is related but secondary. Without it, the safety comment may either
(best case scenario) be proving multiple unsafe superpowers being used at the same time resulting
in possible confusion, or (worst case scenario) forgetting to prove one of the unsafe superpowers
being used resulting in the same issue as if undocumented-unsafe-blocks was not enabled.

Finally, unused-unsafe (which is warn-by-default) is the other side of
multiple-unsafe-ops-per-block. Both together ensure that there is a one-to-one correspondence
between the usage of unsafe superpower and the safety comment proving its soundness, thus
simplifying unsafe reviews.
Acknowledgments

Thanks to the wg-formal-methods stream of the rust-lang Zulip chat for early feedback and to the
internals Rust forum for pointing out additional unsafe improvements.