Skip to content

Instantly share code, notes, and snippets.

@ia0
Last active March 17, 2024 16:08
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ia0/820ab50d4c5f0f5e3aeb841cef8e6792 to your computer and use it in GitHub Desktop.
Save ia0/820ab50d4c5f0f5e3aeb841cef8e6792 to your computer and use it in GitHub Desktop.
Writing down my mental model of unsafe

Writing down my mental model of unsafe

DEPRECATION WARNING: This document is now deprecated. The new version is available here.

TLDR: All language constructs regarding unsafe Rust (including those that do not exist yet) can be explained by a single concept: the proof type (see first section). Understanding this concept is enough to understand unsafe.

I always assumed the content of this document to be obvious, but came to realize while discussing with others that this was a wrong assumption. This document is thus for me to refer others to it when I need to explain my mental model of unsafe.

The proof type

Rust doesn't have a formal proof system. Proofs and properties are written in natural language (usually English). In my mental model, Rust provides a way to make those informal proofs and properties more explicit in Rust code and types. This is done using the following type:

/// Mathematical proofs for a given set of properties.
///
/// Proofs have no impact on execution. They only matter for type-checking. In particular, proofs
/// are `Copy`, zero-sized, and do not participate in function ABI.
///
/// The properties are never explicitly written. They are always inferred from the safety section of
/// the documentation of the item in which the type is mentioned.
#[lang = "proof"]
pub struct Proof<Properties> {
    /// A proof of the set of properties.
    ///
    /// This field is never explicitly written. It is always inferred from the safety comment of the
    /// statement in which the struct is mentioned.
    pub proof: Properties,
}

In this example, a proof is taken as argument and a new proof is returned:

/// Combines small numbers into a bigger number.
///
/// # Safety
///
/// The proof parameter properties are:
/// - The absolute value of `x` is smaller than or equal to 100.
/// - The absolute value of `y` is smaller than or equal to 100.
///
/// The proof result property is:
/// - The absolute value of the result is smaller than or equal to 1000.
fn combine(x: i32, y: i32, p: Proof) -> (i32, Proof) {
    let r = 3 * x + 2 * y;
    // SAFETY: By `p`, we know that `x` and `y` are between -100 and 100. By arithmetic, we know
    // that `r` is between -500 and 500. By arithmetic, we conclude that `r` is between -1000 and
    // 1000.
    let q = Proof;
    (r, q)
}

Digression (feel free to skip)

Using documentation and comments instead of types and code is mostly for practical reasons.

It is impractical to write types like Proof<"# Safety\n\nThe absolute value of the result is smaller than or equal to 1000."> and code like Proof { proof: "// SAFETY: We know that arguments are small, so some small linear combination is only slightly bigger." }.

However, a consequence of this choice is that it looks like we get some form of type erasure, but this is only in appearance. Even though the properties are erased from the types, the programmer must know at runtime the properties of all proofs (both parameters and results). For example, having a vector of function pointers like Vec<fn(Proof)> would need the programmer to know for each element, what are the properties expected to hold to call the function, because they may differ from one element to the other.

Vocabulary

I'll call pre-conditions (or requirements), proofs taken as parameter. And I'll call post-conditions (or guarantees), proofs that are returned. In particular, when calling a function, one has to prove the requirements and get back guarantees that can be used to prove future requirements. Note that pre-conditions may refer to code that happens after the function. For example, Pin::get_unchecked_mut() has a pre-condition that the result must not be moved, which is a property about what happens after the function is called. Similarly, a post-condition may refer to data that used to exist before the function was called. For example, slice::is_sorted() has a post-condition that the input is sorted if the result is true (assuming the standard library wishes to expose this post-condition for safety concerns).

I'll call invariants, proofs in any other position (neither parameter nor result of a function type), for example struct fields and associated constants. Invariants on mutable data must be updated at the same time the mutable data they refer to is updated. For proof fields, this is similar to self-referential structs because the proof field references the other fields.

Link with dependent types (feel free to skip)

This mental model comes from dependent type programming languages like Agda and Coq. If the proof type were formal, it would introduce a dependency from types on runtime values. Dependent types are rather inconvenient to use with the current state of research, so avoiding them is a reasonable decision for programming languages that wish to be widely usable. The proof type being informal, in particular its properties being just informal text, it doesn't formally refer to runtime values thus breaking this problematic dependency. But this informality is the reason why manual proofs and manually checking them is required for the type system to be sound. This can be seen as a usability trade-off: it's easier to write programs but it's harder to check their soundness.

How it relates to unsafe

You might see it coming, but I'll go through the steps for clarity and to avoid misunderstandings due to subtle aspects. This section can be read in parallel with the documentation of unsafe keyword in the standard library and the reference, to make sure they describe the same concepts but through different approaches.

Unsafe functions or methods

Unsafe functions (or methods) are a syntactic sugar for functions taking a proof as their last parameter. For example, the following function (omitting the documentation):

unsafe fn unchecked_add(x: i32, y: i32) -> i32;

desugars to the following function (the documentation with its safety section is preserved without change):

fn unchecked_add(x: i32, y: i32, p: Proof) -> i32;

This is a function type translation, so it applies to all unsafe function types (i.e. within function declarations, function definitions, function pointers, and so even when nested within another type like Vec<unsafe fn()>).

Other unsafe superpowers

The book also describes 4 other unsafe superpowers:

  • Dereference a raw pointer
  • Access or modify a mutable static variable
  • Implement an unsafe trait
  • Access fields of unions

We'll assume those to also take a proof as part of the operation, where the properties are defined by the operation. For example, dereferencing a raw pointer would require a proof that the pointer is valid for that operation.

Unsafe blocks

Unsafe blocks are a syntactic sugar adding proof objects anywhere an unsafe superpower is used within the block. For example, the following block (omitting the safety comment):

unsafe { ptr.add(start).read() }

desugars to the following expression (the safety comment is preserved without change and shared between both proofs):

ptr.add(start, Proof).read(Proof)

Unsafe traits

Unsafe traits are a syntactic sugar for traits with an associated proof. For example, the following unsafe trait (omitting the documentation):

unsafe trait TrustedLen: Iterator {}

desugars to the following trait (the documentation with its safety section is preserved without change):

trait TrustedLen: Iterator {
    const P: Proof;
}

Note that the properties of associated proofs are not only about the type implementing the trait, but also about the implementation of the (possibly inherited) methods, like in this example.

Unsafe impls

Unsafe impls are a syntactic sugar for impls with an associated proof. For example, the following unsafe impl (omitting the safety comment):

unsafe impl TrustedLen for StepBy<Range<u32>> {}

desugars to the following impl (the safety comment is preserved without change):

impl TrustedLen for StepBy<Range<u32>> {
    const P: Proof = Proof;
}

What unsafe is missing from the model

In the previous section, we translated from unsafe constructs to the mental model. The reverse translation is not always possible because the mental model is more expressive. In this section, we will look at a few options to improve this reverse translation.

Why bother improving the reverse translation

Currently, entities that care about soundness wish to review all unsafe code. For those reviews to be sustainable on code change, it is important to not trigger a review too often while still triggering each time it is needed. The current approach is to trigger a review when a file containing the unsafe keyword is modified. This is a good approximation in the sense that if files with the unsafe keyword are kept small then reviews won't trigger too often. However, it may also miss some reviews when unsafe code relies on properties outside files with the unsafe keyword.

This can happen for at least the following reasons (each described in their own sub-section below):

  • Unsafe code relies on the post-condition of a safe function.
  • Unsafe code relies on an invariant modified (and preserved) by safe functions.

Improving post-conditions

The following example would probably pass an unsafe review:

fn launder(money: Box<Money>) -> NonNull<Money> {
    let washed = Box::into_raw(money);
    // SAFETY: We know that `washed` is non-null because `Box::into_raw()` returns an aligned and
    // non-null pointer by its post-condition.
    unsafe { NonNull::new_unchecked(washed) }
}

However, if the fact that Box::into_raw() returns a non-null pointer was never meant to be a safety post-condition, then it would be safe to change its behavior to return a null pointer thus breaking unsafe code. If it was possible to correctly document which post-conditions are about safety by writing them in the safety section of the documentation, then it would be clear which properties may be relied on in unsafe code.

This is a contrived example because this function is in the standard library, and one may assume that the standard library documentation can always be relied upon in unsafe code. However, some popular crates have almost a "standard library" feel and may be similarly relied upon in unsafe code. Some of those crates are meant to move to the standard library eventually, but this shows that the frontier is not always clear and having a clear way to document and track safety post-conditions could be beneficial.

That said, it is currently possible to simulate safety post-conditions using unsafe traits, but this is a bit convoluted. For example, the following function with the mental model:

/// Writes a value to a reference and returns the previous value.
///
/// # Safety
///
/// The proof result will contain the following properties:
/// - The result value is the value within the reference before the call.
/// - The value within the reference after the call is the value parameter.
fn replace(ptr: &mut i32, val: i32) -> (i32, Proof) {
    let old = *ptr;
    *ptr = val;
    // SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
    // `val`.
    (old, Proof)
}

could be encoded like this:

/// Helper trait for replace.
///
/// # Safety
///
/// The method must uphold its contract.
unsafe trait ReplaceTrait {
    /// Writes a value to a reference and returns the previous value.
    fn replace(ptr: &mut i32, val: i32) -> i32;
}

/// Helper struct for replace.
///
/// Call using `<ReplaceImpl as ReplaceTrait>::replace(ptr, val)`.
struct ReplaceImpl;

// SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
// `val`.
unsafe impl ReplaceTrait for ReplaceImpl {
    fn replace(ptr: &mut i32, val: i32) -> i32 {
        let old = *ptr;
        *ptr = val;
        old
    }
}

One could instead imagine the following syntax:

/// Writes a value to a reference and returns the previous value.
///
/// # Safety
///
/// After calling this function:
/// - The result value is the value within the reference before the call.
/// - The value within the reference after the call is the value parameter.
fn replace(ptr: &mut i32, val: i32) -> unsafe i32 {
    let old = *ptr;
    *ptr = val;
    // SAFETY: The initial value of `ptr` is saved in `old` and we unconditionally overwrite it with
    // `val`.
    old
}

Note how the function type contains the unsafe keyword in an unusual position (it is not part of the result type but part of the function arrow, see the last paragraph of this sub-section for a longer discussion). This means that all exit points (the last expression, the return statements including the ? operator, and the panic points) of the function must be preceded by a safety comment proving the post-condition. How this could be enforced syntactically is not obvious, in particular for panic points.

Unsafe code may thus rely on the post-conditions of such functions. And if the behavior of such functions changes, tooling would detect that an unsafe-related function is being modified and trigger an unsafe review. The unsafe reviewer would be responsible to make sure the safety post-condition still holds.

The unusual position of the unsafe keyword in the function type could actually be understood as part of the result type when it only refers to the function's result. Usually in those cases, it is possible to create a new type wrapping the result type. This new type would have an invariant related to the post-condition. A typical example would be Box::into_raw() which among others has a post-condition that the result is non-null. This function could also return NonNull to encode that part of the post-condition in the type system.

Improving invariants

Some data structures like Vec rely on an invariant to hold to prove unsafe code. The usual way to handle this is to define a safety abstraction (e.g. a struct, a module, a file), define the invariants that must hold within that abstraction, prove them correct by looking at everything under that abstraction, and finally refer to those invariants when proving unsafe code. A simplified example could look like this:

/// Vectors of at most 1024 bytes.
// Invariants:
// - The pointer points to an allocation of 1024 bytes and is owned.
// - The length is always smaller than or equal to 1024.
// - The pointed prefix of size length is initialized.
pub struct Vec1024 {
    ptr: *mut u8,
    len: usize,
}

impl Vec1024 {
    pub fn push(&mut self, x: u8) {
        assert!(self.len < 1024);
        // SAFETY: By the struct invariants and check above.
        unsafe { self.ptr.add(self.len).write(x) };
        // [ NOTE: This is a safe operation but needs a safety comment because it modifies and must
        //   preserve the struct invariants. ]
        // SAFETY: By the check above.
        self.len += 1;
    }
}

One could instead imagine a notion of unsafe structs like this:

/// Vectors of at most 1024 bytes.
//
// # Safety
//
// - The pointer points to an allocation of 1024 bytes and is owned.
// - The length is always smaller than or equal to 1024.
// - The pointed prefix of size length is initialized.
pub unsafe struct Vec1024 {
    ptr: *mut u8,
    len: usize,
}

impl Vec1024 {
    pub fn push(&mut self, x: u8) {
        assert!(self.len < 1024);
        // SAFETY: By the struct invariants and check above.
        unsafe { self.ptr.add(self.len).write(x) };
        // SAFETY: By the check above.
        unsafe { self.len += 1 };
    }
}

Notice that updating the length is now an unsafe operation because it modifies (the field of) an unsafe struct. As such it comes with a safety comment proving the invariant is preserved. An unsafe struct would desugar to the same struct without unsafe but with a proof field. In this case it would be:

pub struct Vec1024 {
    ptr: *mut u8,
    len: usize,
    inv: Proof,
}

Unsafe blocks that modify such structs desugar by also modifying the proof. In this case, the length update would be:

{ self.len += 1; self.inv = Proof };

Another way to see this is that modifying a struct with a proof field must be atomic: the whole struct content must be updated simultaneously, thus including the proof field. This is how it would work in dependent type programming languages.

This syntax could be extended to other data types like enums. Additionally, a notion of safe fields could be added to avoid unsafe blocks to modify them. A safe field must not be referred to in the safety section of the unsafe struct. This guarantees that their value do not alter the properties and thus do not invalidate proofs of those properties. Another way to see this is that a safe field comes with a safe function to set the field:

fn set_foo(&mut self, value: Foo) {
    // SAFETY: The proof is simply copied (and the invariant preserved) because the properties do
    // not mention foo, so the same proof proves them.
    *self = Self { foo: value, ..*self };
}

Recommendations

Lints

It is almost obligatory to enable unsafe-op-in-unsafe-fn which is allowed-by-default up to edition 2021 and warn-by-default starting from edition 2024. Not using this lint will:

The following lints will help unsafe review further:

  • undocumented-unsafe-blocks is the most important one. Without it, unsafe reviewers have to reverse the invariants by reading all the code. Anything non-local should be avoided during reviews.
  • multiple-unsafe-ops-per-block is related but secondary. Without it, the safety comment may either (best case scenario) be proving multiple unsafe superpowers being used at the same time resulting in possible confusion, or (worst case scenario) forgetting to prove one of the unsafe superpowers being used resulting in the same issue as if undocumented-unsafe-blocks was not enabled.

Finally, unused-unsafe (which is warn-by-default) is the other side of multiple-unsafe-ops-per-block. Both together ensure that there is a one-to-one correspondence between the usage of unsafe superpower and the safety comment proving its soundness, thus simplifying unsafe reviews.

Acknowledgments

Thanks to the wg-formal-methods stream of the rust-lang Zulip chat for early feedback and to the internals Rust forum for pointing out additional unsafe improvements.

@JarredAllen
Copy link

Clippy lint that I like that's relevant to this: multiple_unsafe_ops_per_block. It does exactly what you're asking for in "improving unsafe granularity", limiting each unsafe block to contain at most one unsafe operation.

I always turn it on when I'm writing new unsafe code, just to make sure that nothing slips past without me realizing that it's unsafe.

@ia0
Copy link
Author

ia0 commented Feb 24, 2024

Thanks a lot @JarredAllen! I wasn't aware of this lint and it's exactly what I was describing. I've updated the document (in revision 9) with a recommendation regarding lints simplifying unsafe reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment