Skip to content

Instantly share code, notes, and snippets.

@ubsan ubsan/alias.md Secret
Last active Aug 3, 2017

Embed
What would you like to do?

Aliasing

Aliasing is fundamentally about if a binding is allowed to "see" a mutation from another binding; if the mutation of the referred to object of one binding can change the object referred to by a different binding.

First, some definitions:

  • location => the place where an object is
  • binding => a reference, a pointer, or an identifier which refers to an object
    • safe binding: locals, references which are not shared references to UnsafeCell, custom standard library pointers like Box and Rc; the magic wording is due to Unique and Shared being magic. There needs to be a way to define reference-like types in libraries, but that's the scope of the language, not of the formalization.
    • raw binding: raw pointers
    • &(object containing UnsafeCell) is a special case - the parts of the object which are not contained in an UnsafeCell are treated as a safe binding, the parts which are are treated as a raw binding. This is not true of &mut (object containing UnsafeCell), which is always a safe binding. This same dichotomy is extended to Unique types and Shared types - Unique is treated like &mut, Shared is treated like &.

The aliasing relation is as follows:

a $ b where a and b are objects: accesses through a "see" accesses through b; another way of wording this is that b is allowed to alias a.

a $$ b where a and b are objects: this is equivalent to a $ b, but the relation only exists for raw pointers. $$ is the "raw alias" relation.

Notes:

  • Many of the operators are unchecked for type safety in this where it is unnecessary; as does no checking, for example, under this formalism. It's not important for what this paper sets out to do.
  • This model does not make a distinction between objects and values. This is an extremely important distinction, but is not necessary for this paper and only adds to the difficulty of reading it.
-- These are implicitly defined; outside the scope of this paper
data RustType: Type where
data Object: RustType -> Type where

infixl 5 $
infixl 5 $$

-- one can think of `($)` as a sort of reverse "derived-from" bound
-- aka "Alias"
data ($): (a: Object ta) -> (b: Object tb) -> Type where
  Reflexive: (a: Object ta) -> a $ a
  Transitive: a $ b -> b $ c -> a $ c
  -- other, nontrivial constructors exist

-- aka "RawAlias"
data ($$): (a: Object ta) -> (b: Object tb) -> Type where
  Alias: a $ b -> a $$ b
  Sibling: a $$ b -> a $$ c -> b $$ c

SwapRawAlias: a $$ b -> b $$ a
SwapRawAlias (Alias (Reflexive c)) = Alias (Reflexive c)
SwapRawAlias (Alias (Transitive ra rb)) =
  let ra' = SwapRawAlias (Alias ra) in
  let rb' = Alias rb in
  Sibling rb' ra'
SwapRawAlias (Sibling x y) = Sibling y x

From this definition, you can see that $ is a partial order, while $$ is an equivalence relation.

An object may Alias another object if they are the same object, or if the other object is derived-from the first. There are a few ways to do this in Rust:

-- x as tb
As: (x: Object ta) -> (tb: RustType) -> (y: Object tb ** x $ y)
-- reborrow
Reborrow: (x: Object ta) -> (tb: RustType) -> (y: Object tb ** x $ y)
-- move
Move: (x: Object ta) -> (y: Object ta ** x $ y)

notes:

  • This formalism is not complete; it's simply the outline of the beginning of a formalism. There are issues with it, and it may end up eating your laundry.

Accesses

There are two types of accesses on an object a:

read:

  • match expression on a subobject of a
  • if expression on a subobject of a
  • rvalue conversion on a subobject of a

write:

  • = on a subobject of a
  • OP= on a subobject of a, where that subobject is an object of fundamental type

One may access an object "through" a binding - this means to use the binding to access the referred object.

Given a binding a, and a binding b, where !(a $ b), a refers to a subobject of b, and neither are raw bindings:

  • after one writes through b, any access through a is undefined behavior.
  • after one accesses through a, any write to b is undefined behavior.

The same is true for raw bindings, except instead of the ($) operator, the ($$) operator is used.

notes:

  • A binding does not "check for undefined behavior" until it exists. The creation of a binding allows it to see all accesses that it's parent could see at the time of its creation.

Examples

// note: the scopes are unnecessary for the semantics, but we don't
// yet have NLLs
fn main() {
  let mut x = (0, 0);
  {
    let y = &mut x;
    // x $ y, borrow of the lexpr x gives y
    {
      let z = &mut y.0;
      // desugared to
      // let z = &mut (*y).0;
      // ((*y).0) $ z through the borrow rule
      // (*y) $ ((*y).0) through the field access rule
      // y $ (*y) through the dereference rule
      // x $ y is already proven
      // therefore, x $ z
      *z = 10;
    }
    println!("{:?}", *y);
    // valid, because y $ z
  }
  println!("{:?}", x);
  // valid, because x $ z
}
fn main() {
  let x = &mut 0;
  {
    let y: *mut _ = x;
    // desugared to let y = (&mut *x) as *mut i32;
    // ((&mut *x) as *mut i32) $ y
    // (&mut *x) $ ((&mut *x) as *mut i32)
    // (*x) $ (&mut *x)
    // x $ (*x)
    // -> x $ y
    unsafe { *y = 1029; }
  }
  println!("{}", *x);
  // valid because x $ y
}
fn main() {
  let x = &mut 0;
  let p1 = x as *mut i32;
  // x $ p1
  let p2 = x as *mut i32;
  // x $ p2

  // *however*, !(p1 $ p2)

  unsafe {
    let r1 = &mut *p1;
    // x $ r1
    let r2 = &mut *p2;
    // x $ r2

    // *** !(r1 $ r2) ***
    *r1 = 10;
    println!("{}", *r2); // UB!!
  }
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.