@tema3210
Created May 5, 2021 11:06
Move refs RFC rev 6.

Summary

Move references are two new kinds of references that allow moving a value out of a binding without moving the memory it is stored in. The proposal covers not only the references themselves, but also the OwningIterator trait and the DerefMove traits.

Introduces a new implied auto trait called UnwindSound.

Motivation

  • Make Box less special by providing a mechanism (DerefMoves) for moving out of a reference.
  • Address the drop concerns raised by temporary moves and panics.
  • Macro-free, unsafe-free stack pinning.
  • Owning iterators.

Guide-level explanation

&move references provide a way to borrow the memory of a binding while preserving the logic of moving its value.
The type &move T is, in fact, a reference, but unlike other references it allows moving the initialized value out of the referenced binding.

Core

There are two kinds of move references: plain (&move T) and annotated with ! (&move T!).

About the functionality:

  • &move T: allows moving the value out.
  • &move T!: obligates the holder to keep the binding initialized, but allows moving the value out temporarily.

&move T! is a move reference to an initialized binding with the ability to move from it. In fact, it can be viewed as a mutable reference. The reason for making it a separate kind is simple:

  • It doesn't change the existing behavior of &mut T.

Being able to move a value out implies that the value is initialized, so referencing an uninitialized binding by &move T or &move T! is prohibited.

These references can be coerced to other kinds of references; this is how methods are called through them.

Calling a method that takes self by value is also allowed: it deinitializes the referenced binding.

These references are neither Send nor Sync.

Usage examples

Stack pinning

let mut gen = make_some_not_unpin_gen();
let pinned_gen = Pin::new_move_init(&move gen!); //here we pin a move reference to it. No unsafe or macro.
//use pinned gen.
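
For comparison, here is a minimal sketch of how the same thing is done in today's Rust, where the std::pin::pin! macro plays the role that Pin::new_move_init(&move gen!) would play under this proposal. The names NoopWake and poll_on_stack are made up for the illustration, and an async block stands in for the !Unpin generator.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Illustration-only waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Pins a `!Unpin` future on the stack with `pin!` and polls it once.
fn poll_on_stack() -> Poll<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Today's stack pinning needs a macro: it hides the owning binding so the
    // value can never be moved again. The RFC wants to express this with a type.
    let mut fut = pin!(async { 40 + 2 });
    fut.as_mut().poll(&mut cx)
}

fn main() {
    println!("{:?}", poll_on_stack());
}
```

The async block has no await points, so a single poll completes it.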

Owning iteration

let v: Vec<Struct> = get_a_vec(); //we also got a memory.
for i in v.own_iter() { // `own_iter` takes `&move Vec<_>!` and produces an `impl OwningIterator<Item=_>`
..
}

This Self type is chosen because we don't want to consume the bookkeeping part of the Vec.
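
The closest thing in today's Rust is probably Vec::drain, which hands out owned elements through a &mut Vec<_> and leaves the vector itself alive. The sketch below only shows that existing behavior; take_all is a made-up helper and not part of the proposal.

```rust
/// Moves every element out of `v` while leaving the (now empty) vector usable.
fn take_all(v: &mut Vec<String>) -> Vec<String> {
    v.drain(..).collect()
}

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    let capacity_before = v.capacity();

    let owned = take_all(&mut v);
    assert_eq!(owned, ["a", "b"]);

    // The vector still exists and keeps its allocation.
    assert!(v.is_empty());
    assert_eq!(v.capacity(), capacity_before);
    v.push(String::from("c"));
}
```

The difference is that drain is a special-purpose method on a &mut borrow, while own_iter would express the same contract in the type of the reference.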

Owning slicing

let sl: [u8;6] = [1,2,3,4,5,6];
let (lh,rh): (&move [u8],&move [u8]) = sl.split_own_at(3); //method from slice; takes ownership as written ((&move sl).split_own_at(3) is a desugared version).
assert_eq!((3,3),(lh.len(),rh.len()));

Another example:

let toks = [Token::new(),Token::new()]; //assume they are neither `Copy` nor `Clone`.

match &move toks[..]! {
  [borrowed,another_one] => {
    //use the tokens without restriction, but don't forget to put them back.
  }
}
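
To see what "put them back" costs today: with a plain &mut T the binding may never be left uninitialized, so a temporary move needs a placeholder value. A minimal sketch in current Rust, assuming a hypothetical Token type and helper functions that are not part of the proposal:

```rust
use std::mem;

/// Hypothetical token type: neither `Copy` nor `Clone`.
#[derive(Debug, PartialEq)]
struct Token(u32);

/// Consumes a token and returns an updated one.
fn bump(t: Token) -> Token {
    Token(t.0 + 1)
}

/// Updates the slot by moving the token out and putting a new one back.
fn bump_in_place(slot: &mut Token) {
    // A placeholder is required because `*slot` must stay initialized.
    let old = mem::replace(slot, Token(0));
    *slot = bump(old);
}

fn main() {
    let mut toks = [Token(1), Token(2)];
    for t in toks.iter_mut() {
        bump_in_place(t);
    }
    println!("{:?}", toks);
}
```

With &move T! the placeholder Token(0) would not be needed, because the compiler would track the temporarily empty state itself.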

Reference-level explanation

Creation

There are three ways of creating move references:

  • Reference a local binding via the syntax &move ...
  • Create a move reference to a member of a binding that is already referenced by a move reference: writing some_reference.some_field produces a move reference to that field of the referenced binding.
  • Reborrow from an existing move reference, possibly using DerefMoves.

Optimization

General

Another key property of move references is that their usage implies moving the value in and out: this is the perfect case for GCE (guaranteed copy elision).

We do GCE for &move .. references.
In this case, the move flags of a reference (described under "Interaction with panics") should live on the caller's stack.

Coercions to other reference kinds

  • for all T: &move T! can be coerced to &mut T and thus further down to &T.

Interaction with patterns:

We introduce a new pattern kind, ref move NAME!, which produces NAME of type &move T!.

Another new pattern is ref move NAME (note the absence of the exclamation mark), which produces NAME of type &move T.

DerefMove traits

I also propose a design for the DerefMoves:

trait DerefMove: DerefMut {
  fn deref_move(&move self) -> &move <Self as Deref>::Target;
}
trait DerefMoveInit: DerefMut {
  fn deref_move_init(&move self!) -> &move <Self as Deref>::Target!;
}

The reason for having two traits is that there are 2 kinds of move references with different use cases.

The Box implementation:

struct Box<T>{
  ptr: *mut T,
}
//...
impl<T> DerefMoveInit for Box<T> {
  fn deref_move_init(&move self!) -> &move T! {
    unsafe { self.ptr as &move T! } //just cast the pointer to a reference
  }
}
impl<T> DerefMove for Box<T> {
  fn deref_move(&move self) -> &move T {
    unsafe { self.ptr as &move T } //just cast the pointer to a reference
  }
}

What the currently unstable box pattern syntax is used for can now also be written as:

...
let b: Box<C> = Box::new(..);
match b {
  ref move smth! => { //this internally calls `deref_move_init`
    //here we have `smth: &move C!`;
  }
};
match b {
  ref move smth => { //here we consume the box; this internally calls `deref_move`
    //here we have `smth: &move C`;
  }
};
//b.method() //error since we have consumed box.
...
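
For reference, this is the special case the traits above are meant to generalize: in today's Rust, Box is the only pointer type that allows moving its contents out through a dereference. A tiny sketch (unbox is a made-up name):

```rust
/// Moves the `String` out of its box; the box's allocation is freed afterwards.
fn unbox(b: Box<String>) -> String {
    // Built into the compiler for `Box` only; `Deref`/`DerefMut` cannot express it.
    *b
}

fn main() {
    let b = Box::new(String::from("hello"));
    let s = unbox(b);
    println!("{s}");
}
```

User-defined smart pointers cannot offer the same operation today, which is the gap DerefMove is supposed to close.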

Aliasing:

Since all move references are intended to modify the referenced binding, they must all be unique, just as &mut T is.

Interaction with panics:

&move .., panics and drops

The representation of a move reference may include not only the pointer itself, but also a bitfield recording whether anything has been moved out through the reference.
This removes the concern about dropping uninitialized data during panics.

I guess, this may look like:

#[repr(C)]
struct MoveRef<T> {
  ptr: *mut T, //pointer.
  flags: MoveFlagsOf<T>, //of course, this is not real type, but a kind of intrinsic rather.
}

Also, because the reference is unique, the bitfield does not need to be atomic.

To avoid issues with panics, every change to the bitfield happens right after the corresponding move.
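
The effect of such a flag can be imitated in today's Rust with an Option<T> slot, whose discriminant acts as a one-bit "is initialized" flag that flips together with the move. This is only an analogy under that assumption, not the proposed representation; Noisy, DROPS and the function names are made up.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times a `Noisy` value has been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Moves the value out of the slot, then panics before putting anything back.
fn move_out_then_panic(slot: &mut Option<Noisy>) {
    let moved = slot.take(); // the "flag" becomes `None` together with the move
    drop(moved); // first and only drop of the value
    panic!("interrupted before re-initialization");
}

/// Returns how many times the value was dropped in total.
fn drops_after_unwind() -> usize {
    DROPS.store(0, Ordering::SeqCst);
    let mut slot = Some(Noisy);
    let result = panic::catch_unwind(AssertUnwindSafe(|| move_out_then_panic(&mut slot)));
    assert!(result.is_err());
    drop(slot); // the slot is `None`, so nothing is dropped a second time
    DROPS.load(Ordering::SeqCst)
}

fn main() {
    println!("drops: {}", drops_after_unwind());
}
```

Because the flag is updated at the moment of the move, unwinding finds the slot marked empty and does not drop the value twice.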

The issue with panics is that they may interrupt the modification of the referred binding, leaving it in an inconsistent state. But this is also true for &mut references, so it can only cause logic bugs.
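
A small sketch of that existing &mut behavior in today's Rust: the value stays memory-safe after a caught panic, but its own invariant is broken. Ledger and its functions are hypothetical and exist only for this illustration.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Invariant (by convention only): `sum` equals the sum of `items`.
struct Ledger {
    items: Vec<i32>,
    sum: i32,
}

/// Pushes `x`, but panics on negative input before `sum` is updated.
fn add(ledger: &mut Ledger, x: i32) {
    ledger.items.push(x);
    if x < 0 {
        panic!("negative entry");
    }
    ledger.sum += x;
}

/// Returns `true` if the invariant still holds after a caught panic.
fn invariant_holds_after_panic() -> bool {
    let mut ledger = Ledger { items: vec![], sum: 0 };
    add(&mut ledger, 5);
    // `&mut Ledger` is not `UnwindSafe`, so the closure has to be wrapped explicitly.
    let _ = panic::catch_unwind(AssertUnwindSafe(|| add(&mut ledger, -3)));
    ledger.items.iter().sum::<i32>() == ledger.sum
}

fn main() {
    println!("invariant holds: {}", invariant_holds_after_panic());
}
```

Every field is still initialized here, so this is a logic bug and nothing worse.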

UnwindSound auto-trait

To avoid the really bad things that panics can expose to the end user, I propose a new implied unsafe auto trait: UnwindSound (this may be a bad name, but still). It is unsafe because it is a trait for structural reasoning that is implemented and used by the compiler; it is not supposed to be implemented by hand.

The rule is simple: any type whose parts all implement UnwindSound is also UnwindSound.

This trait is, in fact, a strict version of UnwindSafe and can be relied upon for safety.

catch_unwind, in turn, requires it (it's implied, after all).

&move T!, however, doesn't implement this trait.

The motivation is to forbid &move T! references from crossing an unwinding boundary, and thus make it impossible to observe uninitialized bindings.

This means they can't be used in any already existing data structures, unless those explicitly opt in via a T: ?UnwindSound bound.

Interaction with leaks

Since these two kinds of move references can't corrupt any state when left unused, leaking them should be completely fine.

Methods

As a Self type

These references are explicitly intended to refer to a binding of a type, not just a value. Thus, calling a method that takes &move T! as self can only be done on a mutable binding, not on an arbitrary value.

Calling methods on move references

Methods that take self by value will deinitialize the referred binding.

Calling any other method triggers a coercion to a less strict reference kind.

Reasoning about usage of move references

Restrictions

All move references are unique; they may not be duplicated.

The main point in their lifecycle is the function boundary.
At that boundary, all move references passed to a function are assumed to uphold their invariants.

To avoid threading problems, move references are neither Send nor Sync.

As a result, if the references are used properly in each consumer function, then the overall usage of each such reference is correct as well. (No multi-thread non-determinism here.)

The second reason for them not to be Send is that if a thread crashes for whatever reason, we can't be sure whether something has been initialized from another thread, or whether what we are about to deinitialize is still alive on another thread.

Scopes and analysis

In any scope of the program, move references created as described above must fulfil their obligations, if any. This means that any data structure holding such a reference is required to use the move reference. A move reference that is currently in use may not leave its scope.

For instance, if something was moved out of a &move T!, it must be initialized again in the same scope on all possible branches. The reference is also not allowed to leave that scope.

The analysis must also take diverging expressions into account: a move reference has to be re-initialized before return and before a loop {..} that resolves to an uninhabited type. break is included in this list only if the move reference was created inside the loop that the particular break exits.

An infinite loop {} is not included, because the uninitialized state will never be observable by anyone.

panic!(), however, is not included either: plenty of operations can panic, and we don't want to initialize a value before each of them.
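
The "initialized again on all branches" rule mirrors what the compiler already checks for local bindings today. A minimal sketch in current Rust, with the made-up function decorate, where a local is moved out and re-initialized on every path:

```rust
/// Moves `s` out, then re-initializes it on every path before it is used again.
fn decorate(mut s: String, shout: bool) -> String {
    let tmp = s; // `s` is now uninitialized
    if shout {
        s = tmp.to_uppercase();
    } else {
        s = tmp;
    }
    // Removing either assignment above turns this use into a compile error.
    s
}

fn main() {
    println!("{}", decorate(String::from("hi"), true));
}
```

The proposal extends this per-binding analysis to bindings that are reached through a &move T! reference.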

Pin, DerefMoves and stack pinning

The impls are as follows:

impl<P: DerefMove> DerefMove for Pin<P>
  where <P as Deref>::Target: Unpin,
{
  fn deref_move(&move self) -> &move Self::Target {
    &move *self.pointer
  }
}

impl<P: DerefMoveInit> DerefMoveInit for Pin<P>
  where <P as Deref>::Target: Unpin,
{
  fn deref_move_init(&move self!) -> &move Self::Target! {
    &move *self.pointer!
  }
}

impl<P> Pin<P>
  where P: DerefMove
{
  pub fn new_move(ptr: P) -> Pin<P> {
    Pin {pointer: ptr}
  }
}

impl<P> Pin<P>
  where P: DerefMoveInit
{
  pub fn new_move_init(ptr: P) -> Pin<P> {
    Pin {pointer: ptr}
  }
}

An example of use:

...
fn main() {
  let mut g = make_some_not_unpin_gen(); //`&move g!` needs a mutable binding
  let pinned = Pin::new_move_init(&move g!);
  //work with it!
}
...

OwningIterator trait

trait OwningIterator {
  type Item;

  fn next(&move self!) -> Option<&move Self::Item>;
}

The reason for this Self type in the next method is that we may not want to deinitialize the binding whose value we iterate over (that is IntoIter's purpose). The other kind of move reference, used in the return type, is meant to give the value away rather than borrow it, so it does not overlap with streaming iterators, another awaited feature.

Slice's implementation

...
struct SliceOwnIter<'a,T> { //It's not `UnwindSound` btw.
  trg: Option<&'a move [T]>,
}

impl<'a,T> OwningIterator for SliceOwnIter<'a,T> {
  type Item = T;

  fn next(&move self!) -> Option<&'a move T> { // this shortens vec's
    if let Some(sl) = self.trg { //we moved from referenced binding => we MUST place a value back.
      match sl[..] { //we don't have to reborrow - it's already reference
        [last] => { //must come before the general arm, which would also match one element
          self.trg = None; //moved back
          Some(last)
        },
        [fst, rest @ ..] => {
          self.trg = Some(rest); //moved back
          Some(fst)
        },
        [] => {
          self.trg = None; //moved back
          None
        }
      }
    } else {
      None
    }
  }

}
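
The "move out, then put something back" shape of next can be approximated in today's Rust. The sketch below is an assumption-laden analogy: SliceTakeIter is a made-up type, each element lives in an Option<T> slot so that the Option tag can serve as its move flag, and mem::take stands in for the temporary move out of self.trg.

```rust
use std::mem;

/// Hypothetical stand-in for `SliceOwnIter`, built from current language features.
struct SliceTakeIter<'a, T> {
    rest: &'a mut [Option<T>],
}

impl<'a, T> Iterator for SliceTakeIter<'a, T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Move the slice reference out of `self`, leaving an empty slice behind.
        let slice = mem::take(&mut self.rest);
        let (first, rest) = slice.split_first_mut()?;
        // Put the shortened slice back.
        self.rest = rest;
        // Move the element out; its slot becomes `None`.
        // An already empty slot ends the iteration early.
        first.take()
    }
}

fn main() {
    let mut slots = [Some(String::from("a")), Some(String::from("b"))];
    let owned: Vec<String> = SliceTakeIter { rest: &mut slots }.collect();
    assert_eq!(owned, ["a", "b"]);
    assert!(slots.iter().all(Option::is_none));
}
```

With real move references the Option wrapper would disappear, since the reference itself would carry the information about which elements are gone.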

Various helper methods

Slice

impl<T> [T] {
  pub fn split_own_at(&move self,index: usize) -> (&move [T],&move [T]) {
    assert!(index <= self.len());
    //SAFETY: we already checked index to be inside of `0..=self.len()` range.
    unsafe { self.split_own_at_unchecked(index) }
  }
  //not `pub` for consistency
  unsafe fn split_own_at_unchecked(&move self,index: usize) -> (&move [T],&move [T]) {
    (self[0..index],self[index..])
  }

  //and others
}

This will require extending the unsafe SliceIndex<T> trait.
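
The existing counterpart for &mut is split_at_mut, which split_own_at is modeled after. A short sketch of today's behavior, with a made-up helper name:

```rust
/// Splits a slice into two disjoint mutable halves and edits each independently.
fn double_left_negate_right(data: &mut [i32], mid: usize) {
    // Panics if `mid > data.len()`, like the assert in `split_own_at`.
    let (lh, rh) = data.split_at_mut(mid);
    for x in lh.iter_mut() {
        *x *= 2;
    }
    for x in rh.iter_mut() {
        *x = -*x;
    }
}

fn main() {
    let mut sl = [1, 2, 3, 4, 5, 6];
    double_left_negate_right(&mut sl, 3);
    println!("{:?}", sl);
}
```

split_own_at would give the same disjointness guarantee, but each half could additionally be moved out of.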

Drawbacks

  • This adds two entirely new kinds of references, which we'll need to teach.
  • Data structures have to support these two kinds explicitly.
  • It requires a new implied auto trait.

Rationale and alternatives

The impact of the proposed auto trait is expected to be minimal:

  • First, it's implied, meaning that no work is required in any library. Support for &move T! references, however, has to be added explicitly.
  • Second, the only types that do not implement it are move references.
  • As a side benefit beyond this proposal, it allows opting out of panic safety via ?UnwindSound, so users can describe their types better.

The main alternative is to strip the RFC down to only the &move T kind, moving the controversial &move T! into future possibilities. However, I don't think that is a worthwhile iteration, because it doesn't make stack pinning easy and doesn't allow OwningIterator.

Another big alternative is to include the &move T* kind MVP to tell a better story about self-referential values. The first downside is that it requires explicit demarcation of self references (because I don't see how else we could tell the compiler anything like "this reference will point inside of the struct itself"), which is another proposal. The second is that we cannot enforce correct use of move references and !Unpin types.

The final alternative worth mentioning is to include everything required for all 3 kinds to be sound, fully fledged types: Leak and ImplicitDrop implied auto traits and a Destroy trait for finalization, which in fact amounts to linear types. That is too much for one RFC.

Other alternatives are either on the library level or in previous proposals.

Prior art

Unresolved questions

  • Do we want an implicit syntax for creating a move reference, like:
fn a(&move B!){...};
fn main(){
  let b: B = Default::default();a(b)//it creates move reference implicitly.
}
  • OwningIterator for Vec...

Future possibilities

&move T* kind

This kind of move reference obligates its holder to move a value into the referenced binding and doesn't require the binding to be initialized.

Currently, we have no traits to describe the mandatory operations on this kind (it is, in fact, a refined type).

Introducing it would also require Leak and !ImplicitDrop auto traits to describe things correctly.

We would need to enforce that something is moved into the referred binding:

  • In generic code, we will need to describe such finalization.
  • In the face of panics, what should be placed into the binding if a panic has occurred?

The reason for not introducing this kind is that we could not fix the soundness issues merely by turning off GCE.

In patterns

Something like ref move NAME*.

MVP

As an alternative, we could introduce this kind as "not a type", in the sense that it may not participate in data structures, nor appear as a generic parameter or as an argument type. &move T* references may only be created via patterns and must be used in the same scope.

The use case of the MVP is manipulating self references.

Unmovable types and values

GCE and !Unpin values

If we guarantee copy/move elision for &move .. references, then we are able to avoid moving a !Unpin value, in other words, to work with it in place.

The way I imagine working with !Unpin values is simply destructuring a move reference to the value, which yields a bunch of move references to its contents.

The biggest issue, however, is how to handle the case where one referenced part refers to another part that is also referenced.

My guess is that we could provide a 'self lifetime that tells the compiler that a reference points inside the struct it is contained in.

During destructuring, observing self references might break aliasing rules, and thus the &move T* kind MVP would be required to allow working with such parts of a value.

This will need GCE to be sound.

Another thing to mention is that we can't enforce correct use of !Unpin values:

  • Unpin can be implemented for a wrapper of a !Unpin type and is a safe trait, meaning this cannot be ruled out by an unsafe contract.
  • Because of this we can't really make move references "Unpin safe".

Partial initialization (views)

Partial initialization of a binding of a known type C can be described via the following syntax: &move C(a!,b*,c,...).

An example:

struct C {
  a: String,
  b: String,
  c: String,
  d: u32,
}

/// ...Promises to init `b`, keep `a` initialized and uninit `c`; doesn't touch `d` at all.
fn work(arg: &move C(a!,b*,c,.d)) { //the dot-prefixed `d` could have been omitted.
  let mut tmp = arg.a; //we moved the String to `tmp`
  tmp.push_str(&arg.c); //we are allowed to skip moving out of `arg.c`, and we made no promise to initialize it back.

  arg.a = tmp; //we initialized `arg.a` back; removing this line is hard error.

  arg.b = "init from another function!".into();

  //println!("{:?}",arg.d ); //error: use of possibly uninitialized value.
}

fn main() {
  let trg: C;
  trg.a = "Hello ".into();
  trg.c = " Hola".into();

  work(&move trg);
  println!("{}", trg.b); //legal, as `work` promised to initialize it
  println!("{}", trg.a); //legal
  //println!("{}", trg.c); //error: use of definitely uninitialized value.

}
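
Part of this already exists for owned local bindings: today's compiler tracks moves per field and accepts a struct as whole again once the moved field has been re-assigned. The sketch below shows only that existing behavior for a struct without a Drop impl; rework is a made-up function, and nothing here crosses a function boundary the way the proposed views would.

```rust
struct C {
    a: String,
    b: String,
    c: String,
    d: u32,
}

/// Moves `a` out, extends it with `c`, and puts it back; tracked per field.
fn rework(mut v: C) -> C {
    let mut tmp = v.a; // `v.a` is moved out; `v` is now partially initialized
    tmp.push_str(&v.c); // the other fields stay usable
    v.a = tmp; // re-initialize `v.a`; `v` is whole again
    v // without the assignment above this would not compile
}

fn main() {
    let v = rework(C {
        a: "Hello ".into(),
        b: "unused".into(),
        c: "Hola".into(),
        d: 7,
    });
    println!("{} | {} | {} | {}", v.a, v.b, v.c, v.d);
}
```

The view syntax would let a signature such as work's carry this per-field state, which current function types cannot express.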

Tuples

The syntax of &move references with partial initialization of tuples is as follows:

Given a tuple (u32,i64,String,&str), the move reference syntax looks like &move (.u32,i64,String!,&str*). Note the dot-prefixed u32: it will not be touched by the consumer of the reference, but it is written out to distinguish different tuple types from one another (for named structures, untouched fields are simply not mentioned).
