tema3210/0000-move-references.md

## 0000-move-references.md

      
Feature Name: Move references
Start Date: 2021-04-17
RFC PR: rust-lang/rfcs#0000
Rust Issue: rust-lang/rust#0000

Summary

Move references are a new kind of reference that allows moving the referenced value while leaving the memory it is stored in where it is.
Motivation


Make Box less special by providing a mechanism (DerefMove) for moving out of a reference.
Address the drop concerns raised by temporary moves and panics.
Enable macro-free, unsafe-free stack pinning.

Guide-level explanation

&move references provide a way to borrow the memory of a binding while preserving the logic of moving its value. 

The type &move T is, in fact, a reference, but unlike other references it allows moving the initialized value out of the referenced binding.
Core

There are two kinds of move references: plain, and annotated with !.
Their functionality:

| &move T | &move T! |
|---|---|
| Allows moving out | Obligates the holder to keep the binding initialized; allows moving out temporarily |


&move T! is a move reference to an initialized binding that allows moving out of it, provided the binding is initialized again afterwards. In effect it can be viewed as a mutable reference. The reason for introducing it as a separate kind is simple:

It doesn't change the existing behavior of &mut T.

Being able to move a value out implies that the value is initialized, so referencing an uninitialized binding by &move T or &move T! is prohibited.
These references can be coerced to other kinds of references, which is how methods are called through them.
Calling a method that takes self by value is also allowed; it deinitializes the referenced binding.
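For comparison, the closest thing available in today's Rust is to go through &mut T and leave a placeholder behind. The sketch below is not part of the proposal; the function names are hypothetical, and it only works because String has a cheap default value and Option has a None state, which is exactly the limitation &move T! is meant to lift.

```rust
use std::mem;

/// Temporarily moves the `String` out of `target`, edits the owned value,
/// and moves it back in before returning (the `&move T!` obligation,
/// upheld by hand).
pub fn append_world(target: &mut String) {
    // `mem::take` leaves `String::new()` behind, so `target` is never
    // observably uninitialized, at the cost of requiring `Default`.
    let mut owned: String = mem::take(target);
    owned.push_str(", world");
    *target = owned;
}

/// Calls a by-value method (`String::into_bytes`) through a reference.
/// The "deinitialized" state has to be encoded as `None`.
pub fn consume(slot: &mut Option<String>) -> usize {
    slot.take().map(String::into_bytes).map_or(0, |bytes| bytes.len())
}
```

With move references, neither the Default bound nor the Option wrapper would be needed: the compiler would track the initialization state instead.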
Reference-level explanation

Creation

There are two ways of creating move references:

Reference a local binding via the syntax &move ...
Project to a member of a binding that is already referenced by a move reference: some_reference.some_field produces a move reference to that field of the referenced binding.

Coercions to other reference kinds


For all T, &move T! can be coerced to &mut T and from there to &T.

Interaction with patterns:

We introduce a new pattern kind, ref move NAME, which binds NAME with type &move T!. The reason for the ! obligation is that we may not want to leave a binding (partially) deinitialized after a pattern-matching construct has executed.
DerefMove trait

I also propose a design for DerefMove:
trait DerefMove {
  type Output;

  fn deref(&move self!) -> &move Self::Output!;
}
The reason for this design is that we may not want to allow pattern matching to partially deinitialize a binding: it would require far more complex analysis and would give pattern matching powers that do not align well with its original purpose (expressing "high level" logic).
The Box implementation:

struct Box<T>{
  ptr: *mut T,
}

...

impl<T> DerefMove for Box<T> {
  type Output = T;

  fn deref(&move self!) -> &move T! {
    self.ptr as &move T! //just cast the pointer to a reference
  }
}
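To illustrate what "Box is special" means today, here is a minimal sketch in current Rust. The trait DerefMoveSketch and its method deref_move are hypothetical names; the trait takes self by value, so it is only a rough approximation of the proposed DerefMove, which would borrow the box's memory instead of consuming it.

```rust
/// Hypothetical by-value approximation of the proposed `DerefMove`.
pub trait DerefMoveSketch {
    type Output;

    /// Consumes the pointer and returns the pointee.
    fn deref_move(self) -> Self::Output;
}

impl<T> DerefMoveSketch for Box<T> {
    type Output = T;

    fn deref_move(self) -> T {
        // Moving out of a dereference is built into the compiler for `Box`
        // only; a user-defined smart pointer cannot write `*self` here.
        *self
    }
}
```

A user-defined pointer type would have to fall back on unsafe code such as ptr::read to implement the same method, which is the asymmetry the proposal wants to remove.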

Aliasing:

Since all move references are intended to modify the referenced binding, they must all be unique, just as &mut T is.
Interaction with panics:

&move .., panics and drops

The representation of a move reference may include not only the pointer itself, but also a bitfield recording whether anything has been moved out through the reference.

This removes concerns about dropping uninitialized data during panics.
I guess it may look like:
#[repr(C)]
struct MoveRef<T> {
  ptr: *mut T, //the pointer.
  flags: MoveFlagsOf<T>, //of course, this is not a real type, but rather a kind of intrinsic.
}
Also, because the reference is unique, the bitfield does not need to be atomic.
To avoid issues with panics, every change to the bitfield happens immediately after the corresponding move.
The remaining issue with panics is that they may interrupt the modification of the referenced binding, leaving it in an inconsistent state. This is equally true of &mut references, so it can only cause logic bugs.
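The pointer-plus-flag layout described above can be approximated in today's Rust with MaybeUninit and a bool. This is only a sketch under assumptions: MoveRefSketch and its methods are hypothetical names, a single bool stands in for the per-field bitfield, and the sketch lets the reference itself drop a value that was never moved out, which the proposal does not spell out.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a move reference: a unique pointer to storage
/// plus a one-bit flag saying whether the storage currently holds a value.
pub struct MoveRefSketch<'a, T> {
    slot: &'a mut MaybeUninit<T>,
    initialized: bool,
}

impl<'a, T> MoveRefSketch<'a, T> {
    /// Writes `value` into `slot` and starts tracking it.
    pub fn new(slot: &'a mut MaybeUninit<T>, value: T) -> Self {
        slot.write(value);
        MoveRefSketch { slot, initialized: true }
    }

    /// Moves the value out, or returns `None` if it was already moved.
    pub fn take(&mut self) -> Option<T> {
        if !self.initialized {
            return None;
        }
        // Flag and move are updated together; nothing that can panic runs
        // between them, so unwinding never sees a stale flag.
        self.initialized = false;
        // SAFETY: the flag was set, so the slot holds a valid `T`.
        Some(unsafe { self.slot.assume_init_read() })
    }

    /// Moves a value back in, dropping any value still present.
    pub fn put(&mut self, value: T) {
        if self.initialized {
            self.initialized = false;
            // SAFETY: the flag was set, so the slot holds a valid `T`.
            unsafe { self.slot.assume_init_drop() };
        }
        self.slot.write(value);
        self.initialized = true;
    }

    pub fn is_initialized(&self) -> bool {
        self.initialized
    }
}

impl<T> Drop for MoveRefSketch<'_, T> {
    fn drop(&mut self) {
        // Runs on normal exit and during unwinding: the value is dropped
        // only if the flag says it is still there.
        if self.initialized {
            // SAFETY: the flag was set, so the slot holds a valid `T`.
            unsafe { self.slot.assume_init_drop() };
        }
    }
}
```

Because the flag is consulted in Drop, a panic between take and put cannot lead to a double drop or to dropping uninitialized memory.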
Interaction with leaks

Since neither of these two kinds of move references can corrupt any state when left unused, leaking them should be completely fine.
Methods

As a Self type

These references are explicitly intended to refer to a binding of a type, not just a value. Thus, a method taking &move T! as self can only be called on a mutable binding, not on an arbitrary value.
Calling methods on move references

Methods that take self by value deinitialize the referenced binding.
Calling any other method triggers a coercion to a less strict reference kind.
Reasoning about usage of move references

Restrictions

All move references are unique; they may not be duplicated.
The main point in their lifecycle is the function boundary.

At that boundary, all move references passed to a function are assumed to uphold their invariants.
To avoid threading problems, move references may be neither Send nor Sync.
As a result, if the references are used properly in each consumer function, the overall usage of each such reference is correct in turn (there is no multi-threaded non-determinism here).
The second reason for them not to be Send is that if a thread crashes for whatever reason, we cannot be sure whether something was initialized from another thread, or whether what we are about to deinitialize is still alive on another thread.
Scopes and analysis

In any scope of the program, move references created as described above must fulfil their obligations, if any. This means that any data structure holding such a reference is required to use it.
For instance, if something was moved out of a &move T!, it must be initialized again in the same scope on all possible branches. The analysis must also take diverging expressions into account: a move reference has to be initialized before return, loop {}, and loop {..} expressions that resolve to uninhabited types. break is included in this list only if the move reference was created inside the loop that the particular break exits.
panic!(), however, is not included: plenty of operations can panic, and we don't want to initialize a value before each of them.
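The rule that a moved-out binding must be initialized again on all branches resembles what the compiler already checks for local bindings. The sketch below is plain current Rust with a hypothetical function name; the proposal would extend the same kind of per-branch analysis to bindings reached through &move T!.

```rust
/// Moves the value out of a local binding and re-initializes the binding
/// on every branch before it is used again.
pub fn rebuild(keep: bool) -> String {
    let mut s = String::from("a");
    let moved = s; // `s` is deinitialized from here on

    if keep {
        s = moved + "b"; // re-initialized on this branch
    } else {
        s = String::from("c"); // and on this one
    }

    // Accepted only because both branches assign `s`; deleting either
    // assignment makes the compiler reject this use.
    s
}
```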
!Unpin types

Because moving a value of a !Unpin type will most likely corrupt the data, we may not want such values to be moved into or out of a binding via such a reference.
Pin, DerefMove and stack pinning

The impls are as follows:
impl<P> DerefMove for Pin<P>
  where P: DerefMove,
{
  type Output = P::Output;

  fn deref(&move self!) -> &move Self::Output! {
    self.pointer
  }
}

impl<P> Pin<P>
  where P: DerefMove
{
  pub fn new_move(ptr: P) -> Pin<P> {
    Pin {pointer: ptr}
  }
}
An example of use:
...
fn main() {
  let g = make_some_not_unpin_gen();
  let pinned = Pin::new_move(&move g!);
  //work with it!
}
...
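For reference, stack pinning in current Rust goes through a macro. The sketch below uses std::pin::pin! from the standard library; NotUnpin, read_pinned and pin_on_stack_and_read are hypothetical names standing in for the generator of the example above.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

/// A type that opts out of `Unpin`, like a self-referential generator.
pub struct NotUnpin {
    pub value: u32,
    _pin: PhantomPinned,
}

/// Reads a field through the pinned reference without moving the value.
pub fn read_pinned(p: Pin<&mut NotUnpin>) -> u32 {
    p.value
}

/// Pins a value to the current stack frame and hands it to `read_pinned`.
pub fn pin_on_stack_and_read() -> u32 {
    // The macro hides the original binding so the value cannot be moved
    // afterwards; `Pin::new_move(&move g!)` would express this in the
    // type system instead.
    let pinned: Pin<&mut NotUnpin> = pin!(NotUnpin {
        value: 7,
        _pin: PhantomPinned,
    });
    read_pinned(pinned)
}
```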
Optimization

General

Another key property of move references is that their usage implies moving the value in and out: this is the perfect case for guaranteed copy elision (GCE).
We do GCE for &move .. references.

In this case, the move flags of a reference mentioned earlier should live on the caller's stack.
In prospect of panics

Let's consider the following example:
struct A{
  hello: String,
  world: String,
}
fn use_it(trg: &move A!) {
  let mut local = *trg; //here we move in function's scope
  if local.hello != "Hello" {
    panic!("The first 'Hello world' word isn't `Hello`!");
  };
  local.world = "world is OK".into();
  *trg = local; //init back, OK
}
This example doesn't introduce any problems unless the moves are optimized out. With that optimization, however, a panic can interrupt the modification of the referenced binding, leaving it in an inconsistent state.
The solution I would regard as optimal is to conditionally turn off copy elision in such cases.
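The difference between the elided and the non-elided case can be observed in current Rust with a rough model. Everything here is an assumption made for illustration: the function names are hypothetical, clone stands in for the move (a &mut A cannot be moved out of), and the write is deliberately placed before the check so that the panic actually interrupts a modification.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

pub struct A {
    pub hello: String,
    pub world: String,
}

/// Models the elided case: every write lands directly in the caller's `A`.
pub fn update_in_place(trg: &mut A) {
    trg.world = "world is OK".into();
    if trg.hello != "Hello" {
        panic!("The first 'Hello world' word isn't `Hello`!");
    }
    trg.hello.push('!');
}

/// Models the non-elided case: work on a local copy, commit on success.
pub fn update_via_local(trg: &mut A) {
    let mut local = A {
        hello: trg.hello.clone(),
        world: trg.world.clone(),
    };
    local.world = "world is OK".into();
    if local.hello != "Hello" {
        panic!("The first 'Hello world' word isn't `Hello`!");
    }
    local.hello.push('!');
    *trg = local; // only reached if nothing panicked
}

/// Runs `f` on a value that fails the check and returns the `world` field
/// as observed after the panic has been caught.
pub fn world_after_panic(f: fn(&mut A)) -> String {
    let mut a = A {
        hello: "Hi".into(),
        world: "untouched".into(),
    };
    let result = catch_unwind(AssertUnwindSafe(|| f(&mut a)));
    assert!(result.is_err());
    a.world
}
```

In this model the in-place version leaves world already overwritten after the panic, while the version working on a local leaves the caller's value as it was.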
Drawbacks


This adds an entire new kind of reference, which we'll need to teach.
It further complicates the leaking problem.

Rationale and alternatives

The feature serves one need: moving a value but not the memory.
Alternatives are either on the library level or in previous proposals.
Prior art


"?Uninit types [exist today]. Also let’s talk about DerefMove" internals topic - the main source of inspiration.


Older move reference proposals - history of a feature.


Pre-RFC of the feature - discussion in Pre-RFC stage.


Leak and ImplicitDrop(~Destroy) traits - traits to express part of semantics of move references. See future possibilities.


Unresolved questions


Is the DerefMove trait defined here in the right way?
How exactly should we balance the use of GCE against panics?
Is it worth going that far with respect to leaks in generic code?

Future possibilities

&move T* kind

This kind of move reference obligates the holder to move a value into the referenced binding and doesn't require the binding to be initialized beforehand.
Currently, we have no traits to describe the mandatory operations on this kind (it is, in fact, a refined type).
Introducing it would also require Leak and !ImplicitDrop auto traits to describe things correctly.
The reason for not introducing it now is that we could not fix the soundness issues merely by turning off GCE.
Partial initialization (views)

Partial initialization of a binding of a known type C can be described via the following syntax: &move C(a!,b*,c,...).
An example:
struct C {
  a: String,
  b: String,
  c: String,
  d: u32,
}

/// ...Promises to init `b`, keep `a` and uninit `c`, doesn't touch `d` at all.
fn work(arg: &move C(a!,b*,c,.d)) { //the dot-prefixed `d` could have been omitted.
  let mut tmp = arg.a; //we moved the String to `tmp`
  tmp.push_str(&arg.c); //we don't have to move out of `arg.c`, and we haven't promised to initialize it back.

  arg.a = tmp; //we initialized `arg.a` back; removing this line is a hard error.

  arg.b = "init from another function!".into();

  //println!("{:?}", arg.d); //error: use of possibly uninitialized value.
}

fn main() {
  let trg: C;
  trg.a = "Hello ".into();
  trg.c = " Hola".into();

  work(&move trg);
  println!("{}", trg.b); //legal, as `work` promised to initialize it
  println!("{}", trg.a); //legal
  //println!("{}", trg.c); //error: use of definitely uninitialized value.

}
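The bookkeeping that the view syntax asks the compiler to do statically can be emulated at run time in current Rust by making every field an Option. This is a sketch only: CSlots is a hypothetical type, and the expect calls stand in for obligations that the proposal would check at compile time.

```rust
/// Run-time emulation of a partially initialized `C`: `None` means
/// "this field is currently uninitialized".
pub struct CSlots {
    pub a: Option<String>,
    pub b: Option<String>,
    pub c: Option<String>,
    pub d: Option<u32>,
}

/// Mirrors the view `&move C(a!,b*,c,.d)`: keeps `a` initialized,
/// initializes `b`, deinitializes `c`, and never touches `d`.
pub fn work(arg: &mut CSlots) {
    // `a!`: moved out temporarily, must be put back before returning.
    let mut tmp = arg.a.take().expect("`a` must be initialized on entry");
    // `c`: moved out for good, no promise to restore it.
    let c = arg.c.take().expect("`c` must be initialized on entry");
    tmp.push_str(&c);

    arg.a = Some(tmp);
    // `b*`: initialized by this function.
    arg.b = Some("init from another function!".into());
}
```

The price of this emulation is a discriminant per field and a possible panic where the proposal would report a compile-time error.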
Tuples

The syntax of &move references with partial initialization of tuples is as follows:
Given a tuple (u32,i64,String,&str), the move reference syntax is &move (.u32,i64,String!,&str*). Note the dot-prefixed u32: it will not be touched by a consumer of the reference, but it is present to distinguish different tuple types from one another (for named structures, untouched fields are simply not mentioned).