nikomatsakis/no-copies.md

## no-copies.md

      
    Raw
  

              no-copies.md
            
          
    The primary goal of this proposal is to eliminate implicit copies of
aggregate types, while preserving most other aspects of the language.
Secondary goals include: permit references into arrays; make DPS a
guaranteed optimization that is obvious from language syntax.
Summary

The following changes are made:

vectors become fixed-length arrays;
array types are non-copyable; (I am not happy with this)
aggregate and unique types must be explicitly copied using the copy keyword
(shared pointers and scalars can be implicit);
lvalues of aggregate or unique type can only be assigned using a
subset of expressions that guarantee the new value is either freshly
created (in which case it can be directly written into the lvalue's
memory slot) or explicitly copied or moved;
functions can designate a new T return type, meaning they return a
T which is freshly allocated but where the user decides whether it
is shared or unique.

Sources of implicit copies

Implicit copies today occur when one variable is assigned to another
and its type is unique or of aggregate value type.  Prominent unique
types are vectors and strings, of course; this is also a source of
some confusion.  Vectors and strings are immutable types which are
cheap to modify but expensive to share/copy, exactly the opposite of
most persistent/immutable types.
Terminology: Value vs reference types

In this document, I am somewhat abusing standard terminology.  A
reference type is a boxed or unique type.  What I am calling value
types come in two categories.  Simple value types are scalars.  All
other types are aggregate value types (records, arrays, tuples, tags,
and so on).
Originally I thought that tuples should be considered as simple value
types as well, but this does not make sense.  Consider an aggregate
value type T: then a tuple (T,T) must also be an aggregate value
type.
I think however that this terminology is altogether broken as "value"
and "reference" are really modes and not types, per se.  I intend to
update the document to be more correct but hopefully for now you, dear
reader, will know what I mean.
Fixed-length arrays and dynamically sized types

Rather than vectors, the native type is fixed-length arrays
[1, 2, 3].  A fixed length array is represented at runtime by
something like the C++ struct:
struct array<T> {
    size_t alloc;
    T[] data;
}

Therefore it is not, itself, a C array, but it's very close.  A simple
helper could convert it to one, much as we do today.
A by-value array type [T] is unlike other value types in that it
does not have a static size.  For this reason, a bare array type [T]
is only legal in two situations:

The type of a by-ref parameter may be an by-value array type [T].
The return type of a new function may be an by-value array type [T].

In all other cases, the type must be a reference (i.e., @[T] or ~[T]).
Array creation

Arrays can be created either via literals [v1, v2, v3] or via the
use of creation functions.  Literals result in an array of the given
size which is stored on the stack.  There are two creation functions,
each of which creates an array of dynamic length:
fn create_val<copy T>(size: uint, v: T) -> new [T];
fn create<copy T>(size: uint, b: block(uint) -> new T) -> new [T];

The function create_val() creates an array of a given size where
each value of the array is initially equal to v.  The second creates
an array where the initial value of the array at index i is given by
b(i).  The new [T] return type is explained below.
Mutability

As with vectors now, arrays can either be immutable (the default),
mutable, or read-only (const).  An immutable array cannot be
modified by anyone.  A read-only array is a reference to an array thay
may be mutated by someone else.
Functions returning new arrays may be used to create either mutable
or immutable arrays at the callers' choice.  For example:
let f: @[int] = @arr::create_val(256, 0);
let g: @[mutable int] = @arr::create_val(256, 0);
let h: ~[int] = ~arr::create_val(256, 0);
let i: ~[mutable int] = ~arr::create_val(256, 0);

Here the same function (arr::create_val()) is used to create arrays of
many different types.  We could also allow unique arrays to become
mutable and immutable if we wished.
Constructor and allocation expressions

A constructor expression is one which inherently constructs a new
value that did not exist before.  The set of constructor expressions
is defined as follows:
CE = { ... }                 (Record literals)
   | ( ... )                 (Tuples)
   | 0, 0u, 'a', 1.2         (Numeric/character literals)
   | "..."                   (Strings)
   | x(...)                  (Function call with appropriate return type)
   | copy E                  (Copy of some other expression E)
   | move x                  (Move of some local variable x)

Any assignment to an lvalue that is either of aggregate value type or
a type variable must be done from a constructor expression.  This
includes returning from a function whose return value is of aggregate
value type.  Type variables are restricted in this way because they
may refer to values of aggregate value type or to values of unique
type; in both cases explicit copies and moves are required.
An allocation expression is one which allocates a new value onto
some heap (@ or ~).  They are defined as follows:
AE = @ AV
   | @ mutable AV            (Type of AV must be copiable)
   | ~ AV
AV = CE
   | [ ... ]                 (Array literal)
   | x(...)                  (Function call with new return type)

AE indicates an allocation expression and AV an allocation value.
Each allocation expression consists of an allocation value preceded by
some prefix.  The @ and ~ prefixes allocate enough space in the
appropriate heap for the allocation value.  Allocation values are
either constructor expressions, in which case the value is constructed
in the newly allocated space, array literals, which are allocated into
the heap, or calls to a function with new return type (see next section).
It is illegal to use @ mutable when the AV has a non-copyable
type.  This includes array literals and type variables that are not
designed as copyable.
New functions

The return type of a function can be tagged with the keyword new.
This indicates that the function will be allocating a (potentially
dynamically sized) value and returning it, but it does not specify the
heap in which that value will be allocated.  That decision is left to
the caller. new functions are not compatible non-new functions.
That is, the types fn(S) -> T and fn(S) -> new T are distinct.
Within a new function, the return keyword must be followed by an
allocation value AV, as defined previously. It is a static error to
invoke a new function outside of an allocator expression or the
ret expression of another new function.
Some examples:
fn make_arr(a: int, b: int, c: int) -> new [int] {
    ret [a, b, c];
}

type T = { a: int, b: int, c: int };
fn make_rec(a: int, b: int, c: int) -> new T {
    ret { a: a, b: b, c: c };
}

A new value can also be obtained by returning the value of another
function that has a new return type, such as arr::create() and
arr::create_val():
fn map<S,T>(a: [S], f: block(S) -> T) -> new [T] {
    ret arr::create(arr::len(a), { |i| block(a[i]) });
}

Implementation

Today, all Rust functions are compiled with an implicit first
parameter which is a pointer to the location to store their return
value.  new functions would contain an additional implicit
parameter: the heap in which to allocate space for their result.
For example, the arr::create() with signature
fn create<T>(size: uint, v: T) -> new [T]

would compile to a C function roughly like:
void create(
    rust_arr **implicit_r, rust_heap *implicit_h, tydesc *T,
    size_t size, void *v)
{
    size_t bytes = size * T->size;
    rust_arr *result = implicit_h->malloc(bytes);
    result->size = size;
    for (size_t i = 0; i < size; i++) {
        memcpy(result->data + i * T->size, v, T->size);
    }
    *implicit_r = result;
}

Possible optimization for word-sized returns

For efficiency, we may choose to also distinguish between functions
that always return word-sized values, as C compilers do. Such
functions can use the native ABI to return their result (e.g., use the
EAX register on i386).  However, to avoid a proliferation of
user-visible ABIs, we would hide this distinction from users. This
means that such functions would also have a generic variant that takes
a return value parameter.

Example:
The following Rust function foo():
fn foo() -> int { ret 22; }

would compile to the following C functions:
int foo_direct() { return 22; }
int foo_by_val(int *r) { *r = 22; }

Direct calls to foo() would use foo_direct().  Loading foo()
as a value would use foo_by_val(). Note that all new functions
always return a word-sized pointer.

Helper functions

To ease the impedance mismatch between new functions and normal
functions, it might be useful to create a few helper functions:
fn shared<S,T>(f: fn(S) -> new T) -> (fn(S) -> @T) {
    ret lambda(s: S) { @f(s) }
}

fn unique<S,T>(f: fn(S) -> new T) -> (fn(S) -> ~T) {
    ret lambda(s: S) { ~f(s) }
}

References

When invoking a function with by-ref parameters, users may pass a specific
value of a fixed-length array:
type T = { ... };
fn foo(&x: T) { ... }
fn bar(ts: [T]) { foo(ts[0]); }

Because the array cannot be grown or moved about, this is perfectly
safe.
Copies and moves

As a sort of escape hatch, the copy and move keywords can be used
to convert any expression into a constructor expression.  The keyword
copy(expr) evaluates the expression expr and then performs a deep
copy according to the following rules:

Copying a scalar like int just returns the scalar.
Copying an aggregate value type T recursively copies the contents of T.
Copying an @T type increments the ref count and returns the same
pointer.
Copying a unique type ~T creates a box on the exchange heap and
copies the contents of T into that box.
Copying a reference &T simply copies the pointer.
Copying a resource is illegal.
Copoying an array type [T] is illegal.

The move x expression is similar in some ways to a copy followed by
a nullification of the local variable x.  The local variable must
not be a reference mode parameter or from an enclosing scope. The
primary difference is for unique types and resources:

Moving a unique type simply copies the pointer and performs no
recursive copies.
Similarly, moving a resource is legal.

Interaction with generics

The user can never be sure that generic types are not bound to
aggregate value types or unique types.  Therefore, uses of variables
of generic type generally require explicit copy or move
annotations; however, the last use optimization described by Marijn
means that a variable which is only used once is implicitly moved.

Examples:
fn identity<T>(t: T) -> T { ret t; } // ok because implicitly ret move t;

fn apply<S,T>(s: S, blk: block(S) -> new T) -> new T { 
    ret blk(s); // implicitly: ret blk(move s); 
}

fn apply_twice<copy S,T>(s: S, blk: block(S) -> new T) -> new (T,T) {
   ret (blk(copy(s)), blk(s)); // Note: first use of `s` requires a copy
}


Subtle points concerning soundness

An important constraint is that there cannot be lvalues of dynamically
sized type (e.g., arrays) and also that arrays should never be
copyable.  This is because there is no good way to make them
assignable and we do not want to introduce an assignable kind.  This
constraint is enforced by, first, the limitation that array types are
not copyable and, second, limitations on where the expressions that
yield an array value type (array literals and new function return
types) may be used.
It is possible to write a generic function like this:
fn foo<T>(f: block(int) -> new T)

where T can be mapped to [S].  One might think that foo() could
then declare a local variable of type T, which is true, but foo()
would be unable to provide a value for that local variable.  Invoking
f(), for example, is only possible as part of an allocator
expression, which would have to specify @ or ~.
One other potential entrance point for lvalues or copies of array type
is through by-ref parameters. However, this is not possible because
(a) copies between generic types require that the copy be explicit and
the type variable be copyable and (b) [T] is not copyable.
The following is a "proof by lots and lots of examples", in which I
enumerate various possibilities and show how each leads to a compile
error (caveat: obviously not a real proof).
fn foo1<T>(&&x: [int]) {
    let y = x;         // Illegal: copy required
    let z = copy x;    // Illegal: [int] not copyable
}

fn foo2<T>(&&x: T) {
    let y = x;        // Illegal: copy keyword required
}

fn foo3<copy T>(&&x: T) {
    let y = copy x;  // Legal.
}
fn bar3(x: @[int]) {
    foo3(*x);        // Illegal: [int] is not copyable.
}

fn foo4<T>(&x: T, &y: T) {
    x = y;           // Illegal: copy required.
}

fn foo5<copy T>(&x: T, &y: T) {
    x = copy y;      // Legal.
}
fn bar5(x: @[int], y: @[int]) {
    foo5(*x, *y);    // Illegal: [int] is not copyable.
}

Please let me know if you see any potential errors in my reasoning.
Regions

Regions would make it possible to safely allocate values onto the
stack, including arrays.  The main change, syntactically, is that the
AE expressions become:
AE = @ AV | @ mutable AV | ~ AV | & AV 

The & form indicates that the newly allocated value ought to be
placed on the stack and a pointer into the stack is returned.  Regions
ensure that the stack does not grow endlessly within a single function
call by using a different region for stack allocations that occur
within a loop body.
Regions also enable taking the address of members of an array or of
fields in records and so forth using the familiar &lvalue form from
C.
Things I do not like

Array types should be copyable.  The reason they are not is that it is
the easiest way to make arrays play nicely with references.
Rejected but tempting ideas

This section contains some possible extensions.  I have rejected them
for the moment but perhaps they could be made to work.  In each case I
listed why I rejected it.
Arrays of varying size but fixed capacity

Currently all arrays must instantly become fully initialized.  I am
also partial to the idea that an array begins with a fixed capacity
but a size of 0.  The array could then be appended to until the fixed
capacity is reached.  Arrays could not be made smaller.
The array creation routines would then all be based on a single primitive
creation routine:
fn create_cap<T>(cap: uint) -> new [T];

Arrays would be represented at runtime exactly as vectors are
represented today:
struct rust_array<T> {
    size_t fill, alloc;
    T[] data;
}

This would make it possible to implement dynamic vectors in pure Rust
without a loss of efficiency, except that there is no way to handle
the case where items are removed from the end of a dynamic vector
without recopying the entire array.  This limitation is why I rejected
the idea, figuring it may not be worth the mental complexity.
Dynamically sized records

I want to allow fixed-size arrays as the last member of a record.
The record would then become a dynamically sized value type.  However,
I was unable to come up with a set of restrictions that plays nicely
with generics without introducing type kinds.  The danger is that users
can create local variables of type [T] through the following setup:
type Foo<T> = { x: T };
fn foo<T>(f: Foo<T>) {
    let y = f.x;
}
fn bar() {
    let f: Foo<[int]> = { x: [1, 2, 3] };
    foo(f);
}

One simple option is just to prohibit type variables from being bound
to a type of variable size, but that makes a function like the helper
shared() defined earlier not possible, and seems a shame.