Skip to content

Instantly share code, notes, and snippets.

@nikomatsakis
Created November 30, 2011 21:08
Show Gist options
  • Save nikomatsakis/1410825 to your computer and use it in GitHub Desktop.
Save nikomatsakis/1410825 to your computer and use it in GitHub Desktop.
Eliminating implicit copies

The primary goal of this proposal is to eliminate implicit copies of aggregate types, while preserving most other aspects of the language. Secondary goals include: permit references into arrays; make DPS a guaranteed optimization that is obvious from language syntax.

Summary

The following changes are made:

  • vectors become fixed-length arrays;
  • array types are non-copyable; (I am not happy with this)
  • aggregate and unique types must be explicitly copied using the copy keyword (shared pointers and scalars can be implicit);
  • lvalues of aggregate or unique type can only be assigned using a subset of expressions that guarantee the new value is either freshly created (in which case it can be directly written into the lvalue's memory slot) or explicitly copied or moved;
  • functions can designate a new T return type, meaning they return a T which is freshly allocated but where the user decides whether it is shared or unique.

Sources of implicit copies

Implicit copies today occur when one variable is assigned to another and its type is unique or of aggregate value type. Prominent unique types are vectors and strings, of course; this is also a source of some confusion. Vectors and strings are immutable types which are cheap to modify but expensive to share/copy, exactly the opposite of most persistent/immutable types.

Terminology: Value vs reference types

In this document, I am somewhat abusing standard terminology. A reference type is a boxed or unique type. What I am calling value types come in two categories. Simple value types are scalars. All other types are aggregate value types (records, arrays, tuples, tags, and so on).

Originally I thought that tuples should be considered as simple value types as well, but this does not make sense. Consider an aggregate value type T: then a tuple (T,T) must also be an aggregate value type.

I think however that this terminology is altogether broken as "value" and "reference" are really modes and not types, per se. I intend to update the document to be more correct but hopefully for now you, dear reader, will know what I mean.

Fixed-length arrays and dynamically sized types

Rather than vectors, the native type is fixed-length arrays [1, 2, 3]. A fixed length array is represented at runtime by something like the C++ struct:

struct array<T> {
    size_t alloc;
    T[] data;
}

Therefore it is not, itself, a C array, but it's very close. A simple helper could convert it to one, much as we do today.

A by-value array type [T] is unlike other value types in that it does not have a static size. For this reason, a bare array type [T] is only legal in two situations:

  • The type of a by-ref parameter may be an by-value array type [T].
  • The return type of a new function may be an by-value array type [T].

In all other cases, the type must be a reference (i.e., @[T] or ~[T]).

Array creation

Arrays can be created either via literals [v1, v2, v3] or via the use of creation functions. Literals result in an array of the given size which is stored on the stack. There are two creation functions, each of which creates an array of dynamic length:

fn create_val<copy T>(size: uint, v: T) -> new [T];
fn create<copy T>(size: uint, b: block(uint) -> new T) -> new [T];

The function create_val() creates an array of a given size where each value of the array is initially equal to v. The second creates an array where the initial value of the array at index i is given by b(i). The new [T] return type is explained below.

Mutability

As with vectors now, arrays can either be immutable (the default), mutable, or read-only (const). An immutable array cannot be modified by anyone. A read-only array is a reference to an array thay may be mutated by someone else.

Functions returning new arrays may be used to create either mutable or immutable arrays at the callers' choice. For example:

let f: @[int] = @arr::create_val(256, 0);
let g: @[mutable int] = @arr::create_val(256, 0);
let h: ~[int] = ~arr::create_val(256, 0);
let i: ~[mutable int] = ~arr::create_val(256, 0);

Here the same function (arr::create_val()) is used to create arrays of many different types. We could also allow unique arrays to become mutable and immutable if we wished.

Constructor and allocation expressions

A constructor expression is one which inherently constructs a new value that did not exist before. The set of constructor expressions is defined as follows:

CE = { ... }                 (Record literals)
   | ( ... )                 (Tuples)
   | 0, 0u, 'a', 1.2         (Numeric/character literals)
   | "..."                   (Strings)
   | x(...)                  (Function call with appropriate return type)
   | copy E                  (Copy of some other expression E)
   | move x                  (Move of some local variable x)

Any assignment to an lvalue that is either of aggregate value type or a type variable must be done from a constructor expression. This includes returning from a function whose return value is of aggregate value type. Type variables are restricted in this way because they may refer to values of aggregate value type or to values of unique type; in both cases explicit copies and moves are required.

An allocation expression is one which allocates a new value onto some heap (@ or ~). They are defined as follows:

AE = @ AV
   | @ mutable AV            (Type of AV must be copiable)
   | ~ AV
AV = CE
   | [ ... ]                 (Array literal)
   | x(...)                  (Function call with new return type)

AE indicates an allocation expression and AV an allocation value. Each allocation expression consists of an allocation value preceded by some prefix. The @ and ~ prefixes allocate enough space in the appropriate heap for the allocation value. Allocation values are either constructor expressions, in which case the value is constructed in the newly allocated space, array literals, which are allocated into the heap, or calls to a function with new return type (see next section).

It is illegal to use @ mutable when the AV has a non-copyable type. This includes array literals and type variables that are not designed as copyable.

New functions

The return type of a function can be tagged with the keyword new. This indicates that the function will be allocating a (potentially dynamically sized) value and returning it, but it does not specify the heap in which that value will be allocated. That decision is left to the caller. new functions are not compatible non-new functions. That is, the types fn(S) -> T and fn(S) -> new T are distinct. Within a new function, the return keyword must be followed by an allocation value AV, as defined previously. It is a static error to invoke a new function outside of an allocator expression or the ret expression of another new function.

Some examples:

fn make_arr(a: int, b: int, c: int) -> new [int] {
    ret [a, b, c];
}

type T = { a: int, b: int, c: int };
fn make_rec(a: int, b: int, c: int) -> new T {
    ret { a: a, b: b, c: c };
}

A new value can also be obtained by returning the value of another function that has a new return type, such as arr::create() and arr::create_val():

fn map<S,T>(a: [S], f: block(S) -> T) -> new [T] {
    ret arr::create(arr::len(a), { |i| block(a[i]) });
}

Implementation

Today, all Rust functions are compiled with an implicit first parameter which is a pointer to the location to store their return value. new functions would contain an additional implicit parameter: the heap in which to allocate space for their result.

For example, the arr::create() with signature

fn create<T>(size: uint, v: T) -> new [T]

would compile to a C function roughly like:

void create(
    rust_arr **implicit_r, rust_heap *implicit_h, tydesc *T,
    size_t size, void *v)
{
    size_t bytes = size * T->size;
    rust_arr *result = implicit_h->malloc(bytes);
    result->size = size;
    for (size_t i = 0; i < size; i++) {
        memcpy(result->data + i * T->size, v, T->size);
    }
    *implicit_r = result;
}

Possible optimization for word-sized returns

For efficiency, we may choose to also distinguish between functions that always return word-sized values, as C compilers do. Such functions can use the native ABI to return their result (e.g., use the EAX register on i386). However, to avoid a proliferation of user-visible ABIs, we would hide this distinction from users. This means that such functions would also have a generic variant that takes a return value parameter.

Example:

The following Rust function foo():

fn foo() -> int { ret 22; }

would compile to the following C functions:

int foo_direct() { return 22; }
int foo_by_val(int *r) { *r = 22; }

Direct calls to foo() would use foo_direct(). Loading foo() as a value would use foo_by_val(). Note that all new functions always return a word-sized pointer.

Helper functions

To ease the impedance mismatch between new functions and normal functions, it might be useful to create a few helper functions:

fn shared<S,T>(f: fn(S) -> new T) -> (fn(S) -> @T) {
    ret lambda(s: S) { @f(s) }
}

fn unique<S,T>(f: fn(S) -> new T) -> (fn(S) -> ~T) {
    ret lambda(s: S) { ~f(s) }
}

References

When invoking a function with by-ref parameters, users may pass a specific value of a fixed-length array:

type T = { ... };
fn foo(&x: T) { ... }
fn bar(ts: [T]) { foo(ts[0]); }

Because the array cannot be grown or moved about, this is perfectly safe.

Copies and moves

As a sort of escape hatch, the copy and move keywords can be used to convert any expression into a constructor expression. The keyword copy(expr) evaluates the expression expr and then performs a deep copy according to the following rules:

  • Copying a scalar like int just returns the scalar.
  • Copying an aggregate value type T recursively copies the contents of T.
  • Copying an @T type increments the ref count and returns the same pointer.
  • Copying a unique type ~T creates a box on the exchange heap and copies the contents of T into that box.
  • Copying a reference &T simply copies the pointer.
  • Copying a resource is illegal.
  • Copoying an array type [T] is illegal.

The move x expression is similar in some ways to a copy followed by a nullification of the local variable x. The local variable must not be a reference mode parameter or from an enclosing scope. The primary difference is for unique types and resources:

  • Moving a unique type simply copies the pointer and performs no recursive copies.
  • Similarly, moving a resource is legal.

Interaction with generics

The user can never be sure that generic types are not bound to aggregate value types or unique types. Therefore, uses of variables of generic type generally require explicit copy or move annotations; however, the last use optimization described by Marijn means that a variable which is only used once is implicitly moved.

Examples:

fn identity<T>(t: T) -> T { ret t; } // ok because implicitly ret move t;

fn apply<S,T>(s: S, blk: block(S) -> new T) -> new T { 
    ret blk(s); // implicitly: ret blk(move s); 
}

fn apply_twice<copy S,T>(s: S, blk: block(S) -> new T) -> new (T,T) {
   ret (blk(copy(s)), blk(s)); // Note: first use of `s` requires a copy
}

Subtle points concerning soundness

An important constraint is that there cannot be lvalues of dynamically sized type (e.g., arrays) and also that arrays should never be copyable. This is because there is no good way to make them assignable and we do not want to introduce an assignable kind. This constraint is enforced by, first, the limitation that array types are not copyable and, second, limitations on where the expressions that yield an array value type (array literals and new function return types) may be used.

It is possible to write a generic function like this:

fn foo<T>(f: block(int) -> new T)

where T can be mapped to [S]. One might think that foo() could then declare a local variable of type T, which is true, but foo() would be unable to provide a value for that local variable. Invoking f(), for example, is only possible as part of an allocator expression, which would have to specify @ or ~.

One other potential entrance point for lvalues or copies of array type is through by-ref parameters. However, this is not possible because (a) copies between generic types require that the copy be explicit and the type variable be copyable and (b) [T] is not copyable.

The following is a "proof by lots and lots of examples", in which I enumerate various possibilities and show how each leads to a compile error (caveat: obviously not a real proof).

fn foo1<T>(&&x: [int]) {
    let y = x;         // Illegal: copy required
    let z = copy x;    // Illegal: [int] not copyable
}

fn foo2<T>(&&x: T) {
    let y = x;        // Illegal: copy keyword required
}

fn foo3<copy T>(&&x: T) {
    let y = copy x;  // Legal.
}
fn bar3(x: @[int]) {
    foo3(*x);        // Illegal: [int] is not copyable.
}

fn foo4<T>(&x: T, &y: T) {
    x = y;           // Illegal: copy required.
}

fn foo5<copy T>(&x: T, &y: T) {
    x = copy y;      // Legal.
}
fn bar5(x: @[int], y: @[int]) {
    foo5(*x, *y);    // Illegal: [int] is not copyable.
}

Please let me know if you see any potential errors in my reasoning.

Regions

Regions would make it possible to safely allocate values onto the stack, including arrays. The main change, syntactically, is that the AE expressions become:

AE = @ AV | @ mutable AV | ~ AV | & AV 

The & form indicates that the newly allocated value ought to be placed on the stack and a pointer into the stack is returned. Regions ensure that the stack does not grow endlessly within a single function call by using a different region for stack allocations that occur within a loop body.

Regions also enable taking the address of members of an array or of fields in records and so forth using the familiar &lvalue form from C.

Things I do not like

Array types should be copyable. The reason they are not is that it is the easiest way to make arrays play nicely with references.

Rejected but tempting ideas

This section contains some possible extensions. I have rejected them for the moment but perhaps they could be made to work. In each case I listed why I rejected it.

Arrays of varying size but fixed capacity

Currently all arrays must instantly become fully initialized. I am also partial to the idea that an array begins with a fixed capacity but a size of 0. The array could then be appended to until the fixed capacity is reached. Arrays could not be made smaller.

The array creation routines would then all be based on a single primitive creation routine:

fn create_cap<T>(cap: uint) -> new [T];

Arrays would be represented at runtime exactly as vectors are represented today:

struct rust_array<T> {
    size_t fill, alloc;
    T[] data;
}

This would make it possible to implement dynamic vectors in pure Rust without a loss of efficiency, except that there is no way to handle the case where items are removed from the end of a dynamic vector without recopying the entire array. This limitation is why I rejected the idea, figuring it may not be worth the mental complexity.

Dynamically sized records

I want to allow fixed-size arrays as the last member of a record. The record would then become a dynamically sized value type. However, I was unable to come up with a set of restrictions that plays nicely with generics without introducing type kinds. The danger is that users can create local variables of type [T] through the following setup:

type Foo<T> = { x: T };
fn foo<T>(f: Foo<T>) {
    let y = f.x;
}
fn bar() {
    let f: Foo<[int]> = { x: [1, 2, 3] };
    foo(f);
}

One simple option is just to prohibit type variables from being bound to a type of variable size, but that makes a function like the helper shared() defined earlier not possible, and seems a shame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment