GrabYourPitchforks/csharp_singlecopy_struct.md

## csharp_singlecopy_struct.md

      
### Problem statement and core scenario

We want to introduce the idea of a value type where the underlying data is only ever "live" in at most one place. The canonical example is the internal ValueStringBuilder struct type, which performs internal ArrayPool management.

```csharp
ValueStringBuilder builder = new ValueStringBuilder(); // VSB is a struct type
builder.Append("foo");
builder.Append(obj);
HelperMethod(ref builder); // builder passed by ref to helper methods
return builder.ToString(); // ToString releases underlying rented arrays back to pool
```

This would also allow us to expose an allocation-free variant of Utf8JsonWriter. In .NET Core 3.x, we had originally exposed this type as a struct, but before RTM we backtracked and turned it into a proper class because consumers found it too easy to pass these structs by value and corrupt the internal state of the writer.
More generally, the problem we're trying to solve is that if a value type is responsible for manual resource management, we don't want the value type's internal state to be duplicated in such a manner that resource management becomes unreliable.
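
To make the hazard concrete, here is a minimal sketch in today's C#. `TinyBuilder` and `CopyHazardDemo` are hypothetical names invented for this illustration; `TinyBuilder` is a heavily simplified stand-in and not the real ValueStringBuilder (it has no growth logic, for example).

```csharp
using System.Buffers;

// Hypothetical, simplified stand-in for a pooled builder struct.
public struct TinyBuilder
{
    private char[] _rented;
    private int _length;

    public TinyBuilder(int capacity)
    {
        _rented = ArrayPool<char>.Shared.Rent(capacity);
        _length = 0;
    }

    // No growth logic: the sketch assumes the capacity is never exceeded.
    public void Append(char c) => _rented[_length++] = c;

    public override string ToString()
    {
        string result = new string(_rented, 0, _length);
        char[] toReturn = _rented;
        this = default; // clears only this copy of the struct
        if (toReturn != null)
        {
            ArrayPool<char>.Shared.Return(toReturn);
        }
        return result;
    }
}

public static class CopyHazardDemo
{
    // By-value parameter: the callee works on its own copy of the struct.
    private static void AppendByValue(TinyBuilder builder) => builder.Append('b');

    private static void AppendByRef(ref TinyBuilder builder) => builder.Append('c');

    public static string Run()
    {
        var builder = new TinyBuilder(16);
        builder.Append('a');
        AppendByValue(builder);   // writes 'b' into the shared array, but only the copy's length advances
        AppendByRef(ref builder); // the original still has length 1, so 'c' overwrites 'b'
        return builder.ToString();
    }
}
```

`CopyHazardDemo.Run()` returns "ac": the by-value append is silently lost. Had the by-value callee called ToString instead, it would have returned the rented array to the pool while the caller's copy still pointed at it.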
### Proposal

The proposal here is inspired by C++'s concept of move-only types like std::unique_ptr<T>. There may be other ways to solve this problem more generally which I have not considered.

This isn't trying to solve the issue of single ownership for all possible types; e.g., "lots of people have a reference to this Stream instance, but who is ultimately responsible for calling Stream.Dispose?"

We can introduce the concept of a "single-copy" type in C#. A type is annotated as single-copy by applying the following attribute to it.

```csharp
namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Struct, AllowMultiple = false, Inherited = false)]
    public sealed class SingleCopyAttribute : Attribute { }
}

// an example of a single-copy struct
[SingleCopy]
public ref struct MySingleCopyStruct
{
    public MySingleCopyStruct(/* ... */) { }
    public int SomeMethod(/* ... */) { }
    // ...
}
```

Single-copy types have the following restrictions:

- Only ref structs may be marked single-copy. (This implies single-copy types cannot be boxed.)
- If a location of a single-copy type is assigned a value copied from another location, the source of that copy must afterwards be considered unassigned.

The second bullet above is the most interesting, as it introduces several code patterns which are valid for normal structs and ref structs, but invalid for single-copy structs. Consider the following examples.
### Examples

```csharp
[SingleCopy]
public ref struct MySingleCopyStructFoo
{
    public MySingleCopyStructFoo(int foo)
    {
        DoSomethingWith(this); // ERROR: copy of 'this' is made, which means 'this' is now unassigned before ctor returns
    }

    public static void DoSomethingWith(MySingleCopyStructFoo value) { /* ... */ }
}

[SingleCopy]
public ref struct MySingleCopyStructBar
{
    public MySingleCopyStructBar(int foo)
    {
        DoSomethingWith(ref this); // OK: no copy of 'this' is made
        DoSomethingElseWith(this); // OK: copy of 'this' is made, 'this' now unassigned
        this = default;            //     but this line re-assigns before ctor exits
    }

    public static void DoSomethingWith(ref MySingleCopyStructBar value) { /* ... */ }
    public static void DoSomethingElseWith(MySingleCopyStructBar value) { /* ... */ }
}

[SingleCopy]
public ref struct MySingleCopyStructBaz
{
    private int _field;

    public void SomeInstanceMethod()
    {
        this._field = 42; // OK: no copy of 'this'
        Console.WriteLine(this._field); // OK: no copy of 'this'
        this.SomeOtherMutatingInstanceMethod(); // OK: 'this' implicitly passed by ref

        var copy = this; // ERROR: 'this' (passed by ref as arg0 to this method) now unassigned, nonsensical
        Console.WriteLine(copy._field); // OK: no copy of 'copy'
    }

    public void SomeOtherMutatingInstanceMethod()
    {
        this._field = 100; // OK: no copy of 'this'
    }

    public static void SomeStaticMethod(in MySingleCopyStructBaz value)
    {
        value.SomeOtherMutatingInstanceMethod(); // ERROR: implicit copy of 'value' due to readonly -> mutable semantics
    }
}
```

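
The last error in the examples above mirrors something the compiler already does today: calling a non-readonly instance method through an `in` parameter operates on a hidden defensive copy. The sketch below shows that existing behavior with an ordinary struct; `Counter` and `DefensiveCopyDemo` are hypothetical names used only for illustration.

```csharp
public struct Counter
{
    public int Value;

    // Not marked readonly, so the compiler must assume it mutates 'this'.
    public void Increment() => Value++;
}

public static class DefensiveCopyDemo
{
    private static void IncrementViaIn(in Counter counter)
    {
        // 'counter' is a readonly reference, so the compiler copies it
        // and calls Increment on the copy; the caller's value is untouched.
        counter.Increment();
    }

    public static (int afterDirect, int afterIn) Run()
    {
        var counter = new Counter();
        counter.Increment();              // mutates 'counter' in place
        int afterDirect = counter.Value;  // 1
        IncrementViaIn(in counter);       // mutates a hidden copy
        return (afterDirect, counter.Value);
    }
}
```

`DefensiveCopyDemo.Run()` returns `(1, 1)`. For an ordinary struct the hidden copy is merely surprising; for a single-copy struct it would duplicate the managed resource, which is why the proposal turns it into an error.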
For advanced scenarios, we could also introduce utility methods to allow power developers to bypass these restrictions as needed.

```csharp
public static class SingleCopyUtility
{
    // assumes allowing passing ref structs as generic 'T'
    public static T DangerousCopy<T>(ref T value) where T : singlecopy
    { /* ... */ }

    // roughly equivalent to C++'s std::move<T>(T&&)
    public static T Move<T>(ref T value) where T : singlecopy
    {
        T copy = DangerousCopy(ref value);
        value = default;
        return copy;
    }
}
```

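
Since the `singlecopy` constraint does not exist, the utility above cannot be compiled as written. As a rough approximation, the same move semantics can be sketched with an ordinary `struct` constraint, where a plain assignment stands in for DangerousCopy. `Handle` and `MoveSketch` are hypothetical names for this illustration.

```csharp
public struct Handle
{
    public int Id;

    // Treat the default value (Id == 0) as "moved-from / empty".
    public bool IsValid => Id != 0;
}

public static class MoveSketch
{
    // Approximation of the proposed Move<T>: an ordinary copy plays the
    // role of DangerousCopy because nothing forbids copies today.
    public static T Move<T>(ref T value) where T : struct
    {
        T copy = value;
        value = default; // the source is reset, so only one live copy remains
        return copy;
    }

    public static (bool sourceValid, int destinationId) Run()
    {
        var source = new Handle { Id = 7 };
        Handle destination = Move(ref source);
        return (source.IsValid, destination.Id);
    }
}
```

`MoveSketch.Run()` returns `(false, 7)`: the destination now owns the value and the source has been reset to default. Unlike the proposal, nothing here stops a caller from writing `destination = source` directly.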
### Assumptions


It is nonsensical (and thus forbidden) to consider the target of a ref unassigned. Therefore, the patterns marked ERROR below would be illegal in all cases. (We can implement specialty helpers like Move<T> within the runtime.)

```csharp
void MyMethodFoo(ref MySingleCopyType a, ref MySingleCopyType b)
{
    a = b; // ERROR: setting 'a' is ok, but marking 'b' unassigned is nonsensical, hence forbidden
    b = default; // definite assignment, but doesn't prevent the line above from erroring out
}

void MyMethodBar(ref MySingleCopyType a, ref MySingleCopyType b)
{
    a = Move(ref b); // OK
}

/* following examples consider a struct with a field of a single-copy type */

ref MySingleCopyType MyOtherMethod(ref MySingleCopyOuterType value)
{
    return ref value._mySingleCopyType; // OK: ref manipulation, no copies made
}

MySingleCopyType MyOtherMethod(ref MySingleCopyOuterType value)
{
    return value._mySingleCopyType; // ERROR: marking 'value._mySingleCopyType' as unassigned is nonsensical, hence forbidden
    return Move(ref value._mySingleCopyType); // OK
}
```

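
One consequence of this rule is that exchanging two by-ref values has to go through Move and a local. The sketch below approximates that in today's C# with an ordinary `struct` constraint standing in for the hypothetical `singlecopy` constraint; `Token`, `SwapSketch`, and `Swap` are names invented for this illustration and are not part of the proposal.

```csharp
public struct Token
{
    public int Id;
}

public static class SwapSketch
{
    // Stand-in for the proposed Move<T>; a plain copy replaces DangerousCopy.
    public static T Move<T>(ref T value) where T : struct
    {
        T copy = value;
        value = default;
        return copy;
    }

    public static void Swap<T>(ref T a, ref T b) where T : struct
    {
        T temp = Move(ref a); // 'a' is reset rather than left unassigned
        a = Move(ref b);      // 'b' is reset rather than left unassigned
        b = temp;             // 'temp' is a local, so it may become unassigned
    }

    public static (int first, int second) Run()
    {
        var first = new Token { Id = 1 };
        var second = new Token { Id = 2 };
        Swap(ref first, ref second);
        return (first.Id, second.Id);
    }
}
```

`SwapSketch.Run()` returns `(2, 1)`. At no point is a ref target left in an unassigned state; only the local `temp` is consumed by a plain copy.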


Struct instance methods are modeled as TReturn Method(ref TStruct @this, ...). This allows instance methods to be called over and over sequentially without a copy being made.

```csharp
MySingleCopyStruct val = new MySingleCopyStruct();
val.InstanceMethod(); // no copy
DoSomethingWithByRef(ref val); // no copy
val.InstanceMethod(); // no copy
DoSomethingWithByVal(val); // copy, 'val' is now unassigned
val.InstanceMethod(); // ERROR: 'val' was marked unassigned per line above
```
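
The by-ref modeling of instance methods can be seen with an ordinary struct in today's C#: an instance method and a static method taking `ref` both act on the same storage. `Meter` and `MeterOps` are hypothetical names used only for this sketch.

```csharp
public struct Meter
{
    public int Ticks;

    // Instance form: 'this' is implicitly passed by reference.
    public void Tick() => Ticks++;
}

public static class MeterOps
{
    // The same operation written in the modeled form,
    // void Method(ref TStruct @this).
    public static void Tick(ref Meter @this) => @this.Ticks++;

    public static int Run()
    {
        var meter = new Meter();
        meter.Tick();    // no copy: mutates 'meter' in place
        Tick(ref meter); // no copy: same storage, passed explicitly
        meter.Tick();    // no copy
        return meter.Ticks;
    }
}
```

`MeterOps.Run()` returns 3, since all three calls update the one `meter` variable.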


Struct ctors are modeled as instance methods (void .ctor(ref TStruct @this, ...)) or as value-returning methods (TStruct .ctor(...)). This subtle distinction affects which operations are valid within the ctor. For example, if the ctor returns void and operates on an implicit ref this, then no copy of this may be made at all without going through Move<T>(ref T). If the ctor is instead struct-returning, then a single copy of this may be made as long as this is reassigned before the ctor returns.


The compiler and runtime will get support for passing ref structs as the generic T, required for Move<T>(ref T). If this doesn't come in, maybe we introduce a __move keyword or similar. (Ugh.)