@lrhn
Last active October 27, 2022 14:45
Dart Struct-on-records specification strawman

Dart Structs

A proposal to replace views and to introduce a zero-overhead “value-type”.

Syntax

The proposed syntax is

<structDeclaration> ::=
  'struct' <typeName> <typeArgs>? <parameterList> ('is' <type> (',' <type>)*)? <structBody>
<structBody> ::=
    ';'
  | '{' <memberDeclaration>* '}'

For example:

struct StructName<S, T>(int x, int y, {Color color = Color.black}) 
    is OtherStruct<S>, YetOtherStruct<T>;

The is clause and the type arguments are optional, and the parameter list can be empty.

Members can be any static member, any non-instance-variable instance member, and any factory or redirecting named constructor.

The “primary constructor” syntax used as part of the struct declaration is mandatory. It defines the structure used to store the arguments.

Static semantics

A declaration of

struct Name<TypeParams>(parameterList) is StructName1<TypeArgs1>, StructName2<TypeArgs2> {
   members
}

introduces a new type, Name.

It’s a compile-time error if StructName1 or StructName2 does not denote a struct type (directly or through expanding a type alias).

The representation type of Name is a type which depends on the parameter list.

If the parameter list contains a single positional parameter, the representation type of Name is the declared type of that parameter.

Otherwise the representation type is a record type with the same structure and types as the parameter list (positional parameters to positional record fields in the same order and with the same types, named parameters to named record fields with the same name and type, and ignoring whether the parameters are optional or not.)
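The two cases above can be illustrated with Dart 3 record types. This is only a sketch: the struct declarations in the comments use the proposed syntax and do not compile today, and the names Meters, Point, MetersRep and PointRep are hypothetical, chosen for illustration.

```dart
// Hypothetical source: struct Meters(double value);
// A single positional parameter, so the representation type is the
// parameter's declared type, with no record wrapping.
typedef MetersRep = double;

// Hypothetical source: struct Point(int x, int y, {String label = ''});
// Otherwise the representation type is a record type mirroring the
// parameter list; whether a parameter is optional is ignored.
typedef PointRep = (int, int, {String label});

// Reads the two positional fields of the representation record.
int sumOfPositionals(PointRep p) => p.$1 + p.$2;

void main() {
  final MetersRep m = 3.5;
  final PointRep p = (1, 2, label: 'origin');
  print(m);
  print(sumOfPositionals(p));
  print(p.label);
}
```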

The type Name is a subtype of the types of StructName1<TypeArgs1> and StructName2<TypeArgs2>. If Name is generic, then Name<TypeArgs> is an immediate subtype of StructName1<TypeArgs1> and StructName2<TypeArgs2> with type arguments inserted appropriately.

A struct type’s representation type must be a subtype of its super-struct-types’ representation types. It’s a compile-time error if the representation type of Name is not a subtype of the representation type of StructName1<TypeArgs1> and StructName2<TypeArgs2>. (This uses proper subtyping, and does not assume type parameters to be covariant. It’s possible for type parameters to be contravariant or invariant, e.g., if they occur in function types in the resulting representation type. If we get variance annotations, we should require the type parameters to be annotated correctly.)

If the is clause is omitted, the only immediate supertype of Name is Object.

The struct declaration can serve as a container for static members and constructors, which can be accessed as Name.staticMemberName, as usual.

The usual rules against naming conflicts apply, including for the implicit getters of the primary constructor. A struct must also not declare any member with the same name as a member of Object. (Of these, == and hashCode of the representation type work well, but lacking a way to override toString might be an issue.) Finally, a struct must not declare any instance variables.

The primary constructor syntax introduces an unnamed const constructor with the same parameter list, and it introduces getters for each parameter, with the same name and type. It’s a compile-time error if members includes an unnamed constructor or a non-redirecting generative constructor.

Redirecting generative constructors and factory constructors work as usual.

We resolve an instance member name against a struct type, to either a struct instance member declaration, a conflict, or to nothing, as follows:

  • If the struct type directly declares a member with that name, then that member is the result.

  • If the struct implicitly declares a getter with that name, for a primary constructor parameter, then that implicit declaration is the result.

  • Otherwise, recursively do a parallel resolution of the member name against every immediate struct-super-type (the names of the is clause), if any.

    • A parallel resolution of a member name against a collection of struct types is performed as follows:

      • Resolve the member against each struct type.

      • If each of those resolutions results in nothing (including the case where the collection of struct types is empty), the result is nothing.

      • If any struct-types resolve the member to a conflict (if we allow that), then the result is a conflict.

      • If there is no conflict, but two struct-types resolve the member to different declarations, then the result is also a conflict.

      • If there is no conflict, and precisely one unique member declaration in the results (even if the same member is the result for multiple struct-types), then that member declaration is the result.

      • Otherwise, every struct-type resolves the member to nothing, and the result is nothing.

  • The result of the struct member resolution is then the result of that parallel struct member resolution.
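The resolution rules above can be sketched as a small Dart 3 model. This is an illustration under assumptions, not part of the proposal: StructDecl, Resolution, Found, Conflict, Nothing, resolve and resolveParallel are all hypothetical names, and a struct's explicit members and implicit getters are collapsed into one set of names.

```dart
/// Outcome of resolving a member name against a struct type.
sealed class Resolution {
  const Resolution();
}

class Nothing extends Resolution {
  const Nothing();
}

class Conflict extends Resolution {
  const Conflict();
}

class Found extends Resolution {
  /// The struct whose declaration of the member was selected.
  final StructDecl owner;
  const Found(this.owner);
}

/// Simplified model of a struct declaration.
class StructDecl {
  final String name;
  final Set<String> members; // explicit members plus implicit getters
  final List<StructDecl> supers; // the struct types of the `is` clause
  StructDecl(this.name, this.members, [this.supers = const []]);
}

/// A member declared (explicitly or implicitly) by the struct itself wins;
/// otherwise resolve in parallel against the immediate super-structs.
Resolution resolve(StructDecl struct, String member) {
  if (struct.members.contains(member)) return Found(struct);
  return resolveParallel(struct.supers, member);
}

Resolution resolveParallel(List<StructDecl> structs, String member) {
  final owners = <StructDecl>{}; // distinct declarations, compared by identity
  for (final s in structs) {
    switch (resolve(s, member)) {
      case Conflict():
        return const Conflict(); // a conflict anywhere propagates
      case Found(:final owner):
        owners.add(owner);
      case Nothing():
        break;
    }
  }
  if (owners.isEmpty) return const Nothing();
  if (owners.length > 1) return const Conflict();
  return Found(owners.single);
}

void main() {
  final a = StructDecl('A', {'foo'});
  final b = StructDecl('B', <String>{}, [a]);
  final c = StructDecl('C', <String>{}, [a]);
  final d = StructDecl('D', <String>{}, [b, c]); // both paths reach A.foo
  final e = StructDecl('E', {'foo'});
  final f = StructDecl('F', <String>{}, [a, e]); // two distinct foo members
  print(resolve(d, 'foo').runtimeType);
  print(resolve(f, 'foo').runtimeType);
  print(resolve(d, 'bar').runtimeType);
}
```

In this model the diamond D resolves foo to the single declaration in A, while F sees two different declarations and gets a conflict, which F could shadow by declaring foo itself.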

For a member invocation, e.m, if the static type of e is a struct type S, let n be the name of the member being invoked. Perform member resolution of n on S.

  • If the result is nothing, then the static type does not provide an implementation of n. (We then go on to check if an extension declaration applies, and if not, a compile-time error occurs).
  • If the result is a conflict, a compile-time error occurs.
  • Otherwise the result is a particular struct instance member declaration. Type inference continues using the signature of that member as the signature being invoked.

Inside the body of a struct instance member declaration, this has the current struct’s type as static type.

Inside the body of a struct instance member declaration, a super-invocation, super.m, performs parallel member resolution on the immediate super-struct-types of the current struct. It’s a compile-time error if the result is nothing or a conflict. Otherwise it is treated as invoking the resulting struct member.

(We can choose to make it a compile-time error if any member name would resolve to a conflict against a struct type. That would mean that the struct had to introduce a member itself to shadow any conflicts between its super-struct-types. We can also choose to not do anything, and only make it a problem if someone tries to use a conflicted member.)

Runtime semantics

The struct type is erased at runtime, replaced by its representation type.

We can say that at runtime, we treat struct Name(args) as if it were typedef Name = RepresentationType; instead of being a new nominal type.

(Alternative: At runtime, the struct type and its representation type are mutual subtypes. This retains the name, for debugging, but treats the types as indistinguishable in runtime type checks.)

Invoking the unnamed (primary) constructor of a struct type behaves as follows:

  • It creates a record with the structure of the representation type.
  • Each field of that record is filled with the argument value passed to the primary constructor (using the default value if an argument is omitted).
  • The constructor invocation then evaluates to that record.

Any runtime subtype check (is, as, try/on, dynamic invocations) uses the representation type instead of the struct type.

Invoking the implicit getter of a primary constructor parameter extracts the corresponding field from the representation value, or returns the representation value directly if there was only one positional parameter.

Invoking a member on an expression whose static type is a struct type proceeds as follows. The receiver is evaluated to a value v of the representation type (the representation object). Unless the member is a getter, the argument list is evaluated to a list of arguments. Then the member declaration that the member name resolved to (there must be one, otherwise the invocation would have been a compile-time error) is invoked with this bound to v and its parameters bound to the list of arguments.

Similarly, invoking super.m inside a struct instance member invokes the struct member it resolves to with the argument list.
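As a rough picture of the erased runtime behavior, a struct can be approximated today with a record typedef, a plain function standing in for the primary constructor, and an extension standing in for the implicit getters and instance members. This is a sketch under assumptions, not the proposed implementation: Point, makePoint and PointMembers are hypothetical names, and an extension only mimics the statically resolved dispatch described above.

```dart
// Hypothetical source:
//   struct Point(int x, int y) { int get sum => x + y; }

// The struct type is erased to its representation type.
typedef Point = (int, int);

// Stand-in for the primary constructor: build the record and return it.
Point makePoint(int x, int y) => (x, y);

// Stand-in for the implicit getters and the declared instance member.
// Members are selected statically, with `this` bound to the record.
extension PointMembers on Point {
  int get x => this.$1;
  int get y => this.$2;
  int get sum => x + y;
}

// Runtime type checks see only the representation type.
bool isPoint(Object o) => o is Point;

void main() {
  final p = makePoint(3, 4);
  print(p.x);
  print(p.sum);
  print(isPoint((10, 20)));
}
```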

Consequences

This design can be used for “views” as well as “value types”.

The subtyping requirements ensure that a List<SubStruct> is a List<SuperStruct> at runtime too, when the struct type is erased to the representation type.
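The erased subtype relationship can be checked with plain records in Dart 3. In this sketch, SuperRep and SubRep are hypothetical stand-ins for the representation types of a SuperStruct(num a, num b) and a SubStruct(int a, int b) is SuperStruct, written in the proposed syntax that does not compile today.

```dart
// Hypothetical source: struct SuperStruct(num a, num b);
typedef SuperRep = (num, num);

// Hypothetical source: struct SubStruct(int a, int b) is SuperStruct;
typedef SubRep = (int, int);

// Runtime check against the erased super-struct list type.
bool isSuperList(Object o) => o is List<SuperRep>;

void main() {
  final List<SubRep> subs = [(1, 2), (3, 4)];
  // Record types are covariant in their fields and Dart generics are
  // covariant, so the assignment and the runtime check both succeed.
  final List<SuperRep> supers = subs;
  print(isSuperList(subs));
  print(supers.length);
}
```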

The “one positional parameter avoids record-wrapping” special case is convenient for view-behavior, but does feel a little special. It also means that:

struct Foo(int x, int y) {}
struct Bar((int, int) p) is Foo {}

will be valid, because they have the same representation type. That’s actually likely to be convenient in some cases.

We get == and hashCode for free, because we use those of the record.
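A minimal sketch of what “for free” means here, using a Dart 3 record as the assumed representation of a hypothetical struct Point(int x, int y, {String label}). PointRep is an illustrative name only.

```dart
// Hypothetical source: struct Point(int x, int y, {String label = ''});
typedef PointRep = (int, int, {String label});

void main() {
  final PointRep a = (1, 2, label: 'p');
  final PointRep b = (1, 2, label: 'p');
  final PointRep c = (1, 3, label: 'p');

  // Records compare structurally, field by field.
  print(a == b);
  print(a == c);

  // Equal records have equal hash codes, so they collapse in hashed sets.
  print(a.hashCode == b.hashCode);
  print({a, b, c}.length);
}
```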

We also get the record toString, which we might not be as happy about. It’s still a valid representation of the state of the struct.

We could allow is NonStructType as well, where the non-struct type must be a supertype of the representation type (which is likely to be useless for records, so mainly for single-value structs). Then the struct type is considered a subtype of that non-struct type as well, assignable to it, and exposing the interface members of the non-struct type. If member resolution of a member invocation ends up with a non-struct-member signature, then we perform that invocation directly on the representation value.
