alanfo/go_generics_simplified.md

## go_generics_simplified.md

      
    Raw
  

              go_generics_simplified.md
            
          
    A simplified generics constraint system.

Introduction

Although the constraints system (contracts) in the draft generics paper is very powerful and can deal with virtually anything (albeit sometimes in a circuitous fashion), many commentators feel that it is not a good fit for a simple language such as Go and that it may prove difficult for the community at large to write, read and use contracts in practice. Although a tool could be provided to help deal with writing contracts, this still won't help with reading and understanding them.
I posted an earlier proposal for dealing with the perceived shortcomings of contracts here. For convenience some of what I said in my earlier paper is repeated here.
However, I have come to the conclusion that even this proposal is too complicated and what we really need is something simple which everybody can understand and use but which can still cope with nearly all the cases where a constraint is needed. This, I feel, would be more in tune with the ethos of the language.
The current revision takes into account the comments and criticisms made on the original draft either here or in this thread on golang-nuts. My thanks to all who have commented as I believe this has led to a much improved proposal.
Suggested solution

As Go doesn't have operator overloading (and I've seen no indication that it may be added), I don't see why it is necessary to individually specify each operator that is used in the contract. Operators can for the most part only be applied to the built in types (and types based on them) and particular sets of operators only apply to certain subsets of those types. Similar remarks apply to conversions.
I believe generic contracts would be much simpler to write and understand if there were a way to assert that a type parameter belonged to a certain group of types, failing which the code would not compile. The use of these type groups would mean that both the programmer and the compiler would know immediately what operators a type parameter supported, what conversions were possible and so forth.
My proposal is therefore that the language should have ten built-in contracts as follows:


Contract
Constituents


Comparable
any type which supports ==  and !=


Integer
any integer type


Float
any floating point type


Real
any integer or floating point type


Complex
any complex type


Boolean
any boolean type


String
any string type


Ordered
any integer, floating point or string type


Bytes
any string or byte slice type


Runes
any string or rune slice type


The above contracts include not just the relevant built-in types themselves but defined types which have the same underlying type.
As they could only be used in the same context as a user-defined contract, then they need only be reserved within that context - i.e. they could be used as normal identifers elsewhere without causing any confusion and would therefore be backwards compatible with Go 1.
If only a single type parameter required a constraint and it could be constrained by one of the built-in contracts, then this could be used without further ado.
In more complicated cases, you'd need to define your own contract rather than use the built-in ones. However, contracts would be limited to containing one or more of the following constructs:


An assertion that a type parameter (or its corresponding pointer type) implemented a particular interface.


An assertion that a type parameter or parameters also satisfied another contract (including a built-in one).


An assertion that a type parameter excluded certain built-in numeric types (including types based on them) which the contract would otherwise permit.


An assertion that a type parameter was a struct which contained certain fields.


A list of additional method signatures which the type parameter or parameters (or their corresponding pointer types) included in their method sets.


It would be a compile time error for any type parameter to be subject to more than one built-in contract (subject to the exceptions below), more than one omit assertion or for a built-in numeric type to be excluded if it had not been specifically included in the first place.


If a type parameter is subject to the Integer, Float, Real or Ordered contracts, then it may also be subject to the Complex contract. In such a case, the types supported by the former will be augmented by the two complex types.


If a type parameter is subject to the Integer contract, then it may also be subject to the String contract. In such a case, the types supported by the former will be augmented by the string type.


Notice that the Comparable contract is implicit to all the other built-in contracts apart from Bytes and Runes (slices, of course, are not comparable) and it would therefore be superfluous for a type parameter to be subject to both.
Notice also that any assertion which implies a type parameter represents a built-in type or has a particular method would disallow pointer or interface types being  substituted for that type parameter for the simple reason that such types can neither be built-in or have methods. However, they could of course still be represented indirectly using the above constructs. If one only wanted to use interface types, then there would be no need to use generics at all!
Purely for illustration purposes, here's an example of a contract which contains instances of all of these constructs and shows the proposed syntax for expressing them:
contract other(T) {
    T.StringMethod() string    // T has a method called StringMethod() which returns a string
}

contract examples(T, U) {
    Comparable(T)              // T satisfies the built-in Comparable contract
    Integer(U)                 // U satisfies the built-in Integer contract 
    omit{int8, uint8}(U)       // U excludes the int8 and uint8 types 
    fmt.Stringer(T)            // T implements the fmt.Stringer interface
    other(T)                   // T satisfies the *other* contract
    struct{Count int}(T)       // T is a struct with a Count field of type int
    T.ValueMethod(U) T         // T has a method, ValueMethod, which takes a U and returns a T
    (*T).PointerMethod() *T    // *T has a method, PointerMethod, which returns a *T
}
It would now be easy for contracts to express all types of methods and their return types (if any) including methods with variadic parameters.
Note that:


It is no longer necessary to declare variables for each type parameter.


The first six are assertions rather than conversions as they use a type parameter as an argument rather than an expression of that type. This special syntax would only be available within a contract body.


The struct assertion in particular wouldn't work at all as a conversion because the expression would need to have an identical underlying type for the conversion to succeed. As it stands the specified field(s) only need to be present and their order is immaterial. It's also possible to express that a type parameter can represent any struct by leaving the field list empty.


The third construct in the example is an omit assertion which is different to anything we have in the language at present. omit would be a new keyword but need only be so within a contract body (it could be used normally elsewhere) and so would still be backwards compatible with Go 1.


The last two constructs differ from current method expression syntax which would require instead:
T.ValueMethod(T, U) T
(*T).PointerMethod(*T) *T
However, the problem with using the existing syntax is that:


The type parameter would have to be repeated as the first (or only) method parameter.


It doesn't necessarily follow from the use of a pointer type that it's a method with a pointer receiver - it could still be a method with a value receiver.


The use of special syntax within (but only within) contracts avoids needless repetition and makes it clear which type of receiver is required.
In addition I would propose the following convenience feature:

If only a single type parameter requires a constraint and the constraint is an interface, then one could use the interface directly in the function/type definition rather than having to embed it in a separate contract. For example, if a sole type parameter were constrained by an interface I then it would behave exactly the same as if it were constrained by a contract temp defined as follows:

contract temp(T) {
    I(T)
}
Examples

Let's take those examples from the design paper which would change if this suggestion were implemented and see how they now look:
contract stringer(T) {
    T.String() string  // or alternatively fmt.Stringer(T)
}

func Stringify(type T stringer)(s []T) (ret []string) {
    for _, v := range s {
       ret = append(ret, v.String())
    }
    return ret
}

// Alternatively fmt.Stringer could be used directly.
func Stringify(type T fmt.Stringer)(s []T) (ret []string) { // ... }

contract PrintStringer(X) {
    stringer(X)  // embedded contract
    X.Print()
}
// No need for a *convertible* contract as only integer & float types can be converted to uint64.
func FormatUnsigned(type T Real)(v T) string {
    return strconv.FormatUint(uint64(v), 10)
}

s := FormatUnsigned('a')  // 'a' inferred to be rune which belongs to Real
contract Readable(T) {
   io.Reader(T)  // or use directly
}
contract viaStrings(To, From) {
    From.String() string
    To.Set(string)
}

func SetViaStrings(type To, From viaStrings)(s []From) []To {
    r := make([]To, len(s))
    for i, v := range s {
        r[i].Set(v.String())
    }
    return r
}
contract equal(T) {
   T.Equal(T)
}

func Index(type T equal)(s []T, e T) int {
    for i, v := range s {
        // Both e and v are type T, so it’s OK to call e.Equal(v).
        if e.Equal(v) {
            return i
        }
    }
    return -1
}
contract Graph(Node, Edge) {
    Node.Edges() []Edge
    Edge.Nodes() []Node
}

func ShortestPath(type N, E Graph)(src, dst N) []E
contract adjustable(T) {
    T.Adjust() T
    T.Apply()
}

func Apply(type T adjustable)(v T) {
    v.Adjust().Apply()
}
func Contains(type T Comparable)(s []T, e T) bool {
    for _, v := range s {
        if v == e {
           return true
        }
    }
    return false
}
contract convert(To, From) {
    Real(to)
    Real(from)
}

func Convert(type To, From convert)(from From) To {
    to := To(from)
    if From(to) != from {
        panic("conversion out of range")
    }
    return to
}
contract BiggerInts(T) {
    Integer(T)
    omit{int8, uint8}(T)
}

func Add1K(type T BiggerInts)(s []T) {
    for i, v := range s {
        s[i] = v + 1000
    }
}
func Join(type T Bytes)(a []T, sep T) (ret T) {
    if len(a) == 0 {
        // Use the result parameter as a zero value.
        return ret
    }
    if len(a) == 1 {
        return T(append([]byte(nil), []byte(a[0])...))
    }
    n := len(sep) * (len(a) - 1)
    for i := 0; i < len(a); i++ {
        n += len(a[i])
    }

    b := make([]byte, n)
    bp := copy(b, []byte(a[0]))
    for _, s := range a[1:] {
        bp += copy(b[bp:], []byte(sep))
        bp += copy(b[bp:], []byte(s))
    }
    return T(b)
}
contract counters(T1, T2) {
    // Assert that both types must have a Count field of type int.
    struct{Count int}(T1)
    struct{Count int}(T2)
}

func Corresponding(type T1, T2 counters)(p1 *T1, p2 *T2) {
    p1.Count = p2.Count
}
type orderedSlice(type Ele Ordered) []Ele

func (s orderedSlice(Ele)) Len() int           { return len(s) }
func (s orderedSlice(Ele)) Less(i, j int) bool { return s[i] < s[j] }
func (s orderedSlice(Ele)) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// OrderedSlice sorts the slice s in ascending order.
// The elements of s must be ordered using the < operator.
func OrderedSlice(type Ele Ordered)(s []Ele) {
    sort.Sort(orderedSlice(Ele)(s))
}
func Keys(type K, V Comparable(K))(m map[K]V) []K {
    r := make([]K, 0, len(m))
    for k := range m {
        r = append(r, k)
    }
    return r
}

k := maps.Keys(map[int]int{1:2, 2:4})
type Set(type Ele Comparable) map[Ele]struct{}

func Make(type Ele Comparable)() Set(Ele) {
    return make(Set(Ele))
}

// etc
type Metric1(type T Comparable) struct {
    mu sync.Mutex
    m  map[T]int
}

contract cmp2(T1, T2) {
    Comparable(T1)
    Comparable(T2)
}

type key2(type T1, T2 cmp2) {
    f1 T1
    f2 T2
} 

type Metric2(type T1, T2 cmp2) struct {
    mu sync.Mutex
    m  map[key2(T1, T2)]int
}

// etc
Implementation

As in the original proposal, contracts would be used to validate the actual type argument(s). The only difference is that, if a built-in contract were involved, the compiler would need to check that the actual type belonged to the associated group.
A possible strategy for checking the generic function/type itself in relation to the built-ins used in the contract might be for the compiler to have a default type for each fundamental group (integer, floating point, complex, boolean or string) contained within those built-ins. It could then check that the function/type compiled for the default type and repeat this process for each fundamental group involved.
If the fundamental type groups were chosen so that their members support exactly the same operations, conversions etc., it should then follow that if the function/method compiles for the default types, it will compile for any member type of the relevant groups.
In the case of numeric types, the default would need to be the smallest type in the group to avoid falling foul of rules which require an untyped constant to be representable (i.e. within the range of) a value of each numeric type.
Conclusion

So, in a nutshell, what I'm suggesting here is that contracts in the original design should be refashioned so that they are much simpler and more clearly delineated. In particular, ten new built-in identifiers should be introduced to check, where appropriate, that a type belongs to a particular group for which certain operations, conversions, assignments etc. are part of the language.
Admittedly, there is significant syntactic overhead compared to the draft design. However, in my view, this is not unacceptable for an important feature such as generics, is contextual in any case and still backwards compatible with Go 1.
But, most importantly, I think it would represent a massive simplification in the writing and understanding of constraints on generic type parameters. It might also lead to easier and quicker compilation as the compiler would be dealing with stuff it already knows about.
There are no doubt cases which the draft design (which is very powerful) can cater for but which this proposal cannot. However, in my view, a slight loss in power would be a worthwhile price to pay for a more clearly defined solution and, given what other commentators have said on this topic, I'm certainly not alone in thinking this.
APPENDIX

Analysis of built-in contracts


Operator group
Constituents
Assignment form op=


plus
+
yes


arithmetic
+ - * /
yes


remainder
%
yes


shift
<< >>
yes


bitwise
&
^


unary
+ -
no


complement
^
no


equality
== !=
no


comparison
== != < <= > >=
no


increment
++ --
no


logical
&& || !
no


indexation
[]
no


length
len()
no


All members of each contract support the following. Note that members for this purpose includes user-defined types with the same underlying type as a listed member.
Comparable

Members:

Any type that supports the equality operators.

Operators:

equality

Integer

Members:

int, int8(byte), int16, int32(rune), int64, uint, uint8, uint16, uint32, uint64, uintptr

Operators:

arithmetic
remainder
shift (right operand must be unsigned)
bitwise
unary
complement
comparison
increment

Conversions:

to/from integer
to/from float
to string

Implements:

Comparable
Ordered

Float

Members:

float32, float64

Operators:

arithmetic
unary (effectively)
comparison
increment

Conversions:

to/from integer
to/from float
to complex (via complex built-in)
from complex (via real/imag built-ins)

Implements:

Comparable
Ordered

Real

Members:

int, int8(byte), int16, int32(rune), int64, uint, uint8, uint16, uint32, uint64, uintptr
float32, float64

Operators:

arithmetic
unary (effectively)
comparison
increment

Conversions:

to/from integer
to/from float

Implements:

Comparable
Ordered

Complex

Members:

complex64, complex128

Operators:

arithmetic
unary (effectively)
equality
increment

Conversions:

to/from complex
to float (via real/imag built-ins)
from float (via complex built-in)

Implements:

Comparable

Boolean

Members:

bool

Operators:

equality
logical

Conversions:

to/from bool

Implements:

Comparable

String

Members:

string

Operators:

comparison
plus (concatenation)

Conversions:

to/from string
from integer

Implements:

Comparable
Ordered

Bytes

Members:

string
[]byte

Operators:

indexation
length

Conversions:

to/from string
to/from []byte

Runes

Members:

string
[]rune

Operators:

indexation
length

Conversions:

to/from string
to/from []rune

Ordered

Members:

int, int8(byte), int16, int32(rune), int64, uint, uint8, uint16, uint32, uint64, uintptr
float32, float64
string

Operators:

comparison
plus

Implements:

Comparable

Augmented contracts

There are also five augmented contracts whose members are the union of the following built-in types:

Integer or Complex (NonFloat)
Float or Complex (NonInteger)
Real or Complex (Numeric)
Ordered or Complex (Addable)
Integer or String  (Codepoint)

The names in parentheses are suggested names for these contracts.
As a general rule, all of these contracts are comparable but no conversions are permitted.
The exception is Codepoint which is also orderable and supports the 'integer to string' conversion.
The first three support the same operators as the Complex contract, the fourth supports the equality and plus operators and the fifth supports the comparison and plus operators.
They are not built-in themselves but can be constructed like this:
contract NonFloat(T) {
    Integer(T)
    Complex(T)
}
contract NonInteger(T) {
    Float(T)
    Complex(T)
}
contract Numeric(T) {
    Real(T)
    Complex(T)
}
contract Addable(T) {
    Ordered(T)
    Complex(T)
}
contract Codepoint(T) {
    Integer(T)
    String(T)
}
Notes on choice of built-in contracts

Some commentators have tried to use built-ins based purely on the operators they permit. However, as demonstrated in the previous section, there are a lot of operator groups to cater for, they don't say anything about allowable conversions and some types which support common operators don't play well together in other ways.
In a strongly typed, imperative language such as Go I think people tend to think more in terms of types than operations, so I decided that the built-ins should do likewise. I also found that defining them in this way simplified a number of other considerations and enabled me to restrict the number of built-ins to ten easily understood contracts.
Beginning their identifiers with a capital letter should help to reduce clashes with other uses of these names and, in the case of Real, Complex and String, distinguishes between existing built-ins (all lower case) with otherwise the same names.
The following is a summary of my reasons for the inclusion or exclusion of various potential built-in contracts within this simplified proposal.
Comparable, Integer, Real and Ordered

Clearly, these have to be catered for in any sensible constraint system.
Float

I considered leaving this out when there is already Real.
However, I decided this would be a mistake. Floating point types have a number of properties which integer types don't have - they don't overflow, don't truncate division, permit division by zero and support weird stuff like infinities and NaNs.
I think people will definitely want to write functions which are restricted to floating point types and Float should therefore be included.
Complex

I also considered leaving this out and having instead a Numeric built-in contract which covered integer, floating point and complex types.
A Numeric built-in would have some desirable properties (all the arithmetic operators would be supported). However, when it comes to conversions, complex and integer types don't play well together and, although complex is essentially a tuple of float, you have to use the complex, real and imag built-in functions to convert between them. Also complex types are not ordered.
Complex types have their own package in the standard library and are generally segregated from the other numeric types. I think it is best to continue this line of thinking though a Complex built-in is still needed to generify the two complex types.
However, for those who disagree, it is still possible to construct a Numeric contract (and 3 others involving complex types) using the augmented contract mechanism described earlier.
Boolean and String

These were excluded from my first draft. My grounds were:
"They would only represent one basic type, bool and string respectively, and for nearly all practical purposes you can just use those types directly.
Where you have a defined boolean or string type, then all you need to do is to convert to/from the basic types when using the function."
However, it was pointed out that this would not be a good solution where a generic type or function included arrays, slices or maps of defined types based on these types as you would need to convert their elements to their underlying type before you could do anything with them.
I have therefore relented on this and they are now included in the proposal.
In the case of String, it is also possible (with Integer) to construct a Codepoint contract using the augmented contract mechanism described earlier. This could unify dealing with a single Unicode codepoint by making use of the conversion from any integer type to a string.
Unsigned and Signed

I decided to exclude these.
Unsigned would only really be useful in the case of the shift operators where the right operand has to be an unsigned integer. However, it is easy enough to convert between signed and unsigned integers so I don't think this is an important enough use case to justify its inclusion.
It's difficult to think of situations where you'd want to exclude the unsigned types completely so I concluded that supporting Signed would be virtually useless.
However, it is worth noting that both of these can now be easily constructed under the present proposal:
contract Unsigned(T) {
    Integer(T)
    omit{int8, int16, int32, int64, int}(T)
}

contract Signed(T) {
    Integer(T)
    omit{uint8, uint16, uint32, uint64, uint, uintptr}(T)
}
Indexable (by integer index)

This seems attractive at first sight as you would be able to do index operations and apply the len() function directly to a type parameter.
However, the problem is that the types it would support - strings, arrays and slices - are not very homogeneous:


Strings are immutable, comparable and ordered, support len but not cap.


Arrays are mutable (though not in length), comparable and support len and cap.


Slices are mutable (including in length), support len and cap but are not comparable and support nil.


Also it is difficult to do anything generically with arrays and slices without knowing the element type.
For these reasons, I think it's better to confine the built-in contracts to scalar types as you can still create any composite type (including maps, channels, functions and pointers) easily enough if its constituent types are (or include) type parameters and then operate on it accordingly.
Bytes (string or []byte types) and Runes (string or []rune types)

Although these two types are inter-convertible, they still differ in important ways as was illustrated in the Indexable section. Consequently, I omitted them from my original draft.
However, it was pointed out that they would nevertheless be useful in writing functions which unified strings and their slice equivalents and so I have decided now to include them.
Nilable

There are six kinds of types which support the nil built-in, namely: slices, maps, functions, pointers, channels and interfaces. All of these are comparable to nil but only the last three satisfy the Comparable contract.
It is difficult to think of any cicrumstances where one would want a type parameter to be constrained to being one of these types and so I have excluded Nilable from the proposal.
Other considerations

These include my thoughts on problems raised in the draft design and overview papers themselves as well as issues arising from this proposal.
Use of struct assertion in contracts

It is questionable whether we really need this as there are ways around it such as defining accessor functions for the fields in question which can then be catered for using an interface.
However, some commentators feel it would be valuable and have produced examples to demonstate it. As it's easy enough to come up with a suitable syntax and probably not too difficult for the compiler to check it, I decided to include it in this proposal.
Zero value of a type parameter

Although there are ways around it, I think it would be useful to introduce specific syntax for this, my own preference being for T{}.
Automatic compiler inference that a map's key type parameter is Comparable

Although attractive at first sight, I think it might prove confusing in practice if the Comparable constraint is insufficent for describing the key type and/or the value type also needs to be constrained in some way.
On balance, I think it's best to express this explicitly.
Allowing type parameters to satisfy different contracts in the function/type definition

Again attractive at first sight but likely to prove confusing in practice. So I feel it's better to just allow one contract here even if it needs to be user-defined.
Untyped constants

The use of untyped constants in generic expressions or their assigment to generic variables should no longer present any particular problems under this proposal.
However, you'd need to ensure that the constant's default type was consistent with any of the types which the generic expression or variable could represent, otherwise the code wouldn't compile.
Should generic functions be allowed to have their own type parameters?

My view is that they should not for the reasons given in the draft paper.
It can always be worked around using a top-level function which may be clearer in any case.
Type parameter embedded as a field in a generic struct

My view is that this should be permitted but the name of the field should always be the (original) type parameter used in its definition even if the struct has a method which uses a different name for the type parameter.
I think that using the different name to refer to that field within the method would be too confusing and, if the different name happened to coincide with the name of another field of the same struct, then it would of course be ambiguous
Type assertions and switches

The draft design proposes to extend these to checking type parameters as well as interfaces. I think this would be a good idea though, in the case of type parameters, there might be a case for checking that the instantiated type is not just the specified type but any type with the same underlying type.
Should generic interfaces be permitted?

I couldn't find any reference to generic interfaces (i.e. interfaces with type parameters) in the draft design or overview papers but it subsequently became clear from this thread on golang-nuts that it is in fact intended to support them.
Although they may involve some complexity and possible ambiguities for the compiler to grapple with, my personal view is that generic interfaces would be a worthwhile feature if these difficulties can be satisfactorily dealt with.
Infinite or arbitrary recursion within generic types or functions

I feel the guiding principle here is whether such a type or function would compile if the type parameter(s) were replaced by concrete type(s) satisfying the contract (if any). If it would then the code should be allowed but not otherwise.
I know that currently the compiler can detect infinite recursion within a struct or interface (and issue an invalid recursive type error) but I'm not aware of a similar test for functions.
Subject to that, I'd have thought that the compiler should use the same considerations to determine whether implementation can be left until runtime as it does for non-recursive cases.
Incorporation of built-in contracts into a standard contracts library

It might be useful for a standard contracts library with a nice short name (say "ct") to be introduced so that people don't have to continually create the same commonly used contracts. Possible candidates for such a library include the Unsigned, Signed, Numeric, NonFloat, NonInteger, Addable and Codepoint contracts mentioned earlier.
It would even be possible to have no built-in contracts at all and to place all of them in the standard package instead even though their underlying implementation would be via compiler magic.
A disadvantage of doing so is that you would need to import "ct" every time you wanted to write generic code and to prefix the names of the standard contracts with the package name (though . could be used in the import declaration to avoid this if you were confident there'd be no clashes).
However, any one who wants to use generics would still need to learn which types the standard contracts encompassed though they would be documented in the "ct" package rather than the specification itself.
Extensibility

Should it be found in practice that additional considerations ought to be catered for despite the additional complexity, there is no reason why additional built-in contracts and/or contract body assertions couldn't be created to deal with them.
What would happen if a future version of Go introduced operator overloading?

This proposal would continue to work as the built-in types don't have methods. If, as I suspect, the use of operator overloading would be largely confined to mathematical objects (vectors, matrices etc.), the majority of Go programmers wouldn't use it anyway.
For those who would like to write generic functions for all types which had, say, a + operator, some way would need to be found of expressing that within a contract or interface. This would probably require + (and other operators) being regarded as pseudo-methods of the relevant built-in types and this proposal could therefore easily encompass it. For example (using an imagined syntax):
contract Plus(T) {
    T.Op_plus(T) T
}
Contract	Constituents
Comparable	any type which supports == and !=
Integer	any integer type
Float	any floating point type
Real	any integer or floating point type
Complex	any complex type
Boolean	any boolean type
String	any string type
Ordered	any integer, floating point or string type
Bytes	any string or byte slice type
Runes	any string or rune slice type
Operator group	Constituents	Assignment form op=
plus	+	yes
arithmetic	+ - * /	yes
remainder	%	yes
shift	<< >>	yes
bitwise	&	^
unary	+ -	no
complement	^	no
equality	== !=	no
comparison	== != < <= > >=	no
increment	++ --	no
logical	&& \|\| !	no
indexation	[]	no
length	len()	no