Roger Peppe 2018/08/30
Here are a few thoughts after looking through the generics document. In general, I really like it. In particular I like the fact that it admits general operations on generic types through an intuitive contract specification that looks just like regular Go.
The (type X)
syntax for type parameters is also very pleasing to me - it
seems a perfect fit for Go. No new tokens, only a small amount of new
grammar, and readable to boot.
The body of a contract may not refer to any name defined in the current package.
This seems a bit arbitrary. I'd like to see more about the rationale for this.
Presumably the compiler will not produce an error when a variable defined in a contract is not used for anything, but this isn't stated anywhere.
As a set of type parameters can only have a single contract, there might turn out to be a proliferation of contracts, as every possible combination of types-with-contracts that are used as parameters needs to be given a name. Also, when contracts are combined in this way, it might not be so clear which potentially independent contract applies to which type, which might harm readability.
Would it be good practice for the contracts used by an exported generic function to be exported themselves?
The rule about contracts not being able to refer to any name defined in the current
package seems to make the contract embedding example in the proposal invalid,
because stringer
is part of the current package. ISTM that the current proposal
implies that it's not possible to have a contract that embeds a contract that's defined in
the same package, which is probably not the intention.
A contract introduces a global identifier. If the restriction on all names within a contract being defined in an external package was lifted, then there would be the possibility of having contracts that self-reference each other. Is that OK?
The proposal supplies this example:
contract adjustable(x T) {
var _ T = x.Adjust()
T.Apply(x)
}
func Apply(type T adjustable)(v T) {
v.Adjust().Apply() // INVALID
}
The stated rule is:
The rule is that if the contract body contains a method call on a non-addressable value, then the function body may call the method on a non-addressable value.
I'd like to see more rationale for this rule. It seems more restrictive than necessary.
For example, this rule would seem to mean that the stringer
contract defined elsewhere in the proposal:
contract stringer(x T) {
var s string = x.String()
}
could not be used on any value type, so:
days := []time.Weekday{time.Monday, time.Wednesday}
Stringify(days)
would not work because String is a value method on time.Weekday, and that's precluded by the stringer contract.
Furthermore, the rule seems to preclude a number of useful and common Go idioms where reference-methods are called on values.
For example, it's common to wrap a pointer type in a struct to adjust its method set:
type adaptor struct {
other *otherType
}
func (a adaptor) Apply() {
a.other.ApplyOther()
}
Under the non-addressable-value rule above, we wouldn't be able to use adaptor
as a value that satisfies a method call in a contract unless that it's explicitly allowed as a value type in the contract.
This also applies to interfaces. For the purposes of the contract rule, an interface type is a value type, so this "obviously OK" code would not work:
func StringOf(type S stringer)(s S) string {
return s.String()
}
func Foo(b fmt.Stringer) {
StringOf(b)
}
When would one not wish to allow a value method on a type?
It seems to me that the problem is muddied by Go's automatic address-taking behaviour. So in the stringer contract above, we don't know if we're calling a method on T
or on *T
because the variable x is addressable. I'm not convinced that Go's usual behaviour in this respect is useful in a contract. Suppose we said that within the body of a contract, we do not take the address of values automatically.
With this change in the rules, the stringer contract now states that T itself must implement a String method; a pointer method can only be used if a pointer value is used for T.
The down side of this is that now a subtly different set of semantics apply to the code in the body of a contract. It is no longer possible to just copy the body of a method into the body of a contract.
We do not propose to change the reflect package in any way. When a type or function is instantiated, all of the type parameters will become ordinary non-generic types. The String method of a reflect.Type value of an instantiated type will return the name with the type arguments in parentheses. For example, List(int).
I suspect that more will need to be done here. For example, if we have:
type OrderedMap(type k, v mapTypes) struct { ... }
I'm writing reflection code that
is looking for OrderedMap(k, v)
for some types k and v, it would be useful to be able to
find out those type parameters, just as I can see map[k]v
and find its key and value types.
If a contract body uses a short declaration, such as
s := x.String()
this does not provide any information about the result parameter of the String method.
It does provide information about the fact that there is exactly one result parameter.
This doesn't seem to be called out as such, but it seems that you'd be able to have type-parameterised interfaces too. For example, is this OK?
type Copier(type T) interface {
Copy() T
}
type foo struct {
s string
}
func (f *foo) Copy() *foo {
f1 := *f
return &f1
}
func copySlice(type T Copier)(xs []T) []T {
xs1 := make([]T, len(xs))
for i, x := range xs {
x1s[i] = x.Copy()
}
return xs1
}
It has been suggested that some kind of contract simplification might be possible. A possible naive algorithm for doing this (due to Ian Taylor I believe):
- delete statements as long as the set of types satisfying the contract remains the same
This assumes that the set of types is the important thing about a contract. Consider this example:
-- main.go --
package c
import (
"x"
"y"
)
contract C(t T) {
x.X(t)
y.Y(t)
}
func F(type T C)(t T) {
x.X(t)
y.Y(t)
}
func main() {}
-- x/x.go --
package x
func X(int) {}
-- y/y.go --
package y
func Y(int) {}
If we apply the naive simplification algorithm, the contract could be simplified to:
contract C(t T) {
x.X(t)
}
because both x.X and y.Y take the same argument type. But if we've deleted the call to Y in the contract and y.Y changes its signature, then the code breaks, even though we originally tried to state the full requirements in the contract.
That applies even more so to further simplifications we might consider, such as:
contract C(t T) {
var _ int = t
}
I'd really like to see a better definition of the exact rules implied by a given contract. If we have that, then we can start to decide on some laws for contract equivalence.
This is my biggest issue with the generics proposal as it stands: I don't understand how to properly reason about contracts, and I don't understand the theoretical complexity of getting programs to reason about them.
The above section on contract simplification also leads to the question of how to define backward compatibility between contracts. Backward compatibility is crucial in Go, so this really needs to be well defined. It seems like it's not possible to define any backward compatible change path with contracts, because anything added to the contract means callers may not provide the newly required feature; anything removed means generic functions using the removed feature will no longer work.
This makes it even more crucial that contract equality is well defined, because without such a definition, there's no clear way of knowing whether a seemingly trivial code change (for example a re-ordering of lines or a variable rename) will lead to a backwardly incompatible package change.
We believe that this design permits different implementation choices. Code may be compiled separately for each set of type arguments, or it may be compiled as though each type argument is handled similarly to an interface type with method calls, or there may be some combination of the two.
I would love to see an expansion of this. There's a great variety of operations that may be performed on generic types, not just method calls. When code is not compiled separately for each set of type arguments, how might that work? How do we enumerate all the possible operations that a set of contracts allows?
Using Go syntax for contracts seems exactly the right approach, but allowing arbitrary Go code in there seems problematic. A much simpler subset of Go still allows a lot of useful and interesting contract specifications without introducing so much complexity.
What if contracts allowed only one expression per line, and only one operator in any expression?
-
type conversion
T(x)
-
arithmetic operators
x+x x-x !x x+0
-
indexing operators
x[0] x[y]
-
channel operators
<-x x <- 3 close(x)
Notably, with these restrictions, you can't declare or assign to variables and you can't use the .
operator to access fields or methods directly.
However the language is sufficient to write all the contracts defined in the proposal, with the exception of the explicitly field-oriented counter
contract. For reference, I've attached the simplified-contract equivalents at the end of this section.
Some advantages of the simplified contracts:
- one concept per line.
Each line specifies exactly one required behaviour. When there's a more complex relationship between types and methods, generic interfaces can pull their weight.
- reorderable.
There is no ordering relationship between statements in the contract, so it's possible to reorder them arbitrarily without changing the meaning of the contract.
- no intermediate implied types.
With arbitrary contracts, it's possible to have many implied-but-unnamed types inside a contract. For example, consider this contract:
contract Indirect(a A, b B) {
t1 := a.A()
t2 := t1.M1()
// any number...
t3 := t2.M2()
b = t2.M3()
}
In that code, all of t1, t2 and t3 have types implied by the values returned by a, t1, t2 etc. Any function using that contract can obtain instances of those types but not name them. This seems wrong to me, just as returning unexported types from exported methods is generally a bad idea. Also, the complexity of such a contract is hard to get one's head around!
- Known cost of range
One thing that the original proposal mentioned was the possibility of using range
on generic types. This means that you can have a range operator in a generic function but no idea whether it's ranging over a map, a channel or a slice, all of which have quite different properties. By eliminating range from the set of possible statements, this is no longer an issue.
- No need for special rules about value methods
Since method calls are not allowed, there's no way that the Go implicit address-of rule comes into play, so there should be no need to have a special rule that says that value methods are allowed, so my suggestion in the Value Methods section above becomes redundant.
contract stringer(x T) {
io.Stringer(x)
}
contract convertible(_ To, f From) {
To(f)
}
contract Readable(r T) {
io.Reader(r)
}
contract viaStrings(t To, f From) {
fmt.Stringer(f)
interface {
Set(string)
}(t)
}
contract PrintStringer(x X) {
stringer(X)
interface{
Print()
}(x)
}
type Equaler(type T) interface {
Equal(T) bool
}
contract equal(v T) {
Equaler(v)
}
type Node(type E) interface {
Edges() []E
}
type Edge(type N) interface {
Nodes() (from, to N)
}
contract G(n NodeT, e EdgeT) {
Node(EdgeT)(n)
Edge(NodeT)(e)
}
I'd argue that if you know
OrderedMap(k, v)
, you can look at its (presumed)Set(key k, val v)
method and get the instantiated parameter from that. If you don't know about it, knowingk,v
probably won't help you, because you don't know the semantics of how to use them.(And if the parameters are not reflected in the instantiated type, then they clearly don't matter).
You are correct.
FWIW, that's not really an algorithm, but mostly a description of the problem: What statements can be deleted, so that the set of types satisfying the contract remains the same? The algorithm would describe how to choose those statements. The naive algorithm would do it greedily, or randomly, or… And does it backtrack?
FWIW, as a naive starting point, I think goreduce already provides the needed pieces. And any way to improve on the algorithm for contracts can probably also be applied to goreduce itself.
"Laws for contract equivalence" to me seems to be basically equivalent to "laws for type equivalence of programs", which would be so much easier if Go had a formalized type theory… In general, that would make things like these so much easier, because you could actually enumerate the relevant cases and prove general statements about the interactions between, say, contracts and all the currently existing language constructs.