Roger Peppe, 2018-09-01
One common response to the new generics design has been that contracts seem a bit too much like interfaces but not the same.
They are indeed very similar if you squint a little. Consider these two definitions:
type IReader interface {
Read([]byte)(int, error)
}
contract CReader(r R) {
interface {
Read([]byte)(int, error)
}(r)
}
Both IReader and CReader define a kind of filter on types. They both allow exactly the same set of types, but IReader can only be used in a function parameter list, and CReader can only be used in a type parameter list.
I propose that we unify the concepts of contracts and interfaces while still keeping much of the originally designed syntax.
Proposed rule:
Any contract with exactly one type parameter and a body consisting entirely of type conversions to interface types is considered to be equivalent to an interface that embeds those interfaces.
This allows any interface type to be used in place of a contract, and a restricted subset of all contracts to be used as interfaces.
For example, this code becomes valid:
func ReadAll(type R io.Reader)(r R) ([]byte, error)
It feels like this invites us to adjust the syntax a little. Because a contract is now just a type (albeit a type that we can't always use in function parameters), let's use a type statement to define one.
type Convertible = contract(_ To, f From) {
To(f)
}
So a contract type is syntactically like a function literal, except that it can only be used in type context, not in value context.
In the original proposal, any set of type parameters can have at most one contract, so if you need two type parameters that fulfil independent contracts, you need to define a single contract that contains both of them. Now that interfaces can act as contracts, that restriction seems even less palatable. If a function takes two unrelated type parameters, it doesn't seem quite right that we should need to define a new type to hold them both at once.
So I propose that for contracts involving two or more type parameters, the types must be parenthesized in the type parameter list.
For example:
type G = contract(n Node, e Edge) {
var _ []Edge = n.Edges()
var from, to Node = e.Nodes()
}
type Graph(type (Node, Edge) G) struct { ... }
func New(type (Node, Edge) G)(nodes []Node) *Graph(Node, Edge) { ... }
This means we can now use any number of contracts in a type parameter list. For example:
func DrawGraph(type (Node, Edge) G, I draw.Image)
With the new syntax comes the possibility that a contract itself might be able to refer to external type parameters. This is explicitly disallowed. A contract may not refer to any generic types not defined in the contract's parameter list.
type C(T) contract(x X) {
T(x) // INVALID
}
From the proposal:
The simplest way to ensure that a function only performs the operations permitted by its contract is to simply copy the function body into the contract body. In other words, to make the function body be its own contract, much as C++ does. If people take this path, then this design in effect creates a lot of additional complexity for no benefit.
We think this is unlikely because we believe that most people will not write generic function, and we believe that most generic functions will have only non-existent or trivial requirements on their type parameters.
I propose that instead of allowing arbitrary code in contract bodies, we allow only an extremely restricted syntax. Every statement in a contract body must consist of exactly one expression that uses exactly one of the following operators:
all arithmetic and logical binary and unary operators
channel send, receive and close
array and map indexing
function call
type conversion
interface conversion
Every expression must reference at least one of the contract's parameters. Only externally defined type names may be referenced, not names of functions or variables. The restriction that only externally defined names may be used is dropped.
Note that this form of contract prohibits many of the constructions that were previously used. Notably:
- no variable declarations
- no field or method references
- no range statements
Even with these restrictions, the syntax is powerful enough to express all the examples in the original proposal except for the counter example, which is explicitly about fields.
Some advantages that come from these restrictions:
- one concept per line.
Each line specifies exactly one required behaviour. When there's a more complex relationship between types and methods, generic interfaces can pull their weight. This makes it easy to read contracts and see exactly what a contract requires or allows.
- reorderable.
There is no ordering relationship between statements in the contract, so it's possible to reorder them arbitrarily without changing the meaning of the contract, making it possible to canonicalise and compare contracts easily.
- no intermediate implied types.
With arbitrary contracts, it's possible to have many implied-but-unnamed types inside a contract. For example, consider this contract:
contract Indirect(a A, b B) {
t1 := a.A()
t2 := t1.M1()
// any number...
t3 := t2.M2()
b = t2.M3()
}
In that code, all of t1, t2 and t3 have types implied by the values returned by a, t1, t2 etc. Any function using that contract can obtain instances of those types but not name them. This seems wrong to me, just as returning unexported types from exported methods is generally a bad idea. Also, the complexity of such a contract is hard to get one's head around!
- Known cost of range
One thing that the original proposal mentioned was the possibility of using range
on generic types. This means that you can have a range operator in a generic function but no idea whether it's ranging over a map, a channel or a slice, all of which have quite different properties. By eliminating range from the set of possible statements, this is no longer an issue.
- No need for special rules about value methods
Since method calls are not allowed, there's no way that the Go implicit address-of rule comes into play, so there should be no need to have a special rule that says that value methods are allowed, so my suggestion in the Value Methods section above becomes redundant.
- Straightforward runtime model
If we wish to generate the same code for a function instantiated with several different types, then there has to be some way to allow the function to manipulate these types. With the simpler form of contract, each statement of the contract can be implemented by a function that provides the behaviour for the statement's operation.
In a sense, each statement in the contract is similar to an entry point in an interface.
The lack of explicit field or method references mean that there's no need to deal with the ambiguity that field and method references in the original proposal lead to. In contracts as originally proposed, there's no way to tell if we're referring to a method or a field.
Perhaps
<-=
is a typo? What about the other bit-wise operators (you already defined^
) that will allow defining contracts used in size-agnostic integer algorithms, such as<<
and&^
? What about the other Boolean operators, such as||
? We can't expect that programmers should remember which operators are accepted in the contract mini-language and which are not, but we can reasonably expect them to remember the kinds of expressions accepted ("operators" and type and interface conversions).Is assignment
=
provided to test assignability? If so, withoutvar
in a contract, it's unidirectional: you can have a contract line likex = MyByteSlice(nil)
(requiring that x is eitherMyByteSlice
,[]byte
, an interface, or a type alias to one of those), but you can no longer specify the equivalent tovar _ []byte = x
(requiring that x is anything with an underlying type of[]byte
, an interface, or a type alias to one of those). In other words, assignability rules are not commutative, and providing=
withoutvar
means only partial coverage of assignability, unless you allow something like:Therefore, I'd suggest getting rid of
=
or adding backvar
. The only other statement besides=
would have been(Y)(nil) <- x
, which is important for specifying in a contract thatY
can be sent to, which one can't assume since the language haschan<- int
. Assigning to an index can also be used to distinguish between a string and a slice (or a mutable slice and an immutable slice/array, if those ever come along).In the absence of
=
we would have conversions, but those can be a bit blunt, and for better or worse, don't allow a contract to specify cheapness. For example,y = []X(nil)
allows for algorithms that require only a type re-representation at compile time, but no copy at runtime due to identical memory layouts. Without assignment, the best that can be specified isY(x)
, which would in a copy at runtime if exactly one of Y or X is a string, and the algorithm using that contract needs a value of type Y from an input of type X.Either way, it would be possible to define a canonical order of statements for a given contract using some of the same techniques that rsc.io/grind uses. That said, we don't canonicalise interface method order (should we?). The closest we've got to that is import re-ordering.
Generic range algorithms will likely fall into two main categories: 1) utility algorithms, for which neither the element type nor the collection type is particularly important, and 2) those algorithms for which the collection type is assumed and only the element type can vary. I imagine most such uses of range in a contract will be in service of category #2. Even with range contracts, there are limitations in what can be expressed, so a "Values" function that takes a range-capable collection and returns the values as a slice could not possibly operate generically across channels and maps/slices. Conversely, an "Keys" function could operate on all three kinds, but it would actually be returning channel values. Both functions would be unusually expensive for slices, and both functions would use the wrong allocation hint if it used
make([]Y, 0, len(x))
on a channel (and would of course either OOM or block forever on a channel that is never closed).