@rogpeppe
Last active September 13, 2018 22:29

Revised generics design

Roger Peppe, 2018-09-01

One common response to the new generics design has been that contracts seem a bit too much like interfaces but not the same.

They are indeed very similar if you squint a little. Consider these two definitions:

type IReader interface {
	Read([]byte)(int, error)
}

contract CReader(r R) {
	interface {
		Read([]byte)(int, error)
	}(r)
}

Both IReader and CReader define a kind of filter on types. They both allow exactly the same set of types, but IReader can only be used in a function parameter list, and CReader can only be used in a type parameter list.

I propose that we unify the concepts of contracts and interfaces while still keeping much of the originally designed syntax.

Proposed rule:

Any contract with exactly one type parameter and a body consisting entirely of type conversions to interface types is considered to be equivalent to an interface that embeds those interfaces.

This allows any interface type to be used in place of a contract, and a restricted subset of all contracts to be used as interfaces.

For example, this code becomes valid:

func ReadAll(type R io.Reader)(r R) ([]byte, error)

It feels like this invites us to adjust the syntax a little. Because a contract is now just a type (albeit a type that we can't always use in function parameters), let's use a type statement to define one.

type Convertible = contract(_ To, f From) {
	To(f)
}

So a contract type is syntactically like a function literal, except that it can only be used in type context, not in value context.

In the original proposal, any set of type parameters can have at most one contract, so if you need two type parameters that fulfil independent contracts, you need to define a single contract that contains both of them. Now that interfaces can act as contracts, that restriction seems even less palatable. If a function takes two unrelated type parameters, it doesn't seem quite right that we should need to define a new type to hold them both at once.

So I propose that for contracts involving two or more type parameters, the types must be parenthesized in the type parameter list.

For example:

type G = contract(n Node, e Edge) {
	var _ []Edge = n.Edges()
	var from, to Node = e.Nodes()
}

type Graph(type (Node, Edge) G) struct { ... }
func New(type (Node, Edge) G)(nodes []Node) *Graph(Node, Edge) { ... }
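
For comparison, the relationship between Node and Edge can be sketched in runnable Go 1.18+ syntax, where two parameterized interfaces stand in for the single two-parameter contract G. This is an analogue under that assumption, not the syntax proposed here. NodeConstraint, EdgeConstraint, Vertex, Link and EdgeCount are hypothetical names.

```go
package main

import "fmt"

// NodeConstraint and EdgeConstraint together play the role of contract G:
// a node yields edges, and an edge yields its two end nodes.
type NodeConstraint[Edge any] interface {
	Edges() []Edge
}

type EdgeConstraint[Node any] interface {
	Nodes() (from, to Node)
}

// Graph's two type parameters constrain each other.
type Graph[Node NodeConstraint[Edge], Edge EdgeConstraint[Node]] struct {
	nodes []Node
}

func New[Node NodeConstraint[Edge], Edge EdgeConstraint[Node]](nodes []Node) *Graph[Node, Edge] {
	return &Graph[Node, Edge]{nodes: nodes}
}

// EdgeCount totals the outgoing edges reported by every node.
func (g *Graph[Node, Edge]) EdgeCount() int {
	n := 0
	for _, node := range g.nodes {
		n += len(node.Edges())
	}
	return n
}

// Vertex and Link are a concrete pair of types satisfying the constraints.
type Vertex struct{ out []*Link }

func (v *Vertex) Edges() []*Link { return v.out }

type Link struct{ from, to *Vertex }

func (l *Link) Nodes() (from, to *Vertex) { return l.from, l.to }

func main() {
	a, b := &Vertex{}, &Vertex{}
	a.out = []*Link{{from: a, to: b}}
	g := New[*Vertex, *Link]([]*Vertex{a, b})
	fmt.Println(g.EdgeCount())
}
```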

This means we can now use any number of contracts in a type parameter list. For example:

func DrawGraph(type (Node, Edge) G, I draw.Image)

With the new syntax comes the possibility that a contract itself might be able to refer to external type parameters. This is explicitly disallowed. A contract may not refer to any generic types not defined in the contract's parameter list.

type C(T) contract(x X) {
	T(x)		// INVALID
}

Contract bodies

From the proposal:

The simplest way to ensure that a function only performs the operations permitted by its contract is to simply copy the function body into the contract body. In other words, to make the function body be its own contract, much as C++ does. If people take this path, then this design in effect creates a lot of additional complexity for no benefit.

We think this is unlikely because we believe that most people will not write generic functions, and we believe that most generic functions will have only non-existent or trivial requirements on their type parameters.

I propose that instead of allowing arbitrary code in contract bodies, we allow only an extremely restricted syntax. Every statement in a contract body must consist of exactly one expression that uses exactly one of the following operators:

  • all arithmetic and logical binary and unary operators
  • channel send, receive and close
  • array and map indexing
  • function call
  • type conversion
  • interface conversion

Every expression must reference at least one of the contract's parameters. Of the names defined outside the contract, only type names may be referenced, not names of functions or variables. The original proposal's restriction that only externally defined names may be used is dropped.

Note that this form of contract prohibits many of the constructions that were previously used. Notably:

  • no variable declarations
  • no field or method references
  • no range statements

Even with these restrictions, the syntax is powerful enough to express all the examples in the original proposal except for the counter example, which is explicitly about fields.

Some advantages that come from these restrictions:

  • one concept per line.

Each line specifies exactly one required behaviour. When there's a more complex relationship between types and methods, generic interfaces can pull their weight. This makes it easy to read contracts and see exactly what a contract requires or allows.

  • reorderable.

There is no ordering relationship between statements in the contract, so it's possible to reorder them arbitrarily without changing the meaning of the contract, making it possible to canonicalise and compare contracts easily.
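
A minimal sketch of what canonicalising might look like, assuming a contract is modelled as a plain list of statement strings (a simplification of mine that ignores parameter renaming and whitespace). The names canonical and sameContract are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonical returns the statements in sorted order, joined by newlines.
// It copies the input first, so the caller's slice is left untouched.
func canonical(stmts []string) string {
	sorted := append([]string(nil), stmts...)
	sort.Strings(sorted)
	return strings.Join(sorted, "\n")
}

// sameContract reports whether two statement lists differ only in order.
func sameContract(a, b []string) bool {
	return canonical(a) == canonical(b)
}

func main() {
	c1 := []string{"a + b", "a < b", "T(a)"}
	c2 := []string{"T(a)", "a + b", "a < b"}
	fmt.Println(sameContract(c1, c2))
}
```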

  • no intermediate implied types.

With arbitrary contracts, it's possible to have many implied-but-unnamed types inside a contract. For example, consider this contract:

contract Indirect(a A, b B) {
	t1 := a.A()
	t2 := t1.M1()
	// any number...
	t3 := t2.M2()
	b = t2.M3()
}

In that code, all of t1, t2 and t3 have types implied by the values returned by a, t1, t2 etc. Any function using that contract can obtain instances of those types but not name them. This seems wrong to me, just as returning unexported types from exported methods is generally a bad idea. Also, the complexity of such a contract is hard to get one's head around!

  • Known cost of range

One thing that the original proposal mentioned was the possibility of using range on generic types. This means that you can have a range operator in a generic function but no idea whether it's ranging over a map, a channel or a slice, all of which have quite different properties. By eliminating range from the set of possible statements, this is no longer an issue.

  • No need for special rules about value methods

Since method calls are not allowed, Go's implicit address-of rule never comes into play, so there should be no need for a special rule saying that value methods are allowed. My suggestion in the Value Methods section above therefore becomes redundant.

  • Straightforward runtime model

If we wish to generate the same code for a function instantiated with several different types, then there has to be some way to allow the function to manipulate these types. With the simpler form of contract, each statement of the contract can be implemented by a function that provides the behaviour for the statement's operation.

In a sense, each statement in the contract is similar to an entry point in an interface.
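
One way to picture this model is a dictionary with one function per contract statement, passed alongside the values. The sketch below assumes a contract with the two statements "a + b" and "a < b" and is written in Go 1.18+ syntax. The generic function touches T only through the dictionary. The names ops and sumAndLargest are hypothetical.

```go
package main

import "fmt"

// ops is a hypothetical dictionary: one entry per contract statement.
type ops[T any] struct {
	add  func(a, b T) T    // implements the statement "a + b"
	less func(a, b T) bool // implements the statement "a < b"
}

// sumAndLargest never applies an operator to T directly; every operation
// goes through the dictionary, so one body serves every instantiation.
func sumAndLargest[T any](d ops[T], xs []T) (sum, largest T) {
	for i, x := range xs {
		if i == 0 {
			sum, largest = x, x
			continue
		}
		sum = d.add(sum, x)
		if d.less(largest, x) {
			largest = x
		}
	}
	return sum, largest
}

func main() {
	intOps := ops[int]{
		add:  func(a, b int) int { return a + b },
		less: func(a, b int) bool { return a < b },
	}
	strOps := ops[string]{
		add:  func(a, b string) string { return a + b },
		less: func(a, b string) bool { return a < b },
	}
	s1, m1 := sumAndLargest(intOps, []int{3, 1, 4})
	s2, m2 := sumAndLargest(strOps, []string{"b", "a", "c"})
	fmt.Println(s1, m1, s2, m2)
}
```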

The lack of explicit field or method references means that there's no need to deal with the ambiguity that field and method references lead to in the original proposal. In contracts as originally proposed, there's no way to tell if we're referring to a method or a field.

@stevenblenkinsop

@rogpeppe wrote:

This is the same restriction we have on interface values now, and it seems to work out OK. It would help to have some concrete examples where this would be useful, but I'm struggling to come up with any. Can you think of one?

It seems like the use case would be very narrow: you want to be able to rely on value semantics (i.e. assignment creates an independent copy of a value, which doesn't work if there's any indirection) but also need mutating operations (i.e. operations that take a pointer receiver). However, in that case, you don't need to be able to accept pointer types as the argument type, anyways, so it's fine that the method set of *T where T is the type parameter will be empty for pointer argument types.

Also, there's no way to require value semantics for the argument type anyways; a type can always include a pointer. If you need types to support copying behaviour, you'd probably need to specify a method for it.
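
As a rough sketch of what specifying a method for copying could look like, here is a version in Go 1.18+ syntax (not the syntax under discussion). The names Cloner, snapshot and Buffer are hypothetical.

```go
package main

import "fmt"

// Cloner is a hypothetical constraint: the type must know how to copy itself.
type Cloner[T any] interface {
	Clone() T
}

// snapshot builds independent copies by calling Clone on each element
// instead of relying on assignment having value semantics.
func snapshot[T Cloner[T]](xs []T) []T {
	out := make([]T, len(xs))
	for i, x := range xs {
		out[i] = x.Clone()
	}
	return out
}

// Buffer contains a slice, so plain assignment would share the bytes.
type Buffer struct{ data []byte }

func (b *Buffer) Clone() *Buffer {
	return &Buffer{data: append([]byte(nil), b.data...)}
}

func main() {
	orig := []*Buffer{{data: []byte("abc")}}
	copies := snapshot(orig)
	copies[0].data[0] = 'X' // does not affect orig
	fmt.Println(string(orig[0].data), string(copies[0].data))
}
```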

The reason I brought it up was because one of the underlying assumptions of the original contracts proposal is that generic code should be written in the same way as non-generic code as much as possible. Of course, we've seen all the ways that creates complications, as the various conveniences provided to non-generic code just make it harder to reason about what code is doing in a generic context.
