@rogpeppe
Last active September 14, 2018 07:18
Go contracts as type structs

Roger Peppe, 2018-09-05

In my previous post, I talked about an issue with a real-world use case for contracts - the fact that all type parameters must be mentioned every time a contract is used in a definition. In this post, I introduce an idea for a possible way to fix this.

The most important feature of contracts, in my view, is that they bind together several types into one unified generic relationship. This means that we can define a function with several type parameters from a contract and we can be assured that they all work together as we expect.

In some sense, those type parameters really are bundled together by a contract. So... what if we changed contracts to be a little more similar to that familiar construct that bundles things together: a struct?

Here's the graph example from the design doc rewritten with this idea in mind. I've rewritten the contract to use interface conversions, because I think it's clearer and less ambiguous that way, but it's not crucial to this idea.

package graph

type Edger(Edge) interface {
    Edges() []Edge
}

type Noder(Node) interface {
    Nodes() (n1, n2 Node)
}

contract Contract(n Node, e Edge) {
    Edger(n)
    Noder(e)
}

type Graph(type G Contract) struct { ... }
func New(type G Contract)(nodes []G.Node) *Graph(G) { ... }
func (*Graph(G)) ShortestPath(from, to G.Node) []G.Edge { ... }

Instead of declaring all the type parameters positionally every time we use a contract, we now use a single identifier. This acts as a kind of "type struct" - we can select different types from it by name using the familiar . operator. It's as if the contract is a type that contains other types, which seems to me to fit nicely with its role as a "bundler" of several types and their relationship to one another.

Note that when a contract is defined, its parameters are a little different from function parameters - not only do the parameters need to be distinct, but the type names do too, so there's no potential for ambiguity here.

Using named rather than positional type parameters means that we don't need to remember which order the type parameters are in. Passing the contract around as a whole makes the code shorter and easier to read.

It has the additional advantage that there's now no syntactic need for type parameters to use only one contract.
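For comparison only, here is a sketch of the same graph using generics as they eventually shipped in Go 1.18, which kept positional type parameters rather than the type-struct idea described here. The `comparable` constraint, the breadth-first search, and the `node`, `edge`, and `connect` helpers are assumptions added for illustration. Note how `Node` and `Edge` are repeated in every declaration, which is the repetition a single `G` would avoid.

```go
package main

import "fmt"

// Edger is satisfied by node types. comparable is an added assumption so
// that nodes can be used as map keys in the search below.
type Edger[Edge any] interface {
	comparable
	Edges() []Edge
}

// Noder is satisfied by edge types.
type Noder[Node any] interface {
	Nodes() (n1, n2 Node)
}

// Graph names both type parameters positionally.
type Graph[Node Edger[Edge], Edge Noder[Node]] struct {
	nodes []Node
}

func New[Node Edger[Edge], Edge Noder[Node]](nodes []Node) *Graph[Node, Edge] {
	return &Graph[Node, Edge]{nodes: nodes}
}

// ShortestPath runs a breadth-first search, treating edges as undirected.
// It returns nil when to is unreachable (or equal to from).
func (g *Graph[Node, Edge]) ShortestPath(from, to Node) []Edge {
	prev := map[Node]Node{}
	via := map[Node]Edge{}
	seen := map[Node]bool{from: true}
	queue := []Node{from}
	for len(queue) > 0 && !seen[to] {
		n := queue[0]
		queue = queue[1:]
		for _, e := range n.Edges() {
			a, b := e.Nodes()
			next := b
			if b == n {
				next = a
			}
			if seen[next] {
				continue
			}
			seen[next] = true
			prev[next], via[next] = n, e
			queue = append(queue, next)
		}
	}
	if !seen[to] {
		return nil
	}
	var path []Edge
	for n := to; n != from; n = prev[n] {
		path = append([]Edge{via[n]}, path...)
	}
	return path
}

// node and edge are hypothetical concrete types.
type node struct {
	name  string
	edges []*edge
}

func (n *node) Edges() []*edge { return n.edges }

type edge struct{ a, b *node }

func (e *edge) Nodes() (n1, n2 *node) { return e.a, e.b }

// connect links two nodes with one shared edge.
func connect(a, b *node) *edge {
	e := &edge{a, b}
	a.edges = append(a.edges, e)
	b.edges = append(b.edges, e)
	return e
}

func main() {
	a, b, c := &node{name: "a"}, &node{name: "b"}, &node{name: "c"}
	connect(a, b)
	connect(b, c)
	g := New[*node, *edge]([]*node{a, b, c})
	for _, e := range g.ShortestPath(a, c) {
		fmt.Println(e.a.name, "-", e.b.name)
	}
}
```

The type arguments to `New` are written out explicitly here so the sketch does not depend on how much a particular compiler version can infer.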

What we haven't covered yet is how we might actually create an instance of G in the first place. We can use a similar syntax to struct literals, except here we're associating type members of the contract with their actual types.

g := graph.New(graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
})([]myNodeType{...})

We could even define this as its own type:

type myGraph = graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
}
g := graph.New(myGraph)([]myNodeType{...})

In cases where type unification fails (for example when we need to specify a type that's in a return parameter), this should allow us to avoid the burden of passing the types each time.
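As a rough analogue in Go as it shipped (not the contract-literal syntax proposed here), the type arguments can be written once in an alias of an instantiated type and in an instantiated function value. The `Adjacency` type, its methods, and the `link` type are hypothetical names invented for this sketch.

```go
package main

import "fmt"

// Adjacency is a hypothetical generic type with two type parameters.
type Adjacency[Node comparable, Edge any] struct {
	out map[Node][]Edge
}

// NewAdjacency takes no ordinary parameters, so its type arguments cannot
// be inferred and must be supplied by the caller.
func NewAdjacency[Node comparable, Edge any]() *Adjacency[Node, Edge] {
	return &Adjacency[Node, Edge]{out: make(map[Node][]Edge)}
}

func (a *Adjacency[Node, Edge]) Add(from Node, e Edge) {
	a.out[from] = append(a.out[from], e)
}

func (a *Adjacency[Node, Edge]) Edges(from Node) []Edge {
	return a.out[from]
}

// link is a hypothetical edge type.
type link struct{ to string }

// myGraph names the instantiation once, using alias syntax.
type myGraph = Adjacency[string, link]

// newMyGraph binds the constructor's type arguments once.
var newMyGraph = NewAdjacency[string, link]

func main() {
	var g *myGraph = newMyGraph()
	g.Add("a", link{to: "b"})
	fmt.Println(len(g.Edges("a")))
}
```

Unlike the type struct above, the alias only covers one generic type at a time; each generic function still needs its own instantiation.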

Let's rewrite the example from my previous post to use this new syntax.

func PrintVariousDetails(type Mgo SessionContract)(s Mgo.Session) {
	db := s.DB("mydb")
	PrintBobDetails(db)
}

func PrintBobDetails(type Mgo SessionContract)(db Mgo.Database) {
	iter := db.C("people").Find(bson.M{
		"name": "bob",
	})
	... etc
}

I hope we can agree that this seems quite a bit nicer than the original.

@PeterRK

PeterRK commented Sep 9, 2018

I hope you can show all your opinions in one article.

@stevenblenkinsop

stevenblenkinsop commented Sep 10, 2018

contract Contract(n N, e E) {
    Node(n)
    Edge(e)
}
...
func (*Graph(G)) ShortestPath(from, to G.Node) []G.Edge { ... }

It's not clear here where the type-member name comes from. In the contract, they have the names N and E, with Node and Edge being interface conformance constraints. Multiple types could conform to the same interface, though. Of course, it would be unfortunate to have to call the member types anything other than G.Node and G.Edge, but if you used those names for the input parameters, the names of the interfaces would be shadowed in the body of the contract, which is a hindrance.

One hazard around an approach like this is that contract instances don't have a nominal identity. You accommodated that in your example⁠—

type myGraph = graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
}
g := graph.New(myGraph)([]myNodeType{...})

—⁠by using alias syntax to avoid creating a nominal type, but because of the pervasiveness of "declared" types in Go, encouraging people to think of contract instances as a type like this might lead to misunderstanding. I suppose it would be easy enough just to specify that "declared" contract instances aren't allowed.

When I've toyed with generics proposals at various times, I've also stumbled across using explicit type bundles like this in order to organize things better. What I always concluded was that, while the benefits of named over positional parameters are nice, it might be too 'weird' compared to other languages' syntax for the comparable feature. This is why my feedback focused on introducing output types, like

contract SessionContract(s Session)(db Database, c Collection, query Query, it Iter) {
    SessionInterface(Database)(s)
    DatabaseInterface(Collection)(db)
    CollectionInterface(Query)(c)
    QueryInterface(Iter)(query)
    Iter(it)
}

func PrintBobDetails(type Session SessionContract)(db SessionContract.Database) {...}

Here, you can infer all the other types based on a Session, so using that as the input type parameter makes sense; you don't need an explicit contract instance type to act as your single input. And, in fact, this will often be the case in Go as long as methods cannot be polymorphic or overloaded, since a given concrete type can only satisfy a generic interface for one set of type arguments. You only need multiple input parameters if you have two or more families of types which don't appear in methods on each other's members.

One challenge here is if you don't actually need the Session type in this instance, you ought to be able to take the Database type as the parameter. It doesn't seem too problematic to require a distinct contract for this, though:

func PrintBobDetails(type Database DatabaseContract)(db Database) {...}

Coming up with syntax to make this easily composable is a challenge, though. Maybe something like:

contract SessionContract(s Session)(db Database, c Collection, query Query, it Iter) {
    SessionInterface(Database)(s)
    DatabaseContract(Database)(Collection, Query, Iter)
}

I suppose you could streamline that by leveraging contract embedding a bit further:

contract SessionContract(s Session)(db Database) {
    SessionInterface(Database)(s)
    DatabaseContract(Database) // the outputs of `DatabaseContract` are implied outputs of `SessionContract`.
}

This has the advantage that you don't have to manually feed through each output type, and the disadvantage that you don't get to manually feed through each output type, making it harder to resolve potential overlaps and less obvious what all the outputs of a contract are.

@rogpeppe
Author

contract Contract(n N, e E) {
    Node(n)
    Edge(e)
}
...
func (*Graph(G)) ShortestPath(from, to G.Node) []G.Edge { ... }

It's not clear here where the type-member name comes from. In the contract, they have the names N and E, with Node and Edge being interface conformance constraints. Multiple types could conform to the same interface, though. Of course, it would be unfortunate to have to call the member types anything other than G.Node and G.Edge, but if you used those names for the input parameters, the names of the interfaces would be shadowed in the body of the contract, which is a hindrance.

Yup, that's wrong. I've updated the code to change it. It is a bit of a pity that the type names shadow the interface names. Maybe there's some way of changing the syntax so we can use dot-qualified names in a contract body, but I can't think of a good one right now.

One hazard around an approach like this is that contract instances don't have a nominal identity. You accommodated that in your example⁠—

type myGraph = graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
}
g := graph.New(myGraph)([]myNodeType{...})

—⁠by using alias syntax to avoid creating a nominal type, but because of the pervasiveness of "declared" types in Go, encouraging people to think of contract instances as a type like this might lead to misunderstanding. I suppose it would be easy enough just to specify that "declared" contract instances aren't allowed.

I think you're wondering what happens if you do:

var x myGraph

I think it might be OK to allow use of contract instances that have only a single type member, but I think it's easiest just to disallow it and say that contract instances can only be used for type definitions.

When I've toyed with generics proposals at various times, I've also stumbled across using explicit type bundles like this in order to organize things better. What I always concluded was that, while the benefits of named over positional parameters are nice, it might be too 'weird' compared to other languages' syntax for the comparable feature. This is why my feedback focused on introducing output types, like

contract SessionContract(s Session)(db Database, c Collection, query Query, it Iter) {
    SessionInterface(Database)(s)
    DatabaseInterface(Collection)(db)
    CollectionInterface(Query)(c)
    QueryInterface(Iter)(query)
    Iter(it)
}

func PrintBobDetails(type Session SessionContract)(db SessionContract.Database) {...}

Here, you can infer all the other types based on a Session, so using that as the input type parameter makes sense; you don't need an explicit contract instance type to act as your single input. And, in fact, this will often be the case in Go as long as methods cannot be polymorphic or overloaded, since a given concrete type can only satisfy a generic interface for one set of type arguments. You only need multiple input parameters if you have two or more families of types which don't appear in methods on each other's members.

I'm not sure I like the special-cased "input" and "output" distinction here. As I see it, the contract argument types are just related to one another - they're all output types in that sense. For example, in a graph with nodes and edges, which is superior?

As for being weird, it's all weird - this is Go and it's new territory. We still have the capability to pass individual, unrelated type parameters, but here we require people to bundle related types into a single thing. I'd like to try to explore how that might turn out in larger programs vs the unnamed argument list approach.

I guess it depends quite a bit on whether people end up making larger contracts containing quite a few types. Here it seemed quite easy to get into that kind of territory.

One thing that I think this design leads towards is the possibility of adding type parameters to contracts without breaking backward compatibility, but that needs further exploration and may not actually be possible.

One challenge here is if you don't actually need the Session type in this instance, you ought to be able to take the Database type as the parameter.

I'm not sure what you're getting at here. Please explain the issue a little more so my small brain can cope :)

@stevenblenkinsop

I think you're wondering what happens if you do:

var x myGraph

I'm more meaning that in

type myGraph1 graph.Contract(...) // no `=` sign
type myGraph2 graph.Contract(...) // no `=` sign

the fact that myGraph1 and myGraph2 are not identical to each other is kind of useless. It only really makes sense to alias contract instances. The reason this matters is that if you can declare (rather than alias) contract instances, then it's no longer possible to infer the unique contract instance based on the inputs to a function, since there can be more than one instance with the same underlying type. I suppose you could just specify that the inference algorithm always infers the literal contract instance since it doesn't really make a difference.

I'm not sure I like the special-cased "input" and "output" distinction here. As I see it, the contract argument types are just related to one another - they're all output types in that sense. For example, in a graph with nodes and edges, which is superior?

You could make them both inputs in that case. Or apply the contract to a Graph type and have the nodes and edges as outputs. Or just have a different contract depending on which way you need the type inference to work, like

contract NodeType(node Node) (edge Edge) { ... }
contract EdgeType(edge Edge) (node Node) { ... }

Since you're already defining generic interfaces for each, those could probably serve in place of these contracts, with the type parameters of the interface being the outputs of the corresponding contract.

I'm not sure what you're getting at here. Please explain the issue a little more so my small brain can cope :)

I mean that you can't infer the session type based on the database type. The type implication only goes one way, where you can be given a session type and can infer the database type based on it, but not vice versa. If your function accepts a database as its input (like PrintBobDetails does), but accepts a session type as its type parameter, you'll never be able to infer the type parameter representing the session type and it will have to be explicitly provided by the caller. But the body of the function doesn't actually use the session type anyways. I'm pointing out that the way around this is to have a distinct DatabaseContract which doesn't require a Session type in the first place.
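To make the layering concrete, here is a sketch in Go as it eventually shipped, not in either syntax under discussion. The interfaces are simplified stand-ins for the mgo types, and `VariousDetails`, `BobDetails`, and the `mem*` types are hypothetical names. The function that only needs a database lists no session type parameter at all.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the session, database and
// collection types in the mgo example.
type Session[D any] interface {
	DB(name string) D
}

type Database[Coll any] interface {
	C(name string) Coll
}

type Collection interface {
	Find(name string) string
}

// BobDetails needs only the database and collection types, so no session
// type appears in its type parameter list.
func BobDetails[D Database[Coll], Coll Collection](db D) string {
	return db.C("people").Find("bob")
}

// VariousDetails starts from a session and passes the derived types on.
func VariousDetails[S Session[D], D Database[Coll], Coll Collection](s S) string {
	db := s.DB("mydb")
	return BobDetails[D, Coll](db)
}

// In-memory implementations, assumed purely for illustration.
type memSession struct{}

func (memSession) DB(name string) memDB { return memDB{name} }

type memDB struct{ name string }

func (d memDB) C(name string) memColl { return memColl{d.name + "." + name} }

type memColl struct{ path string }

func (c memColl) Find(name string) string { return c.path + ":" + name }

func main() {
	fmt.Println(VariousDetails[memSession, memDB, memColl](memSession{}))
}
```

A caller holding only a `memDB` can call `BobDetails[memDB, memColl]` directly, which is the case a session-based type parameter would make awkward.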

@rogpeppe
Author

I think you're wondering what happens if you do:

var x myGraph

I'm more meaning that in

type myGraph1 graph.Contract(...) // no `=` sign
type myGraph2 graph.Contract(...) // no `=` sign

the fact that myGraph1 and myGraph2 are not identical to each other is kind of useless. It only really makes sense to alias contract instances. The reason this matters is that if you can declare (rather than alias) contract instances, then it's no longer possible to infer the unique contract instance based on the inputs to a function, since there can be more than one instance with the same underlying type. I suppose you could just specify that the inference algorithm always infers the literal contract instance since it doesn't really make a difference.

This is perhaps similar to the fact that interface types get distinct identities when they don't technically need to. There's a Go issue for that, and I think I favour using structural equality rather than nominal equality both for contracts and interfaces.

I'm not sure I like the special-cased "input" and "output" distinction here. As I see it, the contract argument types are just related to one another - they're all output types in that sense. For example, in a graph with nodes and edges, which is superior?

You could make them both inputs in that case. Or apply the contract to a Graph type and have the nodes and edges as outputs.

That's a bit different - that means that you'd need an actual value of that type, I think, but that's not really necessary.
I'd like to see the actual code for it.

Or just have a different contract depending on which way you need the type inference to work, like

contract NodeType(node Node) (edge Edge) { ... }
contract EdgeType(edge Edge) (node Node) { ... }

Since you're already defining generic interfaces for each, those could probably serve in place of these contracts, with the type parameters of the interface being the outputs of the corresponding contract.

I'd like to see this spelled out in a bit more detail. What would the contract bodies look like in both those cases, and what would the ShortestPath definition look like?

I'm not sure what you're getting at here. Please explain the issue a little more so my small brain can cope :)

I mean that you can't infer the session type based on the database type. The type implication only goes one way, where you can be given a session type and can infer the database type based on it, but not vice versa. If your function accepts a database as its input (like PrintBobDetails does), but accepts a session type as its type parameter, you'll never be able to infer the type parameter representing the session type and it will have to be explicitly provided by the caller. But the body of the function doesn't actually use the session type anyways. I'm pointing out that the way around this is to have a distinct DatabaseContract which doesn't require a Session type in the first place.

Yes, you could have multiple layered contracts, each one adding a new type parameter. That's pretty much the same as it would be in the current proposal, I think, except at least the types would be named, so the distinction between the 5-tuple type parameters and the 4-tuple type parameters is a little clearer.

@stevenblenkinsop

stevenblenkinsop commented Sep 12, 2018

I'd like to see this spelled out in a bit more detail. What would the contract bodies look like in both those cases, and what would the ShortestPath definition look like?

type Edger(Edge) interface {
    Edges() []Edge
}

type Noder(Node) interface {
    Nodes() (n1, n2 Node)
}

contract NodeContract(node Node) (edge Edge) {
    Edger(Edge)(node)
}

contract EdgeContract(edge Edge) (node Node) {
    Noder(Node)(edge)
}

func ShortestPath(type Node NodeContract)(from, to Node) []NodeContract.Edge { ... }

/* or potentially */ 

func ShortestPath(type Node Edger)(from, to Node) []Edger.Edge { ... }

Yes, you could have multiple layered contracts, each one adding a new type parameter. That's pretty much the same as it would be in the current proposal, I think, except at least the types would be named, so the distinction between the 5-tuple type parameters and the 4-tuple type parameters is a little clearer.

My point is that you only need to accept exactly one type parameter as an input and the rest would be outputs in your mgo example, so positional vs named parameters becomes moot:

func PrintVariousDetails(type Session SessionContract)(s Session) {
	db := s.DB("mydb")
	PrintBobDetails(SessionContract.Database)(db)
}

func PrintBobDetails(type Database DatabaseContract)(db Database) {
	iter := db.C("people").Find(bson.M{
		"name": "bob",
	})
	... etc
}
...
var session SomeSessionType = ...
PrintVariousDetails(SomeSessionType)(session)

@rogpeppe
Author

func ShortestPath(type Node NodeContract)(from, to Node) []NodeContract.Edge { ... }

This seems a bit odd to me. Does that type declaration declare NodeContract as a local identifier so it can be used in that NodeContract.Edge type expression? Presumably that implies that we can't have more than one contract in a function's type parameters (otherwise you'd have an ambiguity when the same contract was used twice). That being so, the only way to have multiple independent types in a contract would be to have multiple inputs to the contract.

For example, say we're implementing our own map type.

contract KeyValue(k Key, v Value) {
	k == k
}

Neither key nor value can be derived from one another, so neither can be an output type from the contract.

So I guess we'd need to define the map type like this:

type Map(type Key, Value KeyValue) struct {
	// unexported fields
}

But this seems very like contracts as they are in the draft proposal right now. It seems to me that any contract of the form:

contract C(A1, A2, ..., An) (B1, B2, ..., Bm)

can be rewritten without loss of generality to:

contract C(A1, A2, ..., An, B1, B2, ..., Bm)

The only distinction between "input" and "output" types is whether one type might be derived from another, and output types are entirely optional. In another sense, they're all "output" types (from the perspective of the body of a generic function) and they're all "input" types (from the perspective of a caller).

Unification can do a fine job of type inference. I'm not convinced that this fairly arbitrary separation into input and output types is going to help much.
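For reference, a sketch of the map example in Go as it eventually shipped, where both type parameters are listed positionally and neither is derived from the other. The built-in `comparable` constraint plays the role of `k == k`; the slice-backed storage and the `Set` and `Get` methods are assumptions made to keep the sketch short.

```go
package main

import "fmt"

// Map is a deliberately simple user-defined map. Key must support ==,
// which is what the contract body `k == k` expresses.
type Map[Key comparable, Value any] struct {
	keys []Key
	vals []Value
}

// Set stores v under k, replacing any existing entry.
func (m *Map[Key, Value]) Set(k Key, v Value) {
	for i := range m.keys {
		if m.keys[i] == k {
			m.vals[i] = v
			return
		}
	}
	m.keys = append(m.keys, k)
	m.vals = append(m.vals, v)
}

// Get returns the value stored under k and whether it was present.
func (m *Map[Key, Value]) Get(k Key) (Value, bool) {
	for i := range m.keys {
		if m.keys[i] == k {
			return m.vals[i], true
		}
	}
	var zero Value
	return zero, false
}

func main() {
	var m Map[string, int]
	m.Set("one", 1)
	m.Set("one", 11)
	v, ok := m.Get("one")
	fmt.Println(v, ok)
}
```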
