@rogpeppe
Last active September 14, 2018 07:18
Go Contracts as type structs

Roger Peppe, 2018-09-05

In my previous post, I talked about an issue with a real world use case for contracts - the fact that all type parameters must be mentioned every time a contract is used in a definition. In this post, I introduce an idea for a possible way to fix this.

The most important feature of contracts, in my view, is that they bind together several types into one unified generic relationship. This means that we can define a function with several type parameters from a contract and we can be assured that they all work together as we expect.

In some sense, those type parameters really are bundled together by a contract. So... what if we changed contracts to be a little more similar to that familiar construct that bundles things together: a struct?

Here's the graph example from the design doc rewritten with this idea in mind. I've rewritten the contract to use interface conversions, because I think it's clearer and less ambiguous that way, but it's not crucial to this idea.

package graph

type Edger(Edge) interface {
    Edges() []Edge
}

type Noder(Node) interface {
    Nodes() (n1, n2 Node)
}

contract Contract(n Node, e Edge) {
    Edger(n)
    Noder(e)
}

type Graph(type G Contract) struct { ... }
func New(type G Contract)(nodes []G.Node) *Graph(G) { ... }
func (*Graph(G)) ShortestPath(from, to G.Node) []G.Edge { ... }

Instead of declaring all the type parameters positionally every time we use a contract, we now use a single identifier. This acts as a kind of "type struct" - we can select different types from it by name using the familiar . operator. It's as if the contract is a type that contains other types, which seems to me to fit nicely with its role as a "bundler" of several types and their relationship to one another.

Note that when a contract is defined, its parameters are a little different from function parameters - not only do the parameters need to be distinct, but the type names do too, so there's no potential for ambiguity here.

Using named rather than positional type parameters means that we don't need to remember which order the type parameters are in. Passing the contract around as a whole makes the code shorter and easier to read.
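For contrast, here is a minimal sketch of the positional style, written with the type-parameter syntax that Go 1.18 later shipped (which has no contracts). It is not the syntax proposed in this post. The names city, road, buildGraph and EdgeCount are hypothetical, and EdgeCount merely stands in for ShortestPath to keep the example short.

```go
package main

import "fmt"

// Edger and Noder mirror the interfaces above, using Go 1.18 syntax.
type Edger[Edge any] interface {
	Edges() []Edge
}

type Noder[Node any] interface {
	Nodes() (n1, n2 Node)
}

// Graph has to list both type parameters, in order, wherever it is used.
type Graph[Node Edger[Edge], Edge Noder[Node]] struct {
	nodes []Node
}

func New[Node Edger[Edge], Edge Noder[Node]](nodes []Node) *Graph[Node, Edge] {
	return &Graph[Node, Edge]{nodes: nodes}
}

// EdgeCount is a hypothetical stand-in for ShortestPath. It sums the
// edges reported by every node, so a shared edge is counted twice.
func (g *Graph[Node, Edge]) EdgeCount() int {
	total := 0
	for _, n := range g.nodes {
		total += len(n.Edges())
	}
	return total
}

// city and road are hypothetical concrete node and edge types.
type city struct {
	name  string
	roads []*road
}

func (c *city) Edges() []*road { return c.roads }

type road struct{ from, to *city }

func (r *road) Nodes() (n1, n2 *city) { return r.from, r.to }

// buildGraph wires two cities together with a single road.
func buildGraph() *Graph[*city, *road] {
	a, b := &city{name: "a"}, &city{name: "b"}
	r := &road{from: a, to: b}
	a.roads = []*road{r}
	b.roads = []*road{r}
	// Both type arguments are passed positionally.
	return New[*city, *road]([]*city{a, b})
}

func main() {
	fmt.Println(buildGraph().EdgeCount()) // prints 2
}
```

Every declaration repeats the pair (Node, Edge) in the same order, which is the repetition a single named G is meant to remove.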

It has the additional advantage that there's now no syntactic need for type parameters to use only one contract.

What we haven't covered yet is how we might actually create an instance of G in the first place. We can use a similar syntax to struct literals, except here we're associating type members of the contract with their actual types.

g := graph.New(graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
})([]myNodeType{...})

We could even define this as its own type:

type myGraph = graph.Contract{
    Node: myNodeType,
    Edge: myEdgeType,
}
g := graph.New(myGraph)([]myNodeType{...})

In cases where type unification fails (for example when we need to specify a type that's in a return parameter), this should allow us to avoid the burden of passing the types each time.
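A rough analogue of naming an instantiation once can be written in Go 1.18 and later with a type alias. This is only a sketch of the idea, not the contract-literal syntax above. Cache, NewCache, userAges and newUserAges are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Cache is a hypothetical generic container.
type Cache[K comparable, V any] struct {
	items map[K]V
}

// NewCache mentions K and V only in its result type, so they cannot be
// inferred from arguments and must be written out by the caller.
func NewCache[K comparable, V any]() *Cache[K, V] {
	return &Cache[K, V]{items: make(map[K]V)}
}

func (c *Cache[K, V]) Put(k K, v V) { c.items[k] = v }

func (c *Cache[K, V]) Get(k K) (V, bool) {
	v, ok := c.items[k]
	return v, ok
}

// userAges names one particular instantiation, so the type arguments
// are spelled out in one place only.
type userAges = Cache[string, int]

// newUserAges hides the explicit type arguments behind a helper.
func newUserAges() *userAges { return NewCache[string, int]() }

func main() {
	ages := newUserAges()
	ages.Put("bob", 42)
	age, ok := ages.Get("bob")
	fmt.Println(age, ok) // prints 42 true
}
```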

Let's rewrite the example from my previous post to use this new syntax.

func PrintVariousDetails(type Mgo SessionContract)(s Mgo.Session) {
	db := s.DB("mydb")
	PrintBobDetails(db)
}

func PrintBobDetails(type Mgo SessionContract)(db Mgo.Database) {
	iter := db.C("people").Find(bson.M{
		"name": "bob",
	})
	... etc
}

I hope we can agree that this seems quite a bit nicer than the original.
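To make the comparison concrete, here is a sketch of the positional form in Go 1.18 syntax, where each function threads every related type through its type parameter list. The interfaces and the in-memory types (memSession, memDatabase, memCollection) are hypothetical stand-ins and are not the mgo API. The functions count matches instead of printing, so the result is easy to check.

```go
package main

import "fmt"

// Collection, Database and Session are hypothetical stand-ins for the
// related mgo-like types discussed above.
type Collection interface {
	Count(name string) int
}

type Database[Coll Collection] interface {
	C(name string) Coll
}

type Session[DB any] interface {
	DB(name string) DB
}

// CountBobs only needs the database and collection types.
func CountBobs[D Database[C], C Collection](db D) int {
	return db.C("people").Count("bob")
}

// CountVarious has to mention all three types, in order.
func CountVarious[S Session[D], D Database[C], C Collection](s S) int {
	db := s.DB("mydb")
	return CountBobs[D, C](db)
}

// In-memory implementations, assumed purely for illustration.
type memCollection struct{ names []string }

func (c memCollection) Count(name string) int {
	n := 0
	for _, s := range c.names {
		if s == name {
			n++
		}
	}
	return n
}

type memDatabase struct{ colls map[string]memCollection }

func (d memDatabase) C(name string) memCollection { return d.colls[name] }

type memSession struct{ dbs map[string]memDatabase }

func (s memSession) DB(name string) memDatabase { return s.dbs[name] }

func newSession() memSession {
	people := memCollection{names: []string{"bob", "alice", "bob"}}
	db := memDatabase{colls: map[string]memCollection{"people": people}}
	return memSession{dbs: map[string]memDatabase{"mydb": db}}
}

func main() {
	n := CountVarious[memSession, memDatabase, memCollection](newSession())
	fmt.Println(n) // prints 2
}
```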

@stevenblenkinsop

I think you're wondering what happens if you do:

var x myGraph

I'm more meaning that in

type myGraph1 graph.Contract(...) // no `=` sign
type myGraph2 graph.Contract(...) // no `=` sign

the fact that myGraph1 and myGraph2 are not identical to each other is kind of useless. It only really makes sense to alias contract instances. The reason this matters is that if you can declare (rather than alias) contract instances, then it's no longer possible to infer the unique contract instance based on the inputs to a function, since there can be more than one instance with the same underlying type. I suppose you could just specify that the inference algorithm always infers the literal contract instance since it doesn't really make a difference.

I'm not sure I like the special-cased "input" and "output" distinction here. As I see it, the contract argument types are just related to one another - they're all output types in that sense. For example, in a graph with nodes and edges, which is superior?

You could make them both inputs in that case. Or apply the contract to a Graph type and have the nodes and edges as outputs. Or just have a different contract depending on which way you need the type inference to work, like

contract NodeType(node Node) (edge Edge) { ... }
contract EdgeType(edge Edge) (node Node) { ... }

Since you're already defining generic interfaces for each, those could probably serve in place of these contracts, with the type parameters of the interface being the outputs of the corresponding contract.

I'm not sure what you're getting at here. Please explain the issue a little more so my small brain can cope :)

I mean that you can't infer the session type based on the database type. The type implication only goes one way, where you can be given a session type and can infer the database type based on it, but not vice versa. If your function accepts a database as its input (like PrintBobDetails does), but accepts a session type as its type parameter, you'll never be able to infer the type parameter representing the session type and it will have to be explicitly provided by the caller. But the body of the function doesn't actually use the session type anyways. I'm pointing out that the way around this is to have a distinct DatabaseContract which doesn't require a Session type in the first place.

@rogpeppe

I think you're wondering what happens if you do:

var x myGraph

I'm more meaning that in

type myGraph1 graph.Contract(...) // no `=` sign
type myGraph2 graph.Contract(...) // no `=` sign

the fact that myGraph1 and myGraph2 are not identical to each other is kind of useless. It only really makes sense to alias contract instances. The reason this matters is that if you can declare (rather than alias) contract instances, then it's no longer possible to infer the unique contract instance based on the inputs to a function, since there can be more than one instance with the same underlying type. I suppose you could just specify that the inference algorithm always infers the literal contract instance since it doesn't really make a difference.

This is perhaps similar to the fact that interface types get distinct identities when they don't technically need to. There's a Go issue for that, and I think I favour using structural equality rather than nominal equality both for contracts and interfaces.
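The interface behaviour referred to here can be observed in today's Go with the reflect package. The sketch below uses hypothetical names (Shape1, Shape2, ShapeA, ShapeB, typeOf) and relies on generics only for the small helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// Two defined interface types with the same method set are distinct types.
type Shape1 interface{ Area() float64 }
type Shape2 interface{ Area() float64 }

// Two aliases for the same interface literal denote one identical type.
type ShapeA = interface{ Area() float64 }
type ShapeB = interface{ Area() float64 }

// typeOf returns the reflect.Type for T, including interface types.
func typeOf[T any]() reflect.Type {
	return reflect.TypeOf((*T)(nil)).Elem()
}

func main() {
	// Nominal identity: the defined types differ...
	fmt.Println(typeOf[Shape1]() == typeOf[Shape2]()) // prints false
	// ...even though values of one are assignable to the other.
	fmt.Println(typeOf[Shape1]().AssignableTo(typeOf[Shape2]())) // prints true
	// Aliases compare structurally, so these are the same type.
	fmt.Println(typeOf[ShapeA]() == typeOf[ShapeB]()) // prints true
}
```

The difference shows up in composite types: a []Shape1 cannot be assigned to a []Shape2 variable, whereas []ShapeA and []ShapeB are interchangeable.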

I'm not sure I like the special-cased "input" and "output" distinction here. As I see it, the contract argument types are just related to one another - they're all output types in that sense. For example, in a graph with nodes and edges, which is superior?

You could make them both inputs in that case. Or apply the contract to a Graph type and have the nodes and edges as outputs.

That's a bit different - that means that you'd need an actual value of that type, I think, but that's not really necessary.
I'd like to see the actual code for it.

Or just have a different contract depending on which way you need the type inference to work, like

contract NodeType(node Node) (edge Edge) { ... }
contract EdgeType(edge Edge) (node Node) { ... }

Since you're already defining generic interfaces for each, those could probably serve in place of these contracts, with the type parameters of the interface being the outputs of the corresponding contract.

I'd like to see this spelled out in a bit more detail. What would the contract bodies look like in both those cases, and what would the ShortestPath definition look like?

I'm not sure what you're getting at here. Please explain the issue a little more so my small brain can cope :)

I mean that you can't infer the session type based on the database type. The type implication only goes one way, where you can be given a session type and can infer the database type based on it, but not vice versa. If your function accepts a database as its input (like PrintBobDetails does), but accepts a session type as its type parameter, you'll never be able to infer the type parameter representing the session type and it will have to be explicitly provided by the caller. But the body of the function doesn't actually use the session type anyways. I'm pointing out that the way around this is to have a distinct DatabaseContract which doesn't require a Session type in the first place.

Yes, you could have multiple layered contracts, each one adding a new type parameter. That's pretty much the same as it would be in the current proposal, I think, except at least the types would be named, so the distinction between the 5-tuple type parameters and the 4-tuple type parameters is a little clearer.

@stevenblenkinsop

stevenblenkinsop commented Sep 12, 2018

I'd like to see this spelled out in a bit more detail. What would the contract bodies look like in both those cases, and what would the ShortestPath definition look like?

type Edger(Edge) interface {
    Edges() []Edge
}

type Noder(Node) interface {
    Nodes() (n1, n2 Node)
}

contract NodeContract(node Node) (edge Edge) {
    Edger(Edge)(node)
}

contract EdgeContract(edge Edge) (node Node) {
    Noder(Node)(edge)
}

func ShortestPath(type Node NodeContract)(from, to Node) []NodeContract.Edge { ... }

/* or potentially */ 

func ShortestPath(type Node Edger)(from, to Node) []Edger.Edge { ... }

Yes, you could have multiple layered contracts, each one adding a new type parameter. That's pretty much the same as it would be in the current proposal, I think, except at least the types would be named, so the distinction between the 5-tuple type parameters and the 4-tuple type parameters is a little clearer.

My point is that you only need to accept exactly one type parameter as an input and the rest would be outputs in your mgo example, so positional vs named parameters becomes moot:

func PrintVariousDetails(type Session SessionContract)(s Session) {
	db := s.DB("mydb")
	PrintBobDetails(SessionContract.Database)(db)
}

func PrintBobDetails(type Database DatabaseContract)(db Database) {
	iter := db.C("people").Find(bson.M{
		"name": "bob",
	})
	... etc
}
...
var session SomeSessionType = ...
PrintVariousDetails(SomeSessionType)(session)

@rogpeppe

func ShortestPath(type Node NodeContract)(from, to Node) []NodeContract.Edge { ... }

This seems a bit odd to me. Does that type declaration declare NodeContract as a local identifier so it can be used in that NodeContract.Edge type expression? Presumably that implies that we can't have more than one contract in a function's type parameters (otherwise you'd have an ambiguity when the same contract was used twice). That being so, the only way to have multiple independent types in a contract would be to have multiple inputs to the contract.

For example, say we're implementing our own map type.

contract KeyValue(k Key, v Value) {
	k == k
}

Neither key nor value can be derived from one another, so neither can be an output type from the contract.

So I guess we'd need to define the map type like this:

type Map(type Key, Value KeyValue) struct {
	// unexported fields
}
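For reference, the same two independent type parameters can be sketched in Go 1.18 syntax, where the built-in comparable constraint plays the role of the k == k requirement. The slice-backed storage and the Set and Get methods are assumptions made for illustration only.

```go
package main

import "fmt"

// entry pairs one key with one value.
type entry[K comparable, V any] struct {
	key K
	val V
}

// Map is a deliberately naive, slice-backed map. Key and Value are
// independent: neither type can be derived from the other.
type Map[K comparable, V any] struct {
	entries []entry[K, V]
}

// Set stores v under k, replacing any existing value. The == on keys is
// what the comparable constraint permits.
func (m *Map[K, V]) Set(k K, v V) {
	for i := range m.entries {
		if m.entries[i].key == k {
			m.entries[i].val = v
			return
		}
	}
	m.entries = append(m.entries, entry[K, V]{key: k, val: v})
}

// Get returns the value stored under k and whether it was present.
func (m *Map[K, V]) Get(k K) (V, bool) {
	for _, e := range m.entries {
		if e.key == k {
			return e.val, true
		}
	}
	var zero V
	return zero, false
}

func main() {
	var m Map[string, int]
	m.Set("a", 1)
	m.Set("a", 2)
	v, ok := m.Get("a")
	fmt.Println(v, ok) // prints 2 true
}
```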

But this seems very like contracts as they are in the draft proposal right now. It seems to me that any contract of the form:

contract C(A1, A2, ..., An) (B1, B2, ..., Bn)

can be rewritten without loss of generality to:

contract C(A1, A2, ..., An, B1, B2, ..., Bn)

The only distinction between "input" and "output" types is whether one type might be derived from another, and output types are entirely optional. In another sense, they're all "output" types (from the perspective of the body of a generic function) and they're all "input" types (from the perspective of a caller).

Unification can do a fine job of type inference. I'm not convinced that this fairly arbitrary separation into input and output types is going to help much.
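As a small illustration of inference with no input or output marking, here is a sketch in Go 1.18 syntax in which both type parameters are inferred from a single argument. Keys and sortedStringKeys are hypothetical helpers, and this shows the inference Go later shipped, not the draft's unification rules.

```go
package main

import (
	"fmt"
	"sort"
)

// Keys returns the keys of m in unspecified order. Neither K nor V is
// marked as an input or an output.
func Keys[K comparable, V any](m map[K]V) []K {
	keys := make([]K, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

// sortedStringKeys calls Keys without any explicit type arguments:
// K = string and V = int are both inferred from the argument type.
func sortedStringKeys(m map[string]int) []string {
	keys := Keys(m)
	sort.Strings(keys)
	return keys
}

func main() {
	fmt.Println(sortedStringKeys(map[string]int{"b": 2, "a": 1})) // prints [a b]
}
```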
