Skip to content

Instantly share code, notes, and snippets.

@rogpeppe
Last active November 3, 2021 16:14
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rogpeppe/f9216e5f2b1c99fc13108825dbda6181 to your computer and use it in GitHub Desktop.
Save rogpeppe/f9216e5f2b1c99fc13108825dbda6181 to your computer and use it in GitHub Desktop.

Go generics at runtime (part 1 of 2)

From the design document:

Generic functions, rather than generic types, can probably be compiled using an interface-based approach. That will optimize compile time, in that the package is only compiled once, but there will be some run-time cost.

Generic types may most naturally be compiled multiple times for each set of type arguments. This will clearly carry a compile time cost, but there shouldn’t be any run-time cost. Compilers can also choose to implement generic types similarly to interface types, using special purpose methods to access each element that depends on a type parameter.

One issue with C++ is the huge size of binaries because every generic function is expanded once for each set of type parameters. I would argue that this is something we don't want to see in Go, and also that, far from having no runtime cost, there are significant runtime costs - the size of the compiled binary is one, but also the fact that having more code means you're less likely to fit within the processor's instruction cache.

In this document and a followup, I'd like to explore some possible runtime implications of the new Go generics design and how shared-code generic functions might be implemented. I'm assuming for the purpose of argument that my suggestions here are implemented.

First, let's take an overview of how generic types work at a high level.

Whenever you use a value in your program, the things you can do with it depends on the static type of the value - that is, the type that the compiler knows about (by contrast with the dynamic type, a type associated with the value that's known only at runtime).

Static types provided by the Go language include all the usual concrete types (integers, floats, slices, channels) etc but also they include interface types.

When you use the type assertion (x.(T)) operator, you're asking the Go runtime to look at the dynamic type of v, decide whether it's compatible with T, and if so, return it as a value with that static type.

We're already familiar with interface values, which hold the dynamic type alongside the value itself. Such values can be stored anywhere any other value can be stored, and the type always stays alongside the value, which means that they come with an inevitable runtime cost: the size of an interface value is larger than the size of the dynamic value it holds becasue it also has to hold the type information.

By contrast, generic values (also known as type-parameterized values) do not hold the type as part of the value. Instead, the type is passed as a type parameter. When you call a generic function, you also pass in the types of the generic values that you're passing in, just like you'd pass any parameter to a function. (Note: we can often optimize away the actual runtime overhead of this - see later - but the result is the same).

So when you use the type assertion operator on value with a generic type, the Go runtime can determine its dynamic type not by looking at the value itself, but by looking at its type parameter on the stack. This means that generic values can be exactly the same size as their dynamic type.

This is huge: it means that any data structures containing generic types can be entirely interchangeable with data structures containing their non-generic dynamic values. So you can have slices of generic values that are exactly the same as the corresponding slices of non-generic values, which is something we can't do with the interface values we have in Go today.

To illustrate, here's a program that prints 1 and 2.

func main() {
	printAny(1, 2)
}

func printAny(x, y interface{}) {
	fmt.Println(x, y)
}

And here's a logical representation of how that looks to the runtime.

generic-func-frame1

Note that each interface value has its own associated type.

Here's the same program using a generically typed function:

func main() {
	printAny(1, 2)
}

func printAny(type T)(x, y T) {
	fmt.Println(x, y)
}

See how this looks:

generic-func-frame2

We can see that the type is held separately to the values themselves - the type is held only once for both values. There is a one-to-many relationship from the generic types and the values they characterize.

Despite that, the concept is still the same: a dynamically typed value, whether it's generic or stored in an interface value, is the logical pairing of the dynamic value and its type. In this sense, both generic values and interface values are the same.

This is why we can use interface types to restrict the possible types of both generic and interface values - they're restricting exactly the same thing.

Consider the following program:

type Fooer interface {
	Foo() int
}

type myFooer struct{}

func (myFooer) Foo()

func main() {
	printFooer1(myFooer{})
	printFooer2(myFooer{})
}

func printFooer1(f Fooer) {
	fmt.Println(f.Foo())
}

func printFooer2(type T Fooer)(f T) {
	fmt.Println(f.Foo())
}

In printFooer1, the Fooer interface is used to restrict an individual interface value. A value in the parameter list is qualified with the static Fooer type.

In printFooer2, the Fooer interface is used to restrict a type that is then used to qualify the parameter value.

The end result is similar though, we know that the dynamic values must implement Foo.

The generics design, however, also adds the notion of contracts. A contract goes beyond an interface - it can associate multiple types, and it also allows the use of operators with dynamic values.

Because an interface values can hold only exactly one type, and most operators take more than one operand (and usually require the arguments to be of the same type), we cannot use a contract to restrict an interface type. We can only use contracts to restrict generic type parameters.

In my next post, I'll hope to talk about how we might actually go about compiling code for generic functions in Go and try to work around Russ Cox's generic dilemma.

@Merovius
Copy link

Merovius commented Sep 3, 2018

So when you use the type assertion operator on value with a generic type, the Go runtime can determine its dynamic type not by looking at the value itself, but by looking at its type parameter on the stack. This means that generic values can be exactly the same size as their dynamic type.

This is huge: it means that any data structures containing generic types can be entirely interchangeable with data structures containing their non-generic dynamic values. So you can have slices of generic values that are exactly the same as the corresponding slices of non-generic values, which is something we can't do with the interface values we have in Go today.

I'm not sure about this reasoning. For one, if you pass (say) an []int as an interface{Get(i int) int, Set(i, v int), …}, instead of a []interface{…}, the difference goes away. So this isn't really an argument for or against boxing but just about how to do boxing.

The other thing is, that the main downside of interfaces (I guess) is that they need to be copied to maintain the semantics of interfaces. That issue doesn't actually go away with generics. If i call Foo(type T ctrct) (a, b T) as Foo(int)(x, y), then x and y need to be copied, so that Foo doesn't modify it.

I.e. I don't really see this difference you are painting as a difference. It's an implementation detail of how you implement generics, not a fundamental property of the semantics of interfaces and generic arguments. To me, what you are describing here are what I'd call boxed generics. In a sense, the compiler can look at the contract of a generic function and generate an implicit interface containing methods for all the operations needed by the generic function. When instantiating the generic function and passing arguments, the compiler can then generate the implementation implicitly (similar to embedding) for the generic function to be invoked. You still end up with only one type descriptor on the stack; but it describes the entire contract, not just the type parameter T.

I'm sure you're aware of all of this. I'm just not really sure where you're going with this. To me, what you are describing here is exactly the "interface based approach" that is hinted at in the design.

@rogpeppe
Copy link
Author

rogpeppe commented Sep 3, 2018

I don't really see this difference you are painting as a difference. It's an implementation detail of how you implement generics, not a fundamental property of the semantics of interfaces and generic arguments.

I believe it really is a fundamental difference between universal (type parameter) and existential (interface value) types.
With universal types, a constant number of known types (on the stack) parameterizes all dynamically typed objects reachable from the stack; with existential types, each dynamically typed object on the stack or heap parameterizes itself.

@Merovius
Copy link

Merovius commented Sep 3, 2018

I'm not sure what you are saying. Are you agreeing or disagreeing? In particular re "this essentially describes the interface based approach mentioned in the design"? In any case, it's exactly what I mean when I talk about a boxing approach, but I think I mentioned that before ^^

@rogpeppe
Copy link
Author

rogpeppe commented Sep 3, 2018

AIUI when people say "boxing" they mean that every individual value lives in its own box, not that there is a box around everything, which is what I think you're describing. The latter is just one implementation choice for type parameterization (the other extreme being generating different code for every possible set of type parameters).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment