In my previous post, I talked about some of the underlying runtime concepts behind generics in Go.
Here I'll go a bit deeper into how we might, in principle, be able to compile a generic function once for many instantiated types.
Before we can start compiling functions, we'd like to know what the type parameters are to the functions. If a generic function is only ever called with a single type, there wouldn't be much point in compiling it with generic code that could work with any type at all.
There are two crucial restriction with generic types that make it possible to determine the set of all possible type parameters at compile time:
- there is no way to take a type from a dynamic value and turn it into a type parameter.
In an interface value, the type lives inside the value, but there's no way to get at that type without naming an existing type that is compatible with it.
- it is not possible to have dynamic values or closures that are generic.
Generic methods are not allowed, and there's no way to have a value (as opposed to a type or a top level function) that has a type parameter.
For example, consider this program:
package main
import "fmt"
func main() {
foo(43)
foo("hello")
}
func foo(type T)(x T) {
bar(&x)
bar(make(chan T))
}
func bar(type T)(x T) {
fmt.Println(x)
}
The arcs of the call graph look something like this:
main() -> foo(int)
main() -> foo(string)
foo(T) -> bar(*T)
foo(T) -> bar(chan T)
By traversing the call graph from the root (main) and keeping track of the types we end up with, we can determine all the possible type parameters for each function. Pseudo-code for a naive algorithm might look something like this:
addInstantiations(function, types) {
add (function, types) to set of instantiations
for each generic call in function body {
addInstantiations(target of call, types passed to call)
}
}
We'd obtain the following set of instantiations:
foo(int)
foo(string)
bar(*int)
bar(chan int)
bar(*string)
bar(chan string
One problem with this is that it falls down when functions are recursive, because the algorithm will never terminate. In fact, as is pointed out in the "unfortunate" BigArray example in the Go generics overview, the final set of types might be infinite.
That's reasonable straightforward to solve though. We can make it illegal to create cycles like this that can result in an arbitrarily large set of types. That is, if we find a recursive call where the types aren't the same as the types passed to the original call, we can give a compiler error. A modified algorithm might look something like this:
addInstantiations(function, types, stack) {
add (function, types) to set of instantiations
if function is in stack {
if types in stack are not the same as types {
compiler error: "infinite type cycle"
}
return
}
for each generic call in function body {
push (function, types) onto stack
addInstantiations(target of call, types passed to call, stack)
pop stack
}
}
Now that we've determined all the types that a generic function can be bound to, we have to decide what code to generate.
There is one easy option: each item in the set of instantiations gets compiled separately, so in the previous example, instead of a single compiled instance of main, foo and bar, we'd have main, two copies of foo and four copies of bar.
In a large program with many type combinations, this approach can result in huge binaries and runtime inefficiency when instruction caches overflow. So we'd like to avoid it, if possible.
Another possibility is that we could merge some generated implementations for types where we can generate identical code, with small runtime cost.
Specifically, it should be possible to merge two instantiations if all their argument types have the same size and pointer layout.
For example, let's look at our earlier example with this in mind. On a 64 bit architecure, the sizes and pointer maps look like this:
Instantiation | Size | Pointer map |
---|---|---|
foo(int) | 8 | [false] |
foo(string) | 16 | [true, false] |
bar(*int) | 8 | [true] |
bar(chan int) | 8 | [true] |
bar(*string) | 8 | [true] |
bar(chan string) | 8 | [true] |
We can see although bar
is invoked with 4 different type parameters, they all have
the same size and pointer map. As long as bar only copies the value around or
creates new zero instances of it (and that's all the contract of T allows), we can
generate only one instance of it.
The generated instance might look like this:
func bar(x unsafe.Pointer) {
fmt.Println(interfaceValue{
word: x,
})
}
However, this has omitted one important detail - bar
calls fmt.Println
,
which requires that we take the generically typed x
value and turn
it into an interface value. For that we need its type, which the above implementation
does not have access to.
Naively, one might think that the caller of a generic function could pass
any generic types as explicit parameter. So the generated bar
code might look like this:
func bar(T reflect.Type, x unsafe.Pointer) {
fmt.Println(interfaceValue{
typ: T,
word: x,
})
}
There's a problem with that though, which is that functions have access to
more types than just those in its type parameters. For example, the above
foo
function is passed the single type T but uses that to create both *T
and chan T
.
Theoretically a function could use the reflect package's functions (SliceOf
, StructOf
, etc)
to dynamically construct the types, but this would be prohibitively inefficient.
But there's a neat alternative. As we've already generated a table of all the possible instantations of each function, we can store some representation of that table in a global variable. Then all the static information about a particular instantiation can be obtained with just one integer index into the table.
The whole program might now look something like this. Note that when we've generated separate versions of a function, there's no need to pass it an instance index because each version "knows" which set of types it was instantiated with.
package main
import "fmt"
func main() {
foo__int(43)
foo__string("hello")
}
func foo__int(x int) {
bar(0, &x)
bar(1, make(chan int))
}
func foo__string(x string) {
bar(2, &x)
bar(3, make(chan string))
}
type instance {
types []reflect.Type
}
var _barInstances = []instance{{
types: []reflect.Type{
reflect.TypeOf((*int)(nil)),
},
}, {
types: []reflect.Type{
reflect.TypeOf((chan int)(nil)),
},
}, {
types: []reflect.Type{
reflect.TypeOf((*string)(nil)),
},
}, {
types: []reflect.Type{
reflect.TypeOf((chan string)(nil)),
},
}}
func bar(_instance int, x unsafe.Pointer) {
fmt.Println(interfaceValue{
typ: _barInstances[_instance].types[0],
word: x,
})
}
What happens if bar
calls another generic function that has a shared implementation?
We can add more information to the instance
type above. Given an instance
and a specific call in the function, we know which function is being called, which
tells us which instance of that function we require and hence the instance index to
use when calling that function.
type instance {
types []reflect.Type
// calls holds the instance index for each call site.
calls []int
}
When a function has both shared and non-shared instances, we'd need to generate a stub for the non-shared instance to absorb the unneeded instance index parameter.
Being able to have shared instances of generic functions is good, but there's one important omission so far. We can't do anything with the generic values.
Luckily, the instance table global variable can be used here too. Inside the instance struct we can store a set of function pointers, one for each operation that the function's contracts allow. Consider the following code, for example:
contract comparer(t T) {
t == t
}
func indexOf(type T comparer)(xs []T, v T) int {
for i, x := range xs {
if x == v {
return i
}
}
return -1
}
The only operation allowed by the contract is ==
, and that's easily represented
with a function; for example:
func intEqual(a, b int) int {
return a == b
}
When the contract allows an interface type, there's no need to generate a new function - we already know how to generate runtime call tables for interfaces and the generated call can just invoke that.
For more complex operations permitted by the new generics design, such as range
and field
indexing, it may not be possible to express them as a function call. In this
case it's always possible to fall back to generating one instance per set of type parameters.
I believe they should be disallowed but that's a matter for further discussion.
In summary, I believe that we can use shared implementations for Go generic functions, solving Russ's generic dilemma, and that the runtime cost can be very small.
You make a perfectly valid point. Note that this is exactly what any implementation of generics that expands code inline must do, though.
I think it would be possible to compile once for all type, but it would be rather inefficient.
For example, when type sizes are unknown at compile time, either the stack frame must be variable in size (which I believe is currently a no-no) or all the values must be allocated as pointers from the stack. Moreover instructions which in non-generic code would be either immediate (e.g. field lookup) or direct (array index) addressing would need to obtain their offsets from the instance table. It also means that any operation on a generic value (including arithmetic operations) will incur the overhead of a function call. Maybe this might turn out OK for many cases, but it would definitely have significant impact in tight loops. Compiling once for all types is definitely an option, but I don't think it should be the main one.
Another possibility is to compile a small number of versions of the code, one general version that can handle any type, and maybe one more for a single pointer and/or one for a single word-sized value. Then if you call with any other set of types than which it's compatible, you get the slow version. You could use the clauses in the contract to guide which size/pointer-map combinations to optimize for.
Note that inline functions are already compiled with inter-package analysis - that is, the source code is available when the compiler sees the call. So that could happen with generic functions too, although by contrast with inlinable functions, generic functions may well be large.
In summary, I guess from your perspective, yes, we've still got the generics dilemma, but I think that's fundamental. At least we have the possibility to move the (code bloat, slow compiler, fast execution) <-> (small code, fast compiler, slow execution) slider in either direction if we choose.