Skip to content

Instantly share code, notes, and snippets.

@faiface
Last active January 23, 2020 04:53
Show Gist options
  • Save faiface/e5f035f46e88e96231c670abf8cab63f to your computer and use it in GitHub Desktop.
Save faiface/e5f035f46e88e96231c670abf8cab63f to your computer and use it in GitHub Desktop.
Go 2 generics counterproposal: giving up restricting types

Go 2 generics counterproposal: giving up restricting types

"I want to make a generic function which works on any type that satisfies these and these constraints."

When we think about generics, this is the sort of problem that pops up into our minds. The critical part is restricting the set of types our function intends to work on. Trying to solve this problem of restriction has led people to what I call reasons why we hesitate to have generics in Go.

C++ templates with horrific error messages, or even it's new concepts, Java's T extends Comparable<T>, Rust's generic traits, Haskell's type classes, and sadly even contracts from the original generics proposal by the Go Team. Don't get me wrong, all of these approaches work and they work fairly well. But neither of them is very Go-like. All of them lack the simplicity and clarity that we value so much in Go.

I want you to consider an idea. The idea is: we can get 80% of the benefits of generics with 20% of the effort if we just drop the need to restrict types. You won't be able to make a function that only accepts orderable values, you won't be able to only accept types that implement an interface. You will only be able to define generic functions or types that work on any type whatsoever. (But, you'll be able to use == and friends.)

Of course, you may not believe this idea. It probably seems restrictive. But let me ask this question: What do we really miss when we miss generics in Go? We'll get back to this question at the end.

In this proposal, I'm going to try and show what we can achieve when we accept this idea and how simple, clear, and beautiful generics could be in Go.

Let's get to the specifics.

Generic functions

Let's start with the simplest generic function: the identity function. It just takes a value and returns it back. This algorithm obviously works the same for any type.

// this has a problem
func Identity(x T) T {
    return x
}

There's one problem with this code. We can't tell whether T is a generic type or is defined somewhere else.

The gen keyword

Here's how we solve the problem! We mark the first occurrence of a generic type with a new gen keyword. That, of course, stands for generic:

func Identity(x gen T) T {
    return x
}

And that's it! This would be a valid code under this proposal.

Rule. Mark the first (leftmost) and only the first occurrence of a generic type with the gen keyword in the signature by putting it in front of the type: T becomes gen T.

Rule. The identifier after gen is visible in the whole rest of the function (signature and body). The precise rules of shadowing need to be thought through. So far, I'd say that the generic identifier shadows any package-level identifier, and cannot be shadowed in the body of the function.

Here are a few more valid signatures:

func Sort(a []gen T, less func(T, T) bool)
func Merge(chans ...<-chan gen T) <-chan T
func Map(a []gen T, f func(T) gen U) []U   // yep, we get some functional programming

The marked occurrence will be used for inferring the generic type when we use the function. For example (the code is split into multiple lines for commentary):

var nums = []int{5, 4, 3, 2, 1}
Sort(
    nums,                     // T inferred to be int
    func(x, y float64) bool { // error: expected func(int, int) bool, got func(float64, float64) bool
        return x < y          // ^_^ nice error messages ^_^
    },
)

Because of type inference, the gen keyword can only be used in the list of arguments, not in the list of return types. Inferring based on the return type would be ambiguous when used with := and would probably send us to the Hindley-Miler world (nothing wrong with HM, just not a fit for Go).

Rule. In a function signature, the gen keyword can only be used in the list of arguments.

And just to be clear. This code right here is incorrect:

func wrong(x gen T, f func(gen U) U) T {
    return f(x) // error: expected U, got T
}

The func(gen U) U in the signature does not mean that the function f accepts any type. All it means is that f is some function that takes a type U and returns the same type. It could be a func(int) int, a func(map[string]bool) map[string]bool, or something else. You get the idea.

Also, one more rule:

Rule. The gen keyword can only be used in package level definitions.

Unnamed generic arguments

But what if the only occurrence of a generic type is in the return type? For example, what if we wanted to make a function that returns the zero value for any type?

func Zero() gen T { // error: gen only allowed in the list of arguments
    var x T
    return x
}

In a case like this, we can add an unnamed generic argument to the function:

func Zero(gen T) T {
    var x T
    return x
}

Then when calling the function, we pass the name of a type instead of a value:

n := Zero(int)     // n := 0
x := Zero(float64) // x := 0.0
s := Zero(string)  // s := ""
p := Zero(*int)    // p := (*int)(nil)
a := Zero([]int)   // a := []int(nil)

Notice that this is very similar to the built-in new and make functions. In fact, the new function could be implemented like this (make requires a bit more magic):

func new(gen T) *T {
    var x T
    return &x
}

Rule. Omit the name of a generic argument to accept a type name instead of a value. This cannot be used for generic types nested inside other types. For example: func f([]gen U) U isn't allowed.

What can we do with generic values?

First, different generic types are assumed to be different, so we can't assign a value of type T to U and vice versa. We can, however, freely assign values of type T to variables of type T or pass them to functions which accept T.

That's all we can really guarantee will work for any type in the most general case. But, I thought that enabling a few more things would greatly increase the usefulness of generics.

That's why under this proposal:

Rule. You can do anything with generic values that you are able to do with interface{} values. This means that you can:

  1. Compare values of the same generic type using == and !=.
  2. Use them as keys in maps.
  3. Convert them to interface{}.
  4. Use type assertions and type switches (for example, to check if T implements an interface).

Of course, there's no need for doing a type assertion when assigning from T to T.

Just to make it clear, you aren't allowed to use operators like +, -, <, call methods, or access struct fields of a generic value.

What will happen if a function requires comparing generic values with == but we use it with a type that doesn't support ==, such as a slice? The compiler should be always able to figure out if a function requires its type to be comparable with == without any sort of annotations. This would, therefore, result in a compile-time error. In a situation that the compiler wouldn't be able to figure it out (I'm not sure it's possible), a runtime panic would occur, just like with interface{}.

With these capabilities, we can implement a whole bunch of new generic functions!

// Keys returns a slice of keys from the map m.
func Keys(m map[gen K]gen V) []K {
    var keys []K
    for k := range m {
        keys = append(keys, k)
    }
    return keys
}

// Find finds the value x in the slice a and returns its index.
// Returns 0, false if the value isn't in the slice.
func Find(x gen T, a []T) (i int, ok bool) {
    for i = range a {
        if a[i] == x {
            return i, true
        }
    }
    return 0, false
}

// Unique returns a new slice same as a, with duplicate elements removed.
func Unique(a []gen T) []T {
    var unique []T
    seen := make(map[T]bool)
    for _, x := range a {
        if seen[x] {
            continue
        }
        unique = append(unique, x)
        seen[x] = true
    }
    return unique
}

And many more.

Generic array lengths

This is a part of this proposal that's definitely debatable and could get excluded. I thought: making generic functions is so straightforward, why not extend the genericity to lengths of static arrays as well?

The syntax would be the same as with generic types, except it'd be used in the place of an array length. For example:

// Reverse reverses a static array of an arbitrary length.
func Reverse(arr [gen n]gen T) {
    for i, j := 0, n-1; i < j; i, j = i+1, j-1 {
        arr[i], arr[j] = arr[j], arr[i]
    }
}

A generic length could also be used as an unnamed argument. This would allow making a static array constructor like this:

// Filled returns a static array of length n filled with the provided value.
func Filled(gen n, fill gen T) [n]T {
    var arr [n]T
    for i := range arr {
        arr[i] = fill
    }
    return arr
}

// then later
ones := Filled(10, 1) // ones has type [10]int

The generic integer n acts as an integer constant inside the function, it can't be mutated.

Rule. Use gen to specify generic array lengths. The length then acts as an integer constant inside the function. All rules are the same as with generic types: only mark the first occurrence, use an unnamed argument to accept the length directly.

Also, the value you pass for the unnamed generic array length must be a constant. This isn't allowed:

var n int
fmt.Scanln(&n)
ones := Filled(n, 1) // error: n must be a constant

Reflection and interface{}

Being able to convert a generic function to an interface{} value has many problems (the function itself, not just its return value). First of all, what's the type of a generic function? It has many types - an infinite number of them. That would make type assertions very hard. Also, you would be able to run the reflect package on the value. The reflect package would somehow have to incorporate working with generic types and would become very complicated.

Generics should have no impact on reflection. This was also discussed in the contracts draft design. Therefore, generic functions cannot be directly passed as interface{}.

Rule. Generic functions cannot be converted to inteface{} directly.

You can specialize them manually, though:

// error: cannot convert a generic function to interface{}
reflect.ValueOf(Sort)

// OK
reflect.ValueOf(func(nums []int, less func(x, y int) bool) {
    Sort(nums, less) // calling a generic function here
})

Generic types and methods

List example

Let's make the simplest generic type - a singly-linked list. This is how:

type List(T) struct {
    Value T
    Next  *List(T)
}

Rule. To define your own generic type, list the names of the type (or array length) parameters in parentheses right after the type name.

As you've surely noticed, there's no gen keyword in the definition. We don't put the gen keyword in the list of type parameters for the same reason we don't put the func keyword in the list of methods for an interface. Every item in that list is a generic type (or an array length) automatically.

Also, you can't put the gen keyword anywhere else in the definition either. That would make the type system really complicated.

Rule. No gen keyword in type definitions.

To use a generic type, we put the list of the type arguments after the name of the type just like in the definition, but instead of specifying the names of the arguments, we put in the actual types. You can already see it in the definition.

// error: T is not defined
var list *List(T)

// OK
var nums *List(int)

Rule. To use a generic type, put a list of actual types for each type argument right after the type name.

Now, let's implement some functionality for the new List type. A constructor for an empty list (not extremely useful, but we're just learning here):

func Empty(gen T) *List(T) {
    return nil
}

A method to prepend a new item:

func (l *List(gen T)) Prepend(x T) *List(T) {
    return &List(T){x, l}
}

Yes, we can use the gen keyword in the type of the method receiver.

And a function to construct a list from a slice:

func FromSlice(a []gen T) *List(T) {
    list := Empty(T)
    for i := len(a) - 1; i >= 0; i-- {
        list = list.Prepend(a[i])
    }
    return list
}

Now, what if we wanted to make a Map method to transform each element of a list using a function and produce a new, transformed list? This would be tempting:

// error: cannot use gen outside of the method receiver type
func (l *List(gen T)) Map(f func(T) gen U) *List(U) {
    // ...
}

Tempting but problematic. Since a List instance can be converted to an interface{}, its methods must be accessible by the reflect package. The above Map method is generic and so we'd run into the same problems we described with functions and reflection.

Therefore:

Rule. In a method definition, the gen keyword can only be used in the type of the receiver.

We can still implement Map, but it must be a function:

func Map(l *List(gen T), f func(T) gen U) *List(U) {
    // ...
}

This rule is easy to remember: gen only belongs in the first pair of parentheses. This rule of thumb works for both functions and methods.

Just for your information, this exact same limitation on methods is also in the contracts proposal by the Go Team. An alternative to this limitation would be to simply not show these problematic methods in reflection, but prohibiting them altogether seems cleaner.

Matrix example

Here's another interesting generic type: Matrix. This one uses the generic array lengths, so it's an actual use-case for that feature:

type Matrix(r, c) [r][c]float64

Types can also be generic in array lengths.

We can make a constructor for the identity matrix (1s on the diagonal, 0s elsewhere):

func I(gen n) Matrix(n, n) { // here we use an unnamed generic array length
    var m Matrix(n, n)
    for i := 0; i < n; i++ {
        m[i][i] = 1
    }
    return m
}

// later in code, this construct a 3x3 identity matrix
i3x3 := I(3) // has type Matrix(3, 3)

Adding two matrices and multiplying two square matrices is also easy:

func (a Matrix(gen r, gen c)) Add(b Matrix(r, c)) Matrix(r, c) {
    // ...
}

func (a Matrix(gen n, n)) Mul(b Matrix(n, n)) Matrix(n, n) {
    // ...
}

Notice that the Mul method only works on square matrices. General matrix multiplication cannot be a method, it must be a function instead:

func Mul(a Matrix(gen r, gen k), b Matrix(k, gen c)) Matrix(r, c) {
    // ...
}

Generic interfaces

This is an area that I haven't explored very deeply. A generic interface would be just like any generic type, except it would define an interface:

type Set(T) interface {
    Add(T)
    Delete(T)
    Contains(T) bool
}

As an example, here's a generic Set interface. It can be implemented by a hash-set, a tree-set, or a list-set.

A generic interface is, in fact, an infinite family of interfaces: Set(int), Set(string), and so on. For example, this type only implements the Set(int) interface:

type SliceIntSet []int

func (sis *SliceIntSet) Add(x int)           { /* ... */ }
func (sis *SliceIntSet) Delete(x int)        { /* ... */ }
func (sis *SliceIntSet) Contains(x int) bool { /* ... */ }

And this one implements Set(T) for any T (more precisely, given a concrete T, HashSet(T) implements Set(T)):

type HashSet(T) map[T]bool

func (hs HashSet(gen T)) Add(x T)           { /* ... */ }
func (hs HashSet(gen T)) Remove(x T)        { /* ... */ }
func (hs HashSet(gen T)) Contains(x T) bool { /* ... */ }

I'll leave it as an open question about whether to allow generic interfaces. While they may be useful, there's a danger that they'll allow creating crazy and hard to understand abstractions. But maybe not. This needs to be investigated better.

Conclusion or what do we miss?

Generics, as described in this proposal, make it possible to write a lot of generic functions and types. Containers, utility functions for slices and channels are all covered. Generic array lengths even enable type-checked matrices and utility array functions.

As far as I know, this proposal satisfies the dual implementation principle, which states that the choice between compile-time instantiation and runtime implementation of generics should always be up to the compiler.

There are some things, though, that aren't possible under this proposal. From the impossible, I'll list the things listed in the Proposal: Go should have generics article:

  • Generic number functions like Min, Max, Sum, Product, and so on.
  • Graph algorithms on generic graph implementations. Still possible to make a single generic Graph(N, E) type (where N is the type of data in nodes and E in edges) and use that.
  • Functions that work with both string and []byte.

And that's it, I think. Everything else listed there should be possible.

So, the question still is: What do we miss? Do we miss generic math functions and graph algorithms, or do we miss the things that this proposal enables? I'm not going to give an answer to this question, because I may be biased and the answer might be wrong. However, if the answer is the latter, I believe this proposal is not too shabby.

Further ideas

More kinds of generic parameters

Currently, the type-checker has to distinguish between three kinds of generic parameters:

  1. Any type.
  2. Any type with ==.
  3. Integer constant.

This list could be extended to include support for any numeric type (with operators like +, -, ...). This would make it possible to make functions like Min, Max, and so on.

Annotating the kinds of generic parameters

The kind of a generic parameters is currently completely implicit. Some annotation could be added. One possible syntax could be like this:

  1. gen n for generic array lengths.
  2. gen T for arbitrary types (convertible to interface{}).
  3. gen eq T for equalable types (comparable with == and usable as map keys).
  4. gen num T for numeric types (with operators like +, -, <, ...).

Array lengths and arbitrary types can have the same notation because it's always possible to distinguish them by context. Alternatively, they could be distinguished by capitalization (lower-case for array lengths, upper-case for types).

Syntax-wise, eq and num would not be keywords on their own. Rather, gen eq and gen num would be two-word keywords.

The syntax is rather ad-hoc, I admit. It's a very fresh idea, completely open to refinement. However, since there are only four cases, an ad-hoc syntax may actually be the right choice.

It's also easily extensible with new possible "contracts", but I'd generally advise against that.

The addition of the num restriction would actually enable many cool things. First, the Min/Max functions:

func Min(x, y gen num T) T {
    if x < y {
        return x
    }
    return y
}

It would also be useful for generic math types, like vector and matrices. The matrix example above uses float64, but it could be generalized to all numeric types.

As an example, let's take a look at a possible implementation of a 2D vector type:

type Vec2(T) struct {
    X, Y T
}

There are no restrictions specified in the type definition. This is because it's methods and functions that require restrictions, not the types themselves. For example, this String method requires no restrictions:

func (u Vec2(gen T)) String() string {
    return fmt.Sprintf("(%v, %v)", u.X, u.Y)
}

This Eq method requires the types to be comparable with ==:

func (u Vec2(gen eq T)) Eq(v Vec2(T)) bool {
    return u.X == v.X && u.Y == v.Y
}

But, this Add method requires the types to be numeric:

func (u Vec2(gen num T)) Add(v Vec2(T)) Vec2(T) {
    return Vec2(T){u.X+v.X, u.Y+v.Y}
}

Consequently, Vec2([]float64) would only have the String method, Vec2(string) would have the Eq method in addition, and Vec2(float64), Vec2(int32), and so on, would have all the methods.

@dpavsrtrl
Copy link

Maybe I missed something, but it doesn't seem to me that you're getting anything from restricting the set of operations on generic values as opposed to just duck-typing generic values.

Yes, what's the problem with inferring constraints from duck typing? Isn't that what go team's contracts do? Here you'd just do the inference at the point of use instead of referring to a definition.

@faiface
Copy link
Author

faiface commented May 31, 2019

@vojtechkral In current Go, type assertions are a runtime thing. Then don't at all affect the set of types a function accepts. I don't think it should be any different here. Of course, the compiler could optimize some type assertions to no-op, but that would be just an optimization.

@vojtechkral
Copy link

In current Go, type assertions are a runtime thing. Then don't at all affect the set of types a function accepts.

Yes, I'm aware, I'm only considering type assertion int the context of this proposal, where effectively it would work as a restriction on the accepted types in a similar way as operations on generic values in C++.

I don't think it should be any different here. Of course, the compiler could optimize some type assertions to no-op, but that would be just an optimization.

Well consider code like this:

func foobar(x gen T) {
    i := x.(int)
    // ... do something with `i` ...
}

func main() {
    foobar(3)          // This is ok
    foobar("hello")    // This fails in some way but how exactly?
}

Depending on whether the type assertion is evaluated at runtime or compile time, you either get a compile-time error or a runtime panic. It seems to me like a pretty bad idea to compile or reject the same code based on optimization settings or what the compiler 'feels' like doing in some situation.

I don't think you can leave that sort of question to the compiler as only a matter of optimization, there's to much implications. The same is true for boxed vs unboxed values, IMO.

@skybrian
Copy link

skybrian commented Jun 1, 2019

Another thing to think about: how will the generic function show up in godoc? For each generic type, it should explain whether it supports slices or not.

If this is declared in the function header in the source code, then it would happen automatically. Otherwise, godoc would have to infer it and present it to the user somehow.

What I'm getting at is: the constraint is part of the public API. This proposal has two possible constraints, depending on whether == is used or not in the body. You can leave them out of the source code, but the compiler will infer it, and it seems like it still should be documented and explained to the user.

Alternately, have a way to specify the constraint. You need two to start, but there might be more later.

@kaey
Copy link

kaey commented Jun 3, 2019

+ - are implemented for limited number of types and you say

Rule. You can do anything with generic values that you are able to do with interface{} values. This means that you can:
4. Use type assertions and type switches (for example, to check if T implements an interface).

Which means Min can be implemented like this

func Min(x, y gen T) T {
    switch v, w := x.(type), y.(type) {
    case int:
        if v < w { return v } else { return w }
    case float:
        if v < w { return v } else { return w }
    etc
    }
}

Whats different from interface{} here is that exact type is known at compile time.

@dpavsrtrl
Copy link

Do you really need to specify the kinds of generic parameters (n, eq, num). Can't the compiler infer the kind from usage? That applies to the "n" as well. What if the convention would be that lowercase identifiers will be const and uppercase types? Then you could possibly be generic on other primitive types, like bool, byte, etc.

@skybrian
Copy link

skybrian commented Jun 5, 2019

The compiler can infer all sorts of things. However, when the details of a public API are inferred from the implementation, it can be less clear who to blame (caller or callee) and so generating a good error message is harder.

@danielvladco
Copy link

I like this idea a lot. I also thought what if we use already existing keyword type instead of gen? For me it looks a little more descriptive than gen and it also doesn't add new keywords to the language..

@faiface
Copy link
Author

faiface commented Jul 8, 2019

@danielvladco Yes, I have actually made a new version of this proposal with an implementation that uses type instead of gen: https://github.com/faiface/generics

@skybrian
Copy link

skybrian commented Jul 9, 2019

Interesting, do you want feedback on the new proposal here or in a new thread somewhere?

@faiface
Copy link
Author

faiface commented Jul 9, 2019

Please, give it here. There was a Reddit discussion associated with the new version, so I received a lot of feedback there. At some point, I will make a GitHub issue for the new proposal, but until then, why not give the feedback here.

@skybrian
Copy link

skybrian commented Jul 19, 2019

Okay, one suggestion: might it be better to declare type variables separately? How about:

type var X, Y, Z

Or with constraints:

type var T ord

Or with multiple variables:

type var (
    T ord
    K eq
    V
)

The scope of a type variable declaration would be the current file, but the scope of each type variable appearing in a function would still be that function.

This would allow the syntax to be closer to Go's syntax for other things, and also encourage type variables to be used consistently within a file, without having to look too far away to see what the type variable's restriction is.

@wawe
Copy link

wawe commented Oct 25, 2019

I really like this proposal. Among the different generics proposals I have read, I'd consider this to be the most "Go-like".

@faiface I was wondering why you haven't raised a corresponding issue in golang/go yet. As I understand from https://github.com/golang/proposal that's encouraged even if not all details are worked out yet.

In an earlier comment, you said

At some point, I will make a GitHub issue for the new proposal, but until then, why not give the feedback here.

Are there still things you want to work out before making this proposal "official"?

My hope is that making this more visible via the issue tracker will increase the amount of feedback it gets. And since the Go team seems to be zeroing in on adding contracts to Go, maybe sooner is better than later.

@mhcoffin
Copy link

I like this proposal a lot, although I might argue about some syntactical choices. (I think sticking more closely to the contracts proposal as far as syntax might be wise.)

What I wanted to point out is that unconstrained types are perfectly general: you actually can implement anything you could with contracts if you're willing to put up with slightly less-than-ideal syntax. The idea is that, for every operation you want to perform on generic types, encapsulate that operation as a parameter of the appropriate func type. If there are several operations, package them into a struct. E.g., here's an implementation of generic reduce:

func Reduce(type T) (s []T, zero T, combine func(a, b T) T) T {
    result := zero
    for _, v := range s {
        result = combine(result, v)
    }
    return result
}

You could use it like this:

Reduce(int)([]int{1, 2, 3}, 0, func(a, b int) int {return a + b})

or like this:

Reduce(string)([]string{"hello", "world"}, "", func(a, b string) string {return a + b})

If you want to write a generic graph algorithm, you probably need a number of operations on vertices and edges. Package them into a generic struct

type Ops struct(type V, E) {
    // returns the vertices at either end of an edge
    Vertices func(E) (V, V)
    
    // returns the outgoing edges of a vertex
    Edges func(V) []E

    // etc,
}

Generic algorithm:

func IsDag(type V, E) (root V, ops Ops(V, E)) bool {
    initialEdges := ops.Edges(root) 
    for _, edge := range initialEdges {
        // etc
    }
}

To invoke, you would need to create an instance of Ops that works for your implementation of Graph. To me, this seems much more useful than a contracts-based version. If I try to use a contract-based algorithm, it probably won't work because I didn't write my graph package with that in mind. Then I have to stare at the contract to figure out how to make it fit. With this proposal, the code I need is really obvious: I can make it work by providing the right ops argument, which will probably be some straightforward adapter code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment