Skip to content

Instantly share code, notes, and snippets.

@HALtheWise
Last active March 16, 2019 18:50
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save HALtheWise/e7db03557ad52b9f9fa2722b4ef0f41e to your computer and use it in GitHub Desktop.
Save HALtheWise/e7db03557ad52b9f9fa2722b4ef0f41e to your computer and use it in GitHub Desktop.

go2 proposal: simple generics using const struct fields

Eric Miller, March 2019

Motivation

Existing proposals for generics in go2 propose various ways of extending the go language by adding significant new mechanisms and features (type parameters, contracts, new syntax, etc...) to the language. This proposal attempts to accomplish the same goals in a way that is a more natural extension of existing go features, in many cases by relaxing existing constraints in natural ways or building at the intersection of existing features and keywords. Ideally, the goal is to simplify the language at the same time as adding new functionality. That is a very ambitious goal, but is nevertheless one that I believe is worth targeting.

Overview

I propose extending go's current struct type to allow const fields, including but not limited to const type fields which are populated with concrete type references when the struct is initialized. Methods can reference const fields of their reciever in their arguments list, return list, and body, allowing generic functions. Certain constraints apply to how these structs are declared, referenced, and mutated to ensure that the value of all const fields of a struct are known at compiletime, allowing the compiler to optimize aggressively.

Discussion on this proposal (either structure or syntax) can happen in the comments here, on the golang-dev mailing list, or in the Github issue.

Samples

skip past samples

Tuple

package tuple

type Tuple struct {
	ELEM const type
	A, B ELEM
}

func (t *Tuple) Swap() {
	var c t.ELEM = t.A
	// Or just "c := t.A" like standard Go
	t.A = t.B
	t.B = c
}
package main

import (
	"fmt"
	"containers/tuple"
)

func main() {
	t := tuple.Tuple{int, 2, 4}

	t.Swap()

	fmt.Println(t.A, t.B)  // prints “4 2”
}

Sorter

package sort

import (
	"types"
)

// Sorter is a size-0 struct that provides utility methods for sorting slices.
type Sorter struct {
	ELEM const type {
		// This can either list other types, including interfaces, that the type 
		// must be castable to,or special constraints from the types package that 
		// relate to  built-in operators, like the Orderable constraint here.
		types.Orderable
	}
}

// Sort sorts the provided slice in-place, and returns the smallest element
func (s Sorter) Sort(l []s.ELEM) s.ELEM {
	// ...
	if len(l) == 0 {
		return nil
	}
	return l[0]
}
package main

import (
	"fmt"
	"sort"
)

type Length float64

const lengthSorter = sort.Sorter{Length}
// The methods of lengthSorter act like functions in a package customized just for your data type, 
// and can be used anywhere in this package.

func main() {
	lengths := []Length{3, 2, 1}

	shortest := lengthSorter.Sort(lengths)

	fmt.Println(shortest, lengths)  // prints “1 {1, 2, 3}”
}

Graph Utilities

package graphutils
//...

type WrappedGraph struct {
	NODE, EDGE const type {types.Hashable} // Allows use as map key
	GRAPH const type { // If concrete, enables compiler inlining
		interface {
			Edges(NODE) []EDGE
			Nodes(EDGE) (NODE, NODE)
		}
	}
	
	GRAPH // Anonymous embed allows easy access to all original methods
}

func (g *WrappedGraph) ShortestPath(start, end g.NODE) []g.NODE { /*...*/ }

func (g *WrappedGraph) IterNodes(start g.NODE) <-chan g.NODE { /*...*/ }

func (g *WrappedGraph) IterEdges(start g.NODE) <-chan g.EDGE { /*...*/ }
package main
//...

type Person struct {/* ... */}
type Friendship uuid.Uuid
type Network struct {/* ... */}

func (s *Network) Edges(p Person) []Friendship {/* ... */}
func (s *Network) Nodes(f Friendship) (Person, Person) {/* ... */}
func (s *Network) Owner() string { return "Zuckerberg" }

func Wrap(n *Network) (wg graphutils.WrappedGraph[Person, Friendship, *Network]) (
	wg.Network = n
	return
)

func main() {
	graph := Wrap(Network{})
	// ...
	
	fmt.Println(graph.ShortestPath(me, you))
	// prints “{Person{...}, ..., Person{...}}”
	
	fmt.Println(graph.Owner())
	// prints "Zuckerberg"
}

How const type fields work

Struct types, in addition to zero or more standard fields, may contain zero or more const fields, which must be typed constants (N const int), including a newly-introduced "type" constant (T const type). By convention, these fields are in all-caps. Fields declared later in a struct may reference const fields declared earlier, but the reverse is not true.

Optionally, a const type field may declare constraints that the type it holds must satisfy. These constraints are a superset of an interface declaration, and can also list arbitrary other go types that the value must be castable to, or specific exported "magic interfaces" from the new types package which represent compatiability with builtin go operators.

A type with unspecified const fields is an "underspecified type" and can only be used directly in type declarations, method reciever declarations, or struct literals that explicitly specify the missing const fields. For other uses, including function arguments or return values or new() / make() calls, a "fully specified type" must be used, which can be created from a underspecified type through a syntax like that used in struct literals (new(sort.Sorter{int32}) or new(sort.Sorter{ELEM: int32})). This ensures that the compiler and other tooling can always determine the exact value of all const fields on any declared variable, and thus the concrete types associated with its methods. See syntax discussion below.

Methods may use const fields of their method reciever as if they were constants within their parameter list, return values list, and body. This includes const type fields, which can be used to declare variables. Within the method body, these variables can be used in any way which is implied by any of their constraints.

Uses of const fields

Fields that are const are 0-size, and can be referenced in a number of ways.

  • Accessed just like a normal field, with dot notation
  • Used in the types of method arguments or method return values
  • Used in places that would normally require a constant value, like variable type declarations

Limitations of const fields

Just like any other const value in go, const fields cannot be...

  • Reassigned after declaration
  • Referenced by a pointer (they have no address at runtime)
  • Initialized without a value (they have no zero value)

Struct variables containing const fields inherit some of these limitations, and some others. Namely:

  • They cannot be mutated by variable assignment if it would change any const fields
  • They cannot be declared in a way that does not initialize all const fields. This puts more burden on the user of the library to be explicit about what they intend, since values will not be magically inferred by the compiler.
  • Methods of underspecified types cannot be referenced as a method expression. Very little go code actually uses this syntax, and it inherently doesn't make sense in this model.

Supporting generalizations of current go mechanisms

Currently, constants in go are extemely powerful, but occasionally surprising to new users of the language because they only operate on builtin types. For example:

type myInt int

type myInt2 struct{ int }

func main() {
	const myInt = myInt(5)     // Succeeds
	
	const myInt2 = myInt2{5} // Syntax Error
}

Fixing this would clean up an inconsistency in the language, and allow for eliminating some uses of global variables for things like default configuration structs or data tables.

I propose that a constant be able to hold a value of any type. The compiler will need to choose for large constants whether to store them one place in memory and reference that from everywhere or instantiate new copies at the site of each usage.

In Example 2, this mechanism was used to declare lengthSorter as a package-level constant. Without this, lengthSorter would simply need to be a package-level global variable, which requires more documentation about whether it maintains internal state and/or is safe for concurrent use. I like how lengthSorter.Sort(...) looks similar to a call to a Sort function defined in package lengthSorter, where declaring lengthSorter effectively initializes a custom "package" in your code.

Advantages of this approach

  • Because the use of a generic method is always preceeded by the declaration of a variable with that type, compiler errors can target that declaration when doing so makes for clearer messgaes. For example, attempting to declare a Sorter instance in Example 2 with a struct{} type would error at the site of that declaration, because struct{} is not orderable, which Sorter.ELEM requires. Once the lengthSorter has been declared, its methods have clearly-defined concrete types (in this case func([]Length) Length, so error messages about using such a method incorrectly are clear and unambiguous.
  • Because all code templating happens based on const fields in method recievers, it is easy to reason about what the impact on binary size of declaring a new instance of a generic type is. At worst, it could instantiate a copy of every method that the type has (... probably, see discussion below about recursion limits).
  • Because every const type is explicitly filled, callers can choose to use interface types rather than concrete types to pay the CPU cost of indirection instead of the binary size and compiletime cost of specialization. For example, a caller in Example 1 could instead choose to create a Tuple{interface{}, 2, 4}, (presumably) saving binary size and compile time if Tuple{interface{}} is used elsewhere in the code.

Disadvantages of this approach

  • This proposal intentionally does not support generic standalone functions, instead requiring the creation of a (potentially temporary) variable representing the specialized version of a set of generic functionality (sort.Sorter{int}.Sort(...) instead of sort.Sort(int, ...)). While the proposal could trivially be adapted to allow const function arguments as well as method recievers, I think these cases are actually quite uncommon outside of the toy examples often used in discussions of generics, and the pattern used in Example 2 handles this cleanly by creating a "pseudopackage" lengthSorter.
  • The pattern of libraries providing constructor functions requires somewhat cumbersome workarounds like thing := ThingMaker{int}.Make(5) or thing := Thing{int}; thing.Initialize(5) without some sort of "const function" also added to the language. Much of the beauty of this proposal comes from not allowing standalone generic functions, so I'm not sure how to address this issue without breaking that.
  • Allowing the existence of a type that does not have a zero-value potentially creates confusion, and might break other mechanisms in go. I think that allowing the brace "const field specifier" syntax in appropriate places fixes most of these, but it still feels icky to not allow zero-values.
  • This framework does not easily allow arbitrary Turing-complete computation at compiletime, which means it cannot easily be used for things like building a fmt.Printf(...) equivalent that is fully type-safe by parsing the provided format string. Most other proposals for go also do not satisfy this, but languages like Rust with arbitrary compiletime code execution do.

Open Question 1: Syntax

The examples in this article use struct{ N const int } as the syntax for declaring const fields. There are good arguments for other orderings as well, including struct{ const N int } or even struct{ N int const }.

The explanation in this article uses curly braces for creating a fully-specified type derived from a underspecified type (new(Sorter{int}) or new(Sorter{ELEM: int})) for consistency with struct literals. It might be better to use square brackets for consistency with map types (new(Sorter[int])) or even something completely new like angle brackets (new Sorter<ELEM: int>)).

As currently proposed, this proposal adds zero new reserved words to go. It might be desireable to add slice as a reserved word such that builtin generic types can be unified under the new syntax (map{key, value}, slice{T}, and chan{T, Send:True, Recieve:False}). chan is particularly ugly here, so if anyone has better suggestions, I would love to hear them.

Open Question 2: Runtime implementation

It would be nice if the compiler could choose to implement type specialization at runtime instead of compiletime at its heuristic discression, and it is unclear how difficult that is under this proposal. At minimum, it would require either variable-sized runtime stack allocations or new under-the-hood variants of slices and interfaces that automatically copy the backing storage whenever the value is copied, such that a version of arrays and structs could be created that transparently act like their concrete counterparts. I would love to hear from someone with more experience with the compiler regarding the feasability of these approaches.

Open Question 3: Recursion

It is unclear whether a templated method should itself be permitted to instantiate structs with const fields. Currently, go maintains the invariant that compiling a package should require only information from that package and direct dependencies, but not indirect dependencies. The simplest way to maintain that invariant is to disallow recursively-instantiated generic methods, at least across package boundaries. This also would make it easier to reason about how much code gets generated when a generic variable is instantiated, but discourages the creation of generic metapackages, and could require occasional code copying. If runtime implementation is practical, that could provide an option at the cost of performance, but it is unclear to me how often cross-package transitive generic type instantiation is actually needed. None of the samples used here need or would substantially benefit from recursive generic declaration.

Open Question 4: Allow methods to satisfy magic interfaces?

The "types" package in this proposal exists to create interface-like ways of referring to builtin go operators, but it raises the question of whether the "types.Orderable" magic interface should be satisfiable by custom user-defined types that have a .CompareTo() method or something similar. This is the opposite question to that normally asked about arithmetic types because this functionality would allow library authors to use standard go operators on custom types created by library users, without necessarily allowing library users to use standard operators on types provided by library authors. If this is not allowed, the author of a library like sort can simply provide two variants, one that operates on builtin types and the other that takes as a const func(x, y ELEM) bool field a comparison function, allowing the compiler to inline it into the sorting algorithms.

Open Question 5: How far to go with constants?

This proposal raises interesting questions about whether the constants currently in go can be extended in other ways to provide increased functionality with little increase in complexity. For example, allowing consts to take on any go type (global const maps), allowing functions to have const return values indicating that they are pure functions and allowing automatic compiletime code execution, or even redefining types themselves to just always be global consts (removing the need for separate type aliases). All of these would be compatible with and complementary to this proposal for const fields as generics, and they raise cool ideas about making deterministic computation a more first-class member of go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment