@gitschaub
Last active June 24, 2016 23:45

Ultimate Go Training github link

Day 1 -- Primer and Introduction

Some Key Thoughts on Go

WYSIWYG. Convention over configuration. Productivity > Performance. Know how much your code costs: no hidden overloads, overhead, etc. Type is key. Integrity first in all code. Key terms: data-oriented design, mechanical sympathy (constrain to the architecture), zero value.

Go doesn't have casting, it has conversion. You have to be explicit; there is no "implicit casting". Even if two types share the same underlying type, conversion between them must be explicit:

type key int32
var i int32
var k key
i = k //doesn't work
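
A minimal runnable sketch of the explicit conversion that does work. The helper toInt32 is a name made up for this illustration; it is not from the training material.

```go
package main

import "fmt"

// key is a named type whose underlying type is int32.
type key int32

// toInt32 converts a key back to a plain int32. The conversion must be
// written out; "return k" on its own would not compile.
func toInt32(k key) int32 {
	return int32(k)
}

func main() {
	var k key = 7
	i := toInt32(k) // key -> int32
	k = key(i + 1)  // int32 -> key
	fmt.Println(i, k)
}
```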

zero-value: All allocated memory is set to zero value (var declaration)

Structs

Structs as data

7 bytes of value, but 8 bytes allocated. Alignment is added to the bool (1 extra byte)

| bool | pad (1byte) | int16 | float32 |

int16 needs to be on a 2-byte memory addr (0, 2, 4, 6, etc). So pad 1.

If int16 was int32, we would need to pad 3 bytes (0, 4, 8, etc). This makes order important, because padding is wasteful.

Only matters if it represents pure data. Are we storing lots of these? Worry. If only making a handful, tradeoff for readability (e.g. grouping like values together)

Tip: Always order fields largest -> smallest

// example represents a type with different fields.
type example struct {
	flag    bool
	counter int16
	pi      float32
}

becomes

// example represents a type with different fields.
type example struct {
	pi      float32
	counter int16
	flag    bool
}

Declaring Structs

Go has no constructors, just literals

// Declare a variable of type example and init using
// a struct literal.
e2 := example{
	flag:    true,
	counter: 10,
	pi:      3.141592,
}

You can declare and initialize in one go (haha), an "anonymous struct" type:

e := struct {
	pi      float32
	counter int16
	flag    bool
}{
	flag:    true,
	counter: 10,
	pi:      3.141592,
}

Conversion applies as before to NAMED structs.

type bill struct {
	pi      float32
	counter int16
	flag    bool
}
type lisa struct {
	pi      float32
	counter int16
	flag    bool
}

var b bill
var l lisa
b = l // doesn't work
b = bill(l) // DOES work

Note, bill must have the exact same layout as lisa. Can assign an anonymous (unnamed) struct:

b = e // works

Because e isn't a named type, it is a "schematic" of a struct, which matches bill

Tags in structs

Used for un/marshalling (read through the reflect package)

type bill struct {
	flag	bool `json:"f"`
}

Note, this affects conversion:

type lisa struct {
	flag	bool
}

var l lisa
var b bill
b = l // doesn't work, because of tag

This is intended to be "fixed" in 1.8. So we can tag the same data differently across multiple structs, allowing us to marshal data into multiple formats quickly.

Pointers and Memory

Pointers are for sharing. No need to share, no need for pointer.

Standard memory layout: stack, heap.

Stack memory: single goroutine/function (not shared).

Heap memory: multiple goroutines/functions (shared). Where "allocations" are made.

Misc facts: Each thread has 1 MB stack memory by default. Goroutines start with 2 KB.

Pass by Value

Stack memory is allocated when goroutine begins. A stack frame is created (determined at compile time, max memory needed) on every entry into a function. Same as usual, stacks grow down.

Standard ref/deref:

func main() {
	count := 10 //allocated in 'main' stack frame

	// Display the "value of" and the "address of" count.
	println("Before:", count, &count)	
}

If you have a pointer to a struct, the . operator transparently dereferences field values:

type Foo struct {
	bar 	int
}

func main() {
	f := Foo{ bar: 10 }
	ptr := &f

	//these do the same thing
	f.bar = 11
	ptr.bar = 11
}

All information is passed BY VALUE in Go. There is NO pass by reference:

func increment(inc int) {
	// Increment the "value of" inc.
	inc++
	println("Inc:   ", inc, &inc)
}

func main() {
	count := 10

	increment(count) //pass count by value
	println(count) //value not changed
}

Can pass an address (pointer) by value:

func increment(inc *int) { //*int is its own type
	// Increment the value that the "pointer points to". (de-referencing)
	*inc++
	println("Inc   ", *inc, inc)
}

NOTE, you cannot do pointer arithmetic. You cannot increment or decrement pointer addresses. A *int must point to a valid int in memory (or be nil).
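
The two increment variants side by side, as a runnable sketch. The function names incrementValue and incrementPointer are chosen here only to keep both versions in one file.

```go
package main

import "fmt"

// incrementValue receives a copy of the int; the caller's value is untouched.
func incrementValue(n int) {
	n++
}

// incrementPointer receives a copy of the address and writes through it.
func incrementPointer(n *int) {
	*n++
}

func main() {
	count := 10

	incrementValue(count)
	fmt.Println(count) // 10

	incrementPointer(&count)
	fmt.Println(count) // 11
}
```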

Allocation

Reduce allocations where possible. Example of a variable escaping to the heap.

type user struct {
	name  string
	email string
}

func main() {
	u1 := stayOnStack()  // main receives its own copy of the value
	u2 := escapeToHeap() // main receives a copy of the address
	println(u1.name, u2.name)
}

// u does not escape stack
func stayOnStack() user {
	u := user {
		name: "Bill",
		email: "bill@ardanlabs.com",
	}

	return u
}

// u escapes to heap
func escapeToHeap() *user {
	u := user{
		name: "Bill",
		email: "bill@ardanlabs.com",
	}

	return &u
}

Escape analysis is performed at compile-time to determine if a value will continue to be referenced after a stack frame is removed. In escapeToHeap the address of the allocated user is passed to main, so it must be allocated to the heap. This happens when you "share up" the stack (call a function that allocates and returns an address to a value). "sharing down" the stack never needs to allocate to heap.

Building with go build -gcflags -m gives more in-depth feedback on compiler decisions, including heap allocations (escape analysis) and inlining. Building with go build -gcflags -S prints the generated assembly, which uses Go's own assembler syntax (derived from the Plan 9 assembler).

Growing (copying) the Stack

Every function has a preamble which declares how much stack space it needs. It is used to check whether the stack is large enough for the function call, and the stack grows accordingly.

Go uses contiguous stacks so that all pointers can be updated relative to each other during a grow.

An example showing the stack growing. If we increase const size, a larger int array is allocated in each stack frame, forcing the stack to grow. The output will show the address of s changing when we have to grow the stack:

// Number of elements to grow each stack frame.
// Run with 10 and then with 1024
const size = 10

// main is the entry point for the application.
func main() {
	s := "HELLO"
	stackCopy(&s, 0, [size]int{})
}

// stackCopy recursively runs increasing the size
// of the stack.
func stackCopy(s *string, c int, a [size]int) {
	println(c, s, *s)

	c++
	if c == 10 {
		return
	}

	stackCopy(s, c, a)
}

Output w/ size == 10:

0 0x10327f80 HELLO
1 0x10327f80 HELLO
2 0x10327f80 HELLO
3 0x10327f80 HELLO
4 0x10327f80 HELLO
5 0x10327f80 HELLO
6 0x10327f80 HELLO
7 0x10327f80 HELLO
8 0x10327f80 HELLO
9 0x10327f80 HELLO

Output w/ size == 1024:

0 0x10347f78 HELLO
1 0x1034ff70 HELLO //stack grew
2 0x1034ff70 HELLO
3 0x1034ff70 HELLO
4 0x1034ff70 HELLO
5 0x1035ff68 HELLO //stack grew
6 0x1035ff68 HELLO
7 0x1035ff68 HELLO
8 0x1035ff68 HELLO
9 0x1035ff68 HELLO

Garbage Collection

GC has changed a lot across different releases. This is how 1.6 works.

Goal: Reduce the pressure on the garbage collector.

GC has one knob: GOGC, the percentage of heap growth that triggers the next collection.

WB == write barrier. While active, all pointer writes pass through the barrier (so that the GC stays informed while it is marking). Objects touched by these writes are marked as "uncertain" or "grey".

STW == stop the world. No memory writes.

Steps for GC. Mostly standard mark-and-sweep, but with a third color added (grey) so that we don't have to stop the world for an extended time:

1. Off: GC is disabled, pointer writes are just direct memory writes: *slot = ptr
2. Stack scan: WB starts. Briefly STW to collect ptrs from globals & goroutine stacks.
3. Mark: Mark objects (turn from white to black) and follow ptrs until the ptr queue is empty.
4. Mark termination: STW starts. Rescan globals & changed stacks, finish marking (for grey objects), shrink stacks.
5. Sweep: WB/STW off. Reclaim unmarked objects (white objects) as needed. Adjust GC pacing for the next cycle.
6. Off: turn off until next cycle.
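
A sketch of adjusting GC pacing from code, on the assumption that the "one knob" in these notes is GOGC. debug.SetGCPercent is the runtime equivalent of the GOGC environment variable and returns the previous setting; setPacing is a wrapper name invented here.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setPacing changes the GC percent and returns the previous setting.
func setPacing(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	// Let the heap grow by 200% over the live heap before the next cycle.
	prev := setPacing(200)
	fmt.Println("previous GC percent:", prev)

	// Restore whatever was configured before.
	setPacing(prev)
}
```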

Constants

One of Jean Paul's most favorite things. Go has a novel approach.

A constant in Go is a value that only exists at compile time. Only numeric, boolean, and string values (int, float, rune, bool, string, etc.) can be constants. Numeric constants have a minimum precision of 256 bits, geared for representing high-precision values.

Declaring

A constant can be declared with a type or with just a kind. If it has a type, it has to live by that type's rules (precision, etc). If it only has a kind, it can be implicitly converted to a type at compile time:

// Untyped constants receive a 'kind'
const ui = 12345	// kind: integer
const uf = 3.141592	// kind: floating-point

// Typed constants use the constant type system, but precision is restricted
// based on declared type
const ti int = 12345		// type: int
const tf float64 = 3.141592	// type: float64

const bigInt = 2384234729587293475029347509238745092834 // allowed
//var bigInt int64 = 98729387420973094872304918273094172384 // compiler error, overflows

// this statement fails at compile time. fmt.Println needs to represent bigInt
// as an integer-type, but overflows
fmt.Println(bigInt)

Constants in declarations:

// Variable answer will be of type float64.
var answer = 3 * 0.333 // KindFloat(3) * KindFloat(0.333), but >=256-bit precision

// Constant third will be of kind floating-point
const third = 1 / 3.0 // KindFloat(1) / KindFloat(3.0)

// Constant zero will be of kind integer.
const zero = 1 / 3 // KindInt(1) / KindInt(3)

// Const arithmetic between typed and untyped constants. Must have like types
const one int8 = 1
const two = 2 * one // int8(2) * int8(1)
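
A runnable sketch of the kind rules above. The constant names huge and small and the two accessor functions are invented for this illustration; the point is that constant arithmetic stays exact until a value is finally given a type.

```go
package main

import "fmt"

const (
	huge  = 1 << 100   // untyped integer constant, far beyond int64
	small = huge >> 98 // still exact at compile time: 4
	third = 1 / 3.0    // kind floating-point
	zero  = 1 / 3      // kind integer, truncates to 0
)

// smallAsInt64 is where the untyped constant finally takes a type.
func smallAsInt64() int64 { return small }

// zeroAsInt shows that integer-kind division already truncated.
func zeroAsInt() int { return zero }

func main() {
	fmt.Println(smallAsInt64(), zeroAsInt(), third)
}
```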

Declaring types

Use type names ONLY IF you need a new representation of information. For example:

type duration int64

A duration is not an int64 value. It is its own type that holds 64 bits of integer-like data.

var d duration
d = duration(1000)
nanosecond := int64(10)
d = nanosecond // compiler error, d isn't an int64, it's a duration

An example of constants from the time package:

type Duration int64

const (
	Nanosecond  Duration = 1
	Microsecond          = 1000 * Nanosecond
	...
)

// Add returns the time t+d. The function only accepts Duration values (or untyped constants), not typed integers
func (t Time) Add(d Duration) Time

// fiveSeconds is a typed constant of type Duration.
const fiveSeconds = 5 * time.Second // time.Duration(5) * time.Duration(1000000000)

now := time.Now()
// Subtract 5 nanoseconds from now time?
lessFiveNanoseconds := now.Add(-5) // -5 is interpreted as a Duration at compile time
// Subtract 5 seconds using a declared constant
lessFiveSeconds := now.Add(-fiveSeconds)
minusFive := int64(-5)
lessFiveNanoseconds = now.Add(minusFive) // FAILS! Compiler error, minusFive is not of type Duration
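
A runnable sketch of the same calls against the real time package. The helpers nanosBetween and demo are names made up here; the explicit time.Duration(n) conversion is what the failing line above would need.

```go
package main

import (
	"fmt"
	"time"
)

// fiveSeconds is a typed constant of type time.Duration.
const fiveSeconds = 5 * time.Second

// nanosBetween returns how far apart a and b are, in nanoseconds.
func nanosBetween(a, b time.Time) int64 {
	return a.Sub(b).Nanoseconds()
}

// demo returns the gap produced by Add(-5) and by Add(-fiveSeconds).
func demo() (int64, int64) {
	now := time.Now()
	return nanosBetween(now, now.Add(-5)), nanosBetween(now, now.Add(-fiveSeconds))
}

func main() {
	fmt.Println(demo()) // 5 5000000000

	// A typed int64 needs an explicit conversion before it can be passed.
	minusFive := int64(-5)
	now := time.Now()
	fmt.Println(nanosBetween(now, now.Add(time.Duration(minusFive)))) // 5
}
```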

Scope

Some notes on scope in Go.

func main() {
	var u *user
	
	// u, err are attached to the if statement scope
	if u, err := retrieveUser("sally"); err != nil {
		fmt.Println(err)
		return
	}

	// u is still a nil pointer here: the u assigned above only existed
	// inside the if statement and shadowed this one
	fmt.Printf("%+v\n", *u)

	// INSTEAD
	u, err := retrieveUser("sally")
	if err != nil {
		fmt.Println(err, u)
		return
	}

	// Display the user profile
	fmt.Printf("%+v\n", *u)
}
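
A self-contained sketch of the shadowing problem. lookup is a hypothetical stand-in for retrieveUser, and shadowed and assigned are names invented to compare the two patterns.

```go
package main

import (
	"errors"
	"fmt"
)

type user struct{ name string }

// lookup is a hypothetical stand-in for retrieveUser.
func lookup(name string) (*user, error) {
	if name == "" {
		return nil, errors.New("empty name")
	}
	return &user{name: name}, nil
}

// shadowed reports whether the outer u is still nil after the if statement.
func shadowed() bool {
	var u *user
	if u, err := lookup("sally"); err == nil {
		_ = u // this u exists only inside the if statement
	}
	return u == nil
}

// assigned declares u and err in the function scope instead.
func assigned() bool {
	u, err := lookup("sally")
	if err != nil {
		return false
	}
	return u != nil
}

func main() {
	fmt.Println(shadowed(), assigned()) // true true
}
```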

Functions

Returns

When returning multiple values, return a literal zero value (such as nil) rather than a variable for any value the caller is expected to ignore. Example:

// retrieveUser retrieves the user document for the specified
// user and returns a pointer to a user type value.
func retrieveUser(name string) (*user, error) {

	// Make a call to get the user in a json response.
	r, err := getUser(name)
	if err != nil {
		return nil, err //don't care about the user, return nil
	}

	// Unmarshal the json document into a value of
	// the user struct type.
	var u user
	err = json.Unmarshal([]byte(r), &u)
	return &u, err
}

Data-Oriented Design

If you don't understand the data you are working with, you don't understand the problem you are trying to solve.

Data transformation is the heart of solving problems. If your data is changing, the problem you are solving is changing.

Uncertainty about the data is not a license to guess, but a directive to STOP and learn more.

Coupling data together and writing code that produces predictable access patterns to the data will be the most performant.

Changing data layouts can yield more significant performance improvements than changing just the algorithms.

If performance matters, you must have mechanical sympathy for how the hardware and operating system work. Write code that has predictable access patterns to memory.

Ex: Access to main memory can be as large as 107 cycles. Every page miss incurs a huge cost. Object-oriented design inherently creates linked lists.

Arrays

A contiguous block of memory. Commonly used for iteration, which is good for prediction. Most important data structure from a hardware perspective.

Arrays are well-defined at compile time (built-in type). Size must be fixed.

var strings [3]string //array of strings with 3 elements
strings[0] = "Apple"
strings[1] = "Orange"
strings[2] = "Plum"

// in one line
numbers := [4]int{1, 2, 3, 4}

// let the compiler determine the length
numbers2 := [...]int{1, 2, 3, 4, 5}

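// indexed initialization: set specific indexes, everything else is the zero value
// (Go has no "repeat a value" shorthand, and an index may only appear once)
numbers3 := [...]int{4: 1, 9: 2} // length 10: index 4 is 1, index 9 is 2, the rest are 0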

An array of a certain size is considered its own type:

var five [5]int

four := [4]int{10, 20, 30, 40}

five = four // compiler error, cannot assign [4]int to [5]int

This allows arrays to be allocated in the stack (they can be added to a well-defined stack frame)
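
Because the size is part of the type, an array is a plain value: assigning it copies every element. A small sketch (copyAndModify is a name invented here):

```go
package main

import "fmt"

// copyAndModify shows that array assignment copies every element.
func copyAndModify() ([3]int, [3]int) {
	original := [3]int{1, 2, 3}
	duplicate := original // full copy, not a shared reference
	duplicate[0] = 100
	return original, duplicate
}

func main() {
	o, d := copyAndModify()
	fmt.Println(o, d) // [1 2 3] [100 2 3]

	// With [...], the highest index used sets the length.
	sparse := [...]int{4: 1}
	fmt.Println(len(sparse), sparse) // 5 [0 0 0 0 1]
}
```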

Slices

Slices are a part of Go's reference types: slices, maps, interface, channels, functions. Only pointers and reference types can be nil.

"The most important data structure in Go". Allows us to work with arrays in a "productive way".

Make

Used with slices, channels, and maps. "Makes" the header value for that type, initializes the header value and the backing structure. Making a slice:

slice := make([]string, 5)
// then works like an array
slice[0] = "Apple"
slice[1] = "Orange"
slice[2] = "Plum"

// cannot access an index beyond the slice's length (valid indexes are 0 through 4)
slice[5] = "Ehhhh, runtime error"

fmt.Println(slice)

A slice is a three-word data structure: a pointer to the backing array (contiguous memory), the length of the slice, and the capacity of the slice (always >= length).

DON'T TAKE THE REFERENCE OF A SLICE. Just copy the value of the slice header, for goodness sakes.

DON'T MAKE A SLICE OF POINTERS. You want data in contiguous blocks.

Maps

Day 2 -- Slices, Maps, Interfaces, and more

Data-Oriented Design (cont.)

Arrays

Recap, most important to hardware (optimal caching).

Slices

Most important for productivity, while keeping the predictable access of arrays.

// Create a slice with a length of 5 elements and a capacity of 8.
slice := make([]string, 5, 8)
slice[0] = "Apple"
slice[1] = "Orange"
slice[2] = "Banana"
slice[3] = "Grape"
slice[4] = "Plum"

fmt.Printf("Length[%d] Capacity[%d]\n", len(slice), cap(slice))

var data []string   // a 'nil' slice of strings
empty := []string{} // an empty (non-nil) slice of strings

Can only access elements up to "length", and there are "capacity" elements allocated in the backing array.

Prefer 'nil' slices on returns, unless you need to represent the slice as an empty list (for un/marshalling).

Slices can be appended to:

data = append(data, "another element")

Appends "another element" to the slice of strings. If len < cap, increase len, and add the element. If len == cap, grow the slice (copy the backing array into a larger backing array), then add the new element. The go runtime determines how much to increase the size of the backing array.
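
A sketch for observing the growth behavior. It only counts how often the capacity changes and makes no assumption about the exact growth factor, which is an implementation detail of the runtime; countGrowths is a name invented here.

```go
package main

import "fmt"

// countGrowths appends n elements to a nil slice and counts how many times
// the capacity changed, i.e. how many times a new backing array was needed.
func countGrowths(n int) int {
	var data []int
	growths := 0
	lastCap := cap(data)
	for i := 0; i < n; i++ {
		data = append(data, i)
		if cap(data) != lastCap {
			growths++
			lastCap = cap(data)
		}
	}
	return growths
}

func main() {
	fmt.Println("growths for 1000 appends:", countGrowths(1000))
}
```

If the final size is known up front, make([]int, 0, n) avoids the intermediate copies entirely.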

Rule of thumb, don't create a slice of pointers. Slices hold data or values.

Can create slices that use the same backing data:

// Create a slice with a length of 5 elements and a capacity of 8.
slice1 := make([]string, 5, 8)
slice1[0] = "Apple"
slice1[1] = "Orange"
slice1[2] = "Banana"
slice1[3] = "Grape"
slice1[4] = "Plum"

// Take a slice of slice1. We want just indexes 2 and 3.
// Parameters are [starting_index : (starting_index + length)]
slice2 := slice1[2:4] // len 2, cap 6

// Can make a slice specifying a capacity
// Parameters are [starting_index : (starting_index + length) : (starting_index + capacity)]
slice3 := slice1[2:4:4] // len 2, cap 2

slice2[0] = "CHANGED" // change value in backing array (slice1[2], also visible as slice3[0])

slice2 = append(slice2, "OVERWRITE") // append to slice2 overwrites the next element past its length (slice1[4], "Plum", in this example)

Slice2 uses the same backing array as slice1. slice2[0] == slice1[2]. The capacity of slice2 is cap(slice1) - starting_index(slice2). If either slice has to grow, the other slice header will not reflect the changes. A new backing array is created for the growing slice, the old backing array remains for the other slice.
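
A runnable sketch contrasting the two-index and three-index forms. The helper names fruit, sharedAppend, and isolatedAppend are invented for this illustration.

```go
package main

import "fmt"

// fruit builds the same slice used in the notes: len 5, cap 8.
func fruit() []string {
	s := make([]string, 5, 8)
	copy(s, []string{"Apple", "Orange", "Banana", "Grape", "Plum"})
	return s
}

// sharedAppend appends to a two-index slice, which still has spare capacity
// in the shared backing array, so the append lands on slice1[4].
func sharedAppend() []string {
	slice1 := fruit()
	slice2 := slice1[2:4] // len 2, cap 6
	slice2 = append(slice2, "OVERWRITE")
	return slice1
}

// isolatedAppend appends to a three-index slice whose capacity equals its
// length, so append must allocate a new backing array and slice1 is untouched.
func isolatedAppend() []string {
	slice1 := fruit()
	slice3 := slice1[2:4:4] // len 2, cap 2
	slice3 = append(slice3, "NEW")
	return slice1
}

func main() {
	fmt.Println(sharedAppend())   // [Apple Orange Banana Grape OVERWRITE]
	fmt.Println(isolatedAppend()) // [Apple Orange Banana Grape Plum]
}
```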

A 'nil' slice is still valid:

var myNil []int
myNil = append(myNil, 1) // works fine, myNil is a valid slice

Slices and Strings

Strings are stored as bytes, and string literals are UTF-8 encoded. If we range over a string, we advance one rune (code point) at a time, so the index can jump by more than one:

// Declare a string with both Chinese and English characters.
s := "世界 means world"

// Iterate over each character in the string.
for i := range s {
	fmt.Printf("Index: %d\n", i)
}

Output:

Index: 0
Index: 3
Index: 6
Index: 7
Index: 8
Index: 9
Index: 10
Index: 11
Index: 12
Index: 13
Index: 14
Index: 15
Index: 16
Index: 17

Each rune may be 1-4 bytes.
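
A sketch that prints the rune and its encoded width next to each index, using the standard unicode/utf8 package. counts is a helper name invented here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the number of bytes and the number of runes in s.
func counts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "世界 means world"

	// range yields the byte index and the decoded rune.
	for i, r := range s {
		fmt.Printf("index %2d rune %q width %d\n", i, r, utf8.RuneLen(r))
	}

	fmt.Println(counts(s)) // 18 14
}
```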

Maps

Simple key-value structure.

// user defines a user in the program.
type user struct {
	name    string
	surname string
}

// Declare and make a map that stores values
// of type user with a key of type string.
users := make(map[string]user)

// Add key/value pairs to the map.
users["Roy"] = user{"Rob", "Roy"}
users["Ford"] = user{"Henry", "Ford"}
users["Mouse"] = user{"Mickey", "Mouse"}
users["Jackson"] = user{"Michael", "Jackson"}

// Iterate over the map.
for key, value := range users {
	fmt.Println(key, value)
}

NOTE: When you "range" over a map, the keys are returned in a random order.
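
If a stable order is needed, one common approach is to collect the keys, sort them, and index the map in that order. A sketch, with a made-up ages map standing in for the users map:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Illustrative data only.
	ages := map[string]int{"Roy": 41, "Ford": 83, "Mouse": 92}

	for _, k := range sortedKeys(ages) {
		fmt.Println(k, ages[k])
	}
}
```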

You can initialize your map directly:

// Declare and initialize the map with values.
users := map[string]user{
	"Roy":     {"Rob", "Roy"},
	"Ford":    {"Henry", "Ford"},
	"Mouse":   {"Mickey", "Mouse"},
	"Jackson": {"Michael", "Jackson"},
}

Any value type is acceptable, but you cannot use any type for the key, it must be hashable:

type users []user

// Declare and make a map that uses a slice of users as the key.
u := make(map[users]int)

// compiler error: invalid map key type users

You cannot define your own hashing function.

Methods, Interfaces, and Embedding

E.g. how to deal with change in your data. Need to build thin layers of abstraction so you can react to change without large changes to your code.

Methods

A function is called a 'method' when it is declared with a receiver. A receiver attaches behavior to types. In this example, we implement a method with a user receiver:

// user defines a user in the program.
type user struct {
	name  string
	email string
}

// notify implements a method with a value receiver.
func (u user) notify() {
	fmt.Printf("Sending User Email To %s<%s>\n",
		u.name,
		u.email)
}

There are two types of receivers: value receivers and pointer receivers. The previous example was a value receiver. A value receiver receives a copy of the calling structure, a pointer receiver shares the calling data structure.

// changeEmail implements a method with a pointer receiver.
func (u *user) changeEmail(email string) {
	u.email = email
}

CONSISTENCY RULES THE DAY. Only use one type of receiver for any type.

Invoking a method:

// Values of type user can be used to call methods
// declared with a value receiver.
bill := user{"Bill", "bill@email.com"}
bill.notify()

// Pointers of type user can also be used to call methods
// declared with a value receiver.
lisa := &user{"Lisa", "lisa@email.com"}
lisa.notify()


// Values of type user can be used to call methods
// declared with a pointer receiver.
bill.changeEmail("bill@hotmail.com")
bill.notify()

// Pointers of type user can be used to call methods
// declared with a pointer receiver.
lisa.changeEmail("lisa@hotmail.com")
lisa.notify()

This demonstrates, if you mix value/pointer receivers, that the calling value is referenced/dereferenced accordingly when a method is invoked.

You can make methods on arbitrary types:

type duration int64

const (
	nanosecond  duration = 1
	microsecond          = 1000 * nanosecond
	millisecond          = 1000 * microsecond
	second               = 1000 * millisecond
	minute               = 60 * second
	hour                 = 60 * minute
)

// setHours sets the specified number of hours.
func (d *duration) setHours(h float64) {
	*d = duration(h) * hour
}

// hours returns the duration as a floating point number of hours.
func (d duration) hours() float64 {
	h := d / hour    // whole hours (named h so it doesn't shadow the hour constant)
	nsec := d % hour // leftover nanoseconds
	return float64(h) + float64(nsec)*(1e-9/60/60)
}

myDuration := 1000 * hour
_ = myDuration.hours()

A method value (a method bound to a variable) has a function pointer and a data pointer. For pointer receivers, the data pointer holds the address of the original value. For value receivers, the data pointer points to a copy of the value, made when the method value is created. Example of how this matters:

// Declare a function variable for the method bound to the d variable.
// The function variable will get its own copy of d because the method
// is using a value receiver.
f1 := d.displayName

// Call the method via the variable.
f1()

// Change the value of d.
d.name = "Lisa"

// Call the method via the variable. We don't see the change.
f1()

// =========================================================================

fmt.Println("\nCall Pointer Receiver Method with Variable:")

// Declare a function variable for the method bound to the d variable.
// The function variable will get the address of d because the method
// is using a pointer receiver.
f2 := d.setAge

// Call the method via the variable.
f2(45)

// Change the value of d.
d.name = "Joan"

// Call the method via the variable. We see the change.
f2(45)
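
The fragment above leaves out the type of d. A self-contained sketch, assuming a hypothetical data type with a value-receiver displayName and a pointer-receiver setAge (both made to return strings here so the effect is easy to check):

```go
package main

import "fmt"

// data is a hypothetical type standing in for the one used in class.
type data struct {
	name string
	age  int
}

// displayName uses a value receiver.
func (d data) displayName() string {
	return d.name
}

// setAge uses a pointer receiver.
func (d *data) setAge(age int) string {
	d.age = age
	return fmt.Sprintf("%s is %d", d.name, d.age)
}

// demo binds both methods, then changes d before calling them.
func demo() (string, string) {
	d := data{name: "Bill"}

	f1 := d.displayName // copies d at this point
	f2 := d.setAge      // stores the address of d

	d.name = "Lisa"

	return f1(), f2(45)
}

func main() {
	fmt.Println(demo()) // Bill Lisa is 45
}
```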

Recommended layout: type -> factory functions -> methods.

Interfaces

Interfaces, together with composition, give us the ability to create thin layers of abstraction. They provide polymorphism. Interfaces are reference types: there is some header information, and the zero value is nil. An interface value is a two-word structure. The first word points into the "iTable", which stores the concrete type and the method pointers of that type that implement the interface. The second word points to the concrete value that implements the interface (either a value or a pointer).

Interfaces only declare behavior (no state). To implement an interface, a type just has to define all of the interface's methods.

// reader is an interface that defines the act of reading data.
type reader interface {
	read(b []byte) (int, error)
}

// file defines a system file.
type file struct {
	name string
}

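// read implements the reader interface for a file.
// It returns the number of bytes actually copied into b, which is less
// than len(s) when b is too small to hold all of s.
func (file) read(b []byte) (int, error) {
	s := "<rss><channel><title>Going Go Programming</title></channel></rss>"
	return copy(b, s), nil
}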

You can use interfaces to create polymorphic functions:

// retrieve can read any device and process the data.
func retrieve(r reader) error {
	data := make([]byte, 50)

	fmt.Println(len(data))
	n, err := r.read(data) // named n so the builtin len isn't shadowed
	if err != nil {
		return err
	}

	fmt.Println(string(data[:n]))
	return nil
}

retrieve accepts any value/pointer of a concrete type that implements the reader interface.

// pipe defines a named pipe network connection.
type pipe struct {
	name string
}

// read implements the reader interface for a network connection.
func (pipe) read(b []byte) (int, error) {
	s := `{name: "bill", title: "developer"}`
	return copy(b, []byte(s)), nil
}

func main() {

	// Create two values one of type file and one of type pipe.
	f := file{"data.json"}
	p := pipe{"cfg_service"}

	// Call the retrieve function for each concrete type.
	retrieve(f)
	retrieve(p)
}

A common naming convention for single-method interfaces is to append 'er' or 'or' to the method name, e.g. read method -> reader interface, write method -> writer interface, select method -> selector interface.

Method Sets

Dictates which methods belong to a value type, and which methods belong to a pointer type. These dem rules:

1. For values of type T, ONLY methods of value receivers belong to the type 
2. For pointers of type T, methods with value AND pointer receivers belong to the type

Why? Integrity. You can't always guarantee that you can get the address of a value that implements an interface.

// duration is a named type with a base type of int.
type duration int

// notify implements the notifier interface.
func (d *duration) notify() {
	fmt.Println("Sending Notification in", *d)
}

func main() {
	duration(42).notify()

	// ./example3.go:18: cannot call pointer method on duration(42)
	// ./example3.go:18: cannot take the address of duration(42)
}

In this case, *duration implements notify, but duration(42) does not have an address (it's a constant), so it does not.
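
A sketch of the same rules with an addressable variable. The notifier interface and send function are spelled out here as assumptions, since the notes only show the method; notify returns a string in this version so the result can be checked.

```go
package main

import "fmt"

// notifier is the interface assumed by the example above.
type notifier interface {
	notify() string
}

type duration int

// notify uses a pointer receiver, so it belongs to the method set of
// *duration only.
func (d *duration) notify() string {
	return fmt.Sprintf("notification in %d", *d)
}

// send accepts anything whose method set includes notify.
func send(n notifier) string {
	return n.notify()
}

func main() {
	d := duration(42)

	// d is an addressable variable, so the compiler rewrites this call
	// as (&d).notify().
	fmt.Println(d.notify())

	// *duration implements notifier.
	fmt.Println(send(&d))

	// send(d) would not compile: duration does not implement notifier
	// because notify has a pointer receiver.
}
```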

Embedding

Embedding is quasi-inheritance. You can add an inner type to a type, promoting its state and methods to the outer type:

// user defines a user in the program.
type user struct {
	name  string
	email string
}

// notify implements a method that can be called via
// a pointer of type user.
func (u *user) notify() {
	fmt.Printf("Sending user email To %s<%s>\n",
		u.name,
		u.email)
}

// admin represents an admin user with privileges.
type admin struct {
	user  // Embedded Type
	level string
}

func main() {

	// Create an admin user.
	ad := admin{
		user: user{
			name:  "john smith",
			email: "john@yahoo.com",
		},
		level: "super",
	}

	// We can access the inner type's method directly.
	ad.user.notify()

	// The inner type's method is promoted.
	ad.notify()
}

Package Oriented Design

How to write APIs in Go. How to organize your source code.

Every package is a reusable library. Each library provides one piece and only one piece.

The biggest problem that every team has when starting with Go. You need 4 packages (as a start):

1. Log: Where do I send logs?
2. Config: What happens when configuration changes? How is it deployed?
3. Trace: How to trace a request through the system.
4. Metrics: Assess the health of the system.

Every repo is a project. If the project is building binaries, it has three top-level folders:

  • vendor: All of the packages that are being used, but not owned by this project. It must OWN (not lease) all of the source code. You should only have to download one repo. Use go-vendor/godep to keep packages up to date. Don't like glide because you lease.
  • cmd: Has a subfolder for every product we are building. All the information is here for building the package/binary.
  • internal: Packages that can only be used internal to the project. The compiler specifically denies importing any packages under an "internal" directory from outside. Can only be imported by code within the project itself.

Have a 'kit' project (Ardan Labs has an example). It holds common tools that are used across many projects. Packages here need to have the highest level of decoupling. "The only thing you are allowed to import is the standard library"

Rules of thumb: no capital letters or underscores in folder names. Every package has one source code file that is named after it.

Identifiers

Identifiers in a package are either exported or unexported.

If the first letter of any identifier is a capital letter, it is exported and can be accessed from outside of its package.

If the first letter of any identifier is a lowercase letter, it is unexported and cannot be viewed from outside of its package.

Rules of thumb: if you are returning a type out of a package, make sure it is exported. In structs, separate exported and unexported fields.

Importing

An import is a physical location on disk, relative to your GOPATH. This includes both GOPATH/src/ AND your vendor folder.

It is idiomatic to separate std library imports from everything else. Some like std library, internal packages, and external vendored packages.

Go Gotchas

Must save your go files as UTF8.

Only two aliases in Go (nothing else can be aliased): rune == int32, byte == uint8.

Day 3 -- Composition, Error Handling, and Concurrency

Composition and Decoupling

Interface composition is key to decoupling packages in our code.

Notes: Day 1, solve the problem. After, revisit and decouple your code.

Rules of thumb:

  • Interfaces provide the highest form of decoupling when the concrete types used to implement them can remain opaque.
  • Decoupling means reducing the amount of intimate knowledge code must have with concrete types.
  • Interfaces with more than one method have more than one reason to change.
  • You must do your best to understand what could change and decouple those aspects of your code.
  • Uncertainty about change is a directive to STOP, not to GUESS.

Interface Pollution

  • Don't use an interface for the sake of using an interface.
  • Don't use an interface to generalize an algorithm.
  • Unless the user needs to provide an implementation or you have multiple implementations, question
  • Don't export an interface unless your user needs it. This includes interfaces for internal testing. Users can declare their own interfaces.
  • If it's not clear how an abstraction makes the code better, it probably doesn't.

If using an empty interface and it isn't being used for un/marshalling, you are doing it wrong.

Struct Composition

Package-Oriented Design

  • Start with a project that contains all the source code you need to build the products and services the project owns

Error Handling

You have the power to make people very happy or very miserable. Error handling is about respecting your users enough to let them make an informed decision about the integrity of the software.

Errors in Go are dealt with by an interface. We want to maintain decoupling of errors, so changes don't cascade across projects.

The default error type is an errorString:

type error interface {
	Error()	string
}

type errorString struct {
	s string
}

Implementations of the error interface should have pointer receivers. This supports common idioms for handling returned errors:

func (e *errorString) Error() string {
	return e.s
}

Sometimes to handle an error, you just need to log it. Errors should have enough context. This is a common idiom for error handling:

if err := webCall(); err != nil {
	fmt.Println(err)
	return
}

err != nil translates to "do we have a concrete-value in our error?". If no concrete-value was given to err, then we have a zero-value interface.

Error variables should be at the top of the source code file being used. If they are being used in multiple places, put them at the top of the file named after the package.

Error Variables

If you have more than one type of error, use error variables:

var (
	// ErrBadRequest is returned when there are problems with the request.
	ErrBadRequest = errors.New("Bad Request")

	// ErrMovedPermanently is returned when a 301/302 is returned.
	ErrMovedPermanently = errors.New("Moved Permanently")
)

// webCall performs a web operation.
func webCall(b bool) error {
	if b {
		return ErrBadRequest
	}

	return ErrMovedPermanently
}
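
Because error variables are single shared values, callers can compare against them directly. A runnable sketch; describe is a helper invented here to show the comparison, the rest mirrors the code above.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// ErrBadRequest is returned when there are problems with the request.
	ErrBadRequest = errors.New("Bad Request")

	// ErrMovedPermanently is returned when a 301/302 is returned.
	ErrMovedPermanently = errors.New("Moved Permanently")
)

// webCall performs a web operation.
func webCall(b bool) error {
	if b {
		return ErrBadRequest
	}
	return ErrMovedPermanently
}

// describe compares the returned error against the known error variables.
func describe(err error) string {
	switch err {
	case nil:
		return "ok"
	case ErrBadRequest:
		return "bad request"
	case ErrMovedPermanently:
		return "moved"
	default:
		return "unknown: " + err.Error()
	}
}

func main() {
	fmt.Println(describe(webCall(true)))  // bad request
	fmt.Println(describe(webCall(false))) // moved
}
```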

This should be enough in most cases for error handling. If you need more context, go ahead and implement the interface to create a new error type:

// An UnmarshalTypeError describes a JSON value that was
// not appropriate for a value of a specific Go type.
type UnmarshalTypeError struct {
	Value string       // description of JSON value
	Type  reflect.Type // type of Go value it could not be assigned to
}

// Error implements the error interface.
func (e *UnmarshalTypeError) Error() string {
	return "json: cannot unmarshal " + e.Value + " into Go value of type " + e.Type.String()
}

The TYPE of the error provides the additional context. The .(type) builtin only works in switch statements:

switch e := err.(type) {
case *UnmarshalTypeError:
	fmt.Printf("UnmarshalTypeError: Value[%s] Type[%v]\n", e.Value, e.Type)
case *InvalidUnmarshalError:
	fmt.Printf("InvalidUnmarshalError: Type[%v]\n", e.Type)
default:
	fmt.Println(err)
}
return

If you make your own error types, you are walking away from decoupled design. Sometimes things can't be decoupled.

Concurrency

Nasty stuff. Concurrency is about managing a lot of things at once. Parallelism is about doing a lot of things at once. Responsibility for concurrency does not fall on Go. Go did not make maintaining concurrency easier.

How are we sympathetic when writing concurrent applications? Less is always more. We have to work around scheduling to get the best performance.

The operating system's job is to take a thread, schedule it to a core, and give as many threads a share of CPU time as possible.

Context switches are expensive. The more time spent switching between threads, the less work being done.

Rules of thumb: From the beginning of testing, your software needs to shut down cleanly.

Goroutines

A goroutine can be any function or method. Simply drop go in front of a function:

go func() {
	uppercase()
}() // the trailing () calls the function literal; go requires a call

Note, this creates a closure in the local scope.

A goroutine has 2KB of stack memory to start.

Goroutines run within the Go scheduler to share a pool of threads on the operating system. The runtime will abort the program with a fatal deadlock error if all goroutines are asleep at the same time.
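A minimal sketch of launching goroutines and waiting for them, assuming a sync.WaitGroup is used to know when they finish. The `uppercaseAll` function is a hypothetical name for this illustration. Without the wait, the calling function could return before any goroutine has run:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// uppercaseAll upper-cases each word in its own goroutine and waits for
// all of them to finish before returning.
func uppercaseAll(words []string) []string {
	out := make([]string, len(words))

	var wg sync.WaitGroup
	wg.Add(len(words))

	for i, w := range words {
		// Pass i and w as arguments so each goroutine has its own copy.
		go func(i int, w string) {
			defer wg.Done()
			out[i] = strings.ToUpper(w) // each goroutine writes its own index
		}(i, w)
	}

	// Block until every goroutine has called Done.
	wg.Wait()
	return out
}

func main() {
	fmt.Println(uppercaseAll([]string{"go", "is", "fun"}))
}
```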

Go Scheduler

When a Go program launches, it is given a logical processor for each physical core on the machine. It is a cooperating scheduler that runs in user mode, but tries to look/feel like a preemptive scheduler.
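A quick way to check how many logical processors the scheduler is using, as a sketch. Note that `runtime.NumCPU` reports the logical CPUs visible to the process, which may differ from the physical core count on machines with hardware threads. The `logicalProcessors` helper is a hypothetical name:

```go
package main

import (
	"fmt"
	"runtime"
)

// logicalProcessors reports the current number of logical processors.
// Passing 0 to GOMAXPROCS queries the value without changing it.
func logicalProcessors() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("CPUs visible to the process:", runtime.NumCPU())
	fmt.Println("Logical processors in use:", logicalProcessors())
}
```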

The Global Run Queue holds goroutines that are ready to run but have not yet been assigned to a logical processor. The Local Run Queue holds goroutines that are waiting for time to run on a thread.

If a goroutine makes a call that would cause it to go to sleep, the scheduler creates a new thread to hold the sleeping routine, then swaps in a waiting goroutine from the LRQ. When the sleeping goroutine reactivates, it will be assigned back to the original thread.

Day 4 -- Concurrency, Patterns, Testing

Concurrent cont.

Recap. Concurrency == managing a lot of things, parallelism == doing a lot of things. You cannot put it all on the runtime/scheduler.

Your software needs to be able to start up/shut down cleanly.

You are not allowed to make a goroutine if you don't know how or when it will be terminated.

When you need to access shared state, use atomics or mutexes. Atomics apply to 4-8 bytes of memory and are the fastest option. Mutexes can do more, but are slower (regular mutex > RWMutex).
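A sketch comparing the two approaches on a shared counter. Both function names are hypothetical. The atomic version only works because the shared state is a single 8-byte integer; the mutex version would also work if the critical section touched several fields:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomic increments a shared int64 from n goroutines using sync/atomic.
func countAtomic(n int) int64 {
	var counter int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

// countMutex does the same work, guarding the counter with a mutex.
func countMutex(n int) int {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countAtomic(100), countMutex(100))
}
```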

Channels

Orchestration == channels. Do not treat channels like a queue. That is not its purpose. It is essentially passing data across program boundaries.

Channels provide guarantees in your code for when data is passed from one goroutine to another. Channels can be buffered or unbuffered.

For unbuffered channels, there is a 100% guarantee that the data has been received. The sender and the receiver are each blocked until the data has changed hands.
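A minimal sketch of the unbuffered guarantee. The `handoff` function is a hypothetical name. The send does not complete until the receiving goroutine has taken the value, so once the send returns the sender knows the data was received:

```go
package main

import "fmt"

// handoff sends v over an unbuffered channel to a goroutine that doubles
// it, then returns the doubled value.
func handoff(v int) int {
	ch := make(chan int)   // unbuffered: send and receive meet in time
	done := make(chan int) // carries the result back

	go func() {
		got := <-ch // blocks until the sender arrives
		done <- got * 2
	}()

	ch <- v // blocks until the goroutine above has received v
	return <-done
}

func main() {
	fmt.Println(handoff(21)) // 42
}
```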

For buffered channels, there is not 100% guarantee. A good analogy: You have a letter for a coworker's husband. At work, you visit your coworker's office. She isn't there, but her laptop is open on her desk. You leave the letter propped up against her laptop, trusting she will return and find it later. Buffered channels are meant for continuity, not performance.

You shouldn't have a buffered channel with a size greater than one.
If I see you have a buffer greater than one, we're taking a walk. - Bill

Use the buffer size to apply back-pressure, set up barriers to limit the rate of data input. A little back-pressure is good, means you're running hot. A lot of back-pressure can blow up a system. If you need to handle more data, you can create more goroutines to handle it, don't make your buffers bigger. Good to analyze how many goroutines are blocked on a particular channel. You cannot have a channel if you don't know what will happen when a send blocks.

Buffer bloat. If you insist on not discarding data, buffers will become congested. Instead of buffering and retaining every bit of data, build buffers for continuity. Make them small. Make it about continuity, not performance.
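One way to decide up front what happens when a send would block is a select with a default case, sketched below. The `trySend` helper is a hypothetical name, and dropping the value is only one possible policy; it is shown here as an assumption, not as the recommended answer for every system:

```go
package main

import "fmt"

// trySend attempts a non-blocking send. It reports false when the buffer
// is full, which is the signal that back-pressure has built up.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// The send would block; the caller decides whether to drop,
		// retry, or shed load.
		return false
	}
}

func main() {
	ch := make(chan int, 1) // small buffer for continuity

	fmt.Println(trySend(ch, 1)) // true: buffer had room
	fmt.Println(trySend(ch, 2)) // false: buffer is full, nobody receiving
}
```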

CHANNELS ARE NOT QUEUES.

Once a channel is closed, it cannot be reopened. Sending on a closed channel will cause a panic. Receiving on a closed channel returns immediately: any buffered data is delivered first, then the zero value.

// Create an unbuffered channel.
court := make(chan int)

// Wait for the ball to be hit back to us.
ball, ok := <-court
if !ok {

	// If the channel was closed we won.
	fmt.Printf("Player %s Won\n", name)
	return
}

ok is false when the receive (<-) came from a closed channel with no data left; it is true when a real value was received.
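A small sketch of the closed-channel rules described above. Both function names are hypothetical. It shows buffered values being drained after close, the ok flag turning false, and the panic from sending on a closed channel (recovered here only so the example can report it):

```go
package main

import "fmt"

// drain closes a buffered channel, reads what is left, then does one
// more receive to show the ok flag on a closed, empty channel.
func drain() (values []int, ok bool) {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	close(ch)

	// range stops once the channel is closed and empty.
	for v := range ch {
		values = append(values, v)
	}

	// Returns immediately with the zero value and ok == false.
	_, ok = <-ch
	return values, ok
}

// sendOnClosed reports whether sending on a closed channel panicked.
func sendOnClosed() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()

	ch := make(chan int, 1)
	close(ch)
	ch <- 1 // panics: send on closed channel
	return false
}

func main() {
	values, ok := drain()
	fmt.Println(values, ok)     // [1 2] false
	fmt.Println(sendOnClosed()) // true
}
```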

To create a buffered channel:

// Set the number of routines and inserts.
const routines = 10
const inserts = routines * 2

// Buffered channel to receive information about any possible insert.
ch := make(chan result, inserts)

We create a buffered channel when we know x number of goroutines will come by to pick up x buffered results.
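A sketch of that pattern, simplified to one result per goroutine. The `result` fields and the `insertAll` function are assumptions for illustration. Because the buffer has exactly one slot per expected result, no sending goroutine blocks, and the caller knows exactly how many receives to perform:

```go
package main

import "fmt"

// result is a hypothetical type describing the outcome of one insert.
type result struct {
	id  int
	err error
}

// insertAll starts one goroutine per insert and collects every result.
func insertAll(routines int) []result {
	// One buffer slot per goroutine, so no send ever blocks.
	ch := make(chan result, routines)

	for i := 0; i < routines; i++ {
		go func(id int) {
			// Real work would happen here; this sketch always succeeds.
			ch <- result{id: id}
		}(i)
	}

	// Receive exactly as many results as goroutines were started.
	results := make([]result, 0, routines)
	for i := 0; i < routines; i++ {
		results = append(results, <-ch)
	}
	return results
}

func main() {
	fmt.Println(len(insertAll(10)), "results received")
}
```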

You have two choices when shutting your program down. os.Exit(int) when you just need to shutdown, and panic(err) when you need a stack trace (something went wrong).

Defers create allocations on the heap. However, defer automatically cleans up this allocation. Use it if it is going to make your code simpler.

Pools

To manage many concurrent tasks that make requests, we can use pooling to control the number of connections being made. Here is an example of a goroutine pool that limits work to a fixed number of goroutines:

type Worker interface {
	Work()
}

type Task struct {
	work chan Worker
	wg   sync.WaitGroup
}

func New(maxGoroutines int) *Task {
	t := Task{
		work: make(chan Worker),
	}

	t.wg.Add(maxGoroutines)
	for i := 0; i < maxGoroutines; i++ {
		go func() {
			//loop terminates when t.work closes
			for w := range t.work {
				w.Work()
			}
			t.wg.Done()
		}()
	}

	return &t
}

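func (t *Task) Do(w Worker) {
	// To know the back-pressure for this pool, you could increment a
	// pending counter here (e.g. with sync/atomic) and decrement it
	// after the send completes.

	// This blocks until a goroutine is available to receive the work.
	t.work <- w
}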

func (t *Task) Shutdown() {
	close(t.work)
	t.wg.Wait()
}
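A usage sketch for the pool above, with the pool repeated in compact form so the example runs on its own. The `counter` type and the `runJobs` function are hypothetical names added for illustration:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Worker interface {
	Work()
}

type Task struct {
	work chan Worker
	wg   sync.WaitGroup
}

// New starts maxGoroutines goroutines that pull work until Shutdown.
func New(maxGoroutines int) *Task {
	t := &Task{work: make(chan Worker)}
	t.wg.Add(maxGoroutines)
	for i := 0; i < maxGoroutines; i++ {
		go func() {
			defer t.wg.Done()
			for w := range t.work { // ends when t.work is closed
				w.Work()
			}
		}()
	}
	return t
}

// Do blocks until a pool goroutine accepts the work.
func (t *Task) Do(w Worker) { t.work <- w }

// Shutdown stops accepting work and waits for in-flight work to finish.
func (t *Task) Shutdown() {
	close(t.work)
	t.wg.Wait()
}

// counter is a hypothetical Worker that records that it ran.
type counter struct{ n *int64 }

func (c counter) Work() { atomic.AddInt64(c.n, 1) }

// runJobs submits jobs to a pool and returns how many were executed.
func runJobs(jobs, maxGoroutines int) int64 {
	var done int64
	pool := New(maxGoroutines)
	for i := 0; i < jobs; i++ {
		pool.Do(counter{n: &done})
	}
	pool.Shutdown()
	return atomic.LoadInt64(&done)
}

func main() {
	fmt.Println(runJobs(20, 4), "jobs executed")
}
```

Since Shutdown waits on the WaitGroup, every submitted job has finished by the time runJobs reads the counter.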

Product level example from Bill's kit

Context

Will be std library in 1.7. Context is passed from call to call to give the chain context:

type Worker interface {
	Work(context interface{}, id int)
}

type doWork struct {
	context interface{}
	do      Worker
}

// Stat contains information about the pool. DON'T BE BLIND SHEEPLE
type Stat struct {
	Routines    int64 // Current number of routines.
	Pending     int64 // Pending number of routines waiting to submit work.
	Active      int64 // Active number of routines in the work pool.
	Executed    int64 // Number of pieces of work executed.
	MaxRoutines int64 // High water mark of routines the pool has been at.
}


// OptEvent defines a handler used to provide events.
type OptEvent struct {
	Event func(context interface{}, event string, format string, a ...interface{})
}

// Config provides configuration for the pool.
type Config struct {
	MinRoutines func() int // Initial and minimum number of routines always in the pool.
	MaxRoutines func() int // Maximum number of routines we will ever grow the pool to.

	// *************************************************************************
	// ** Not Required, optional                                              **
	// *************************************************************************

	OptEvent
}

Testing and Benchmarking

Testing

Doesn't matter exactly what pattern you use, but be consistent. Tests are about validating that the API works for the user. Recommend focusing on the exported API when doing tests. As such, try not to make a separate test package.

Go's test tooling is straightforward. To add tests for a particular file, append _test to the file name. Ex: main.go -> main_test.go.

To test a specific function MyFunction, create another function with Test prepended: TestMyFunction. Even if it's an unexported function myFunction, the letter after Test must be capitalized: TestMyFunction.

func TestDownload(t *testing.T) {
	t.Log("given the need to test downloading content.")

	resp, err := http.Get(url)
	// Use Fatal or Fatalf to flag a failed test and move on to the next one.
	if err != nil {
		t.Fatalf("\t%s\tShould be able to make the Get call : %v", failed, err)
	}
	// Close the body; this also keeps resp from being an unused variable.
	defer resp.Body.Close()
}

You can run tests using the builtin tool go test. Vanilla go test will just give pass/fail, adding -v adds all logging output.

Benchmarking

Same as with testing. Benchmarks go in the _test file. To benchmark a function, add Benchmark in front of the name (same rules as before):

var fs string

func BenchmarkSprintf(b *testing.B) {
	number := 10
	var s string

	for i := 0; i < b.N; i++ {
		s = fmt.Sprintf("%d", number)
	}

	fs = s
}

Notice the global fs string. We include this to guarantee that the compiler doesn't optimize the function out of the code.
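Outside of go test, the same benchmark function can be driven from a regular program with testing.Benchmark, which is handy for a quick experiment. This is a sketch; the `benchSprintf` and `iterations` names are hypothetical, and timings will vary by machine:

```go
package main

import (
	"fmt"
	"testing"
)

// fs is a package-level sink so the compiler cannot discard the result.
var fs string

// benchSprintf has the same body as a BenchmarkSprintf in a _test file.
func benchSprintf(b *testing.B) {
	number := 10
	var s string

	for i := 0; i < b.N; i++ {
		s = fmt.Sprintf("%d", number)
	}

	fs = s
}

// iterations runs the benchmark and reports how many iterations were timed.
func iterations() int {
	return testing.Benchmark(benchSprintf).N
}

func main() {
	// BenchmarkResult prints iterations and ns/op.
	fmt.Println(testing.Benchmark(benchSprintf))
}
```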

Profiling

Profiling a profile.go executable. Examples profiling CPU and memory:

go test -run none -bench . -cpuprofile cpu.out
go tool pprof profile.test cpu.out

go test -run none -bench . -memprofile mem.out
go tool pprof -alloc_space profile.test mem.out
go tool pprof -alloc_space -base base.heap memory_trace current.heap

To produce garbage collection traces, set GODEBUG

export GODEBUG=gctrace=1

Inside the pprof tool, two quick commands:

  • top displays the top users of memory/cpu
  • list <func> shows the code in question and marks usage line-by-line

Fuzzing

Getting your code to panic. Fuzzing throws mutations of input into your program until it causes something to crash. Create a file called fuzzer.go and add a function called Fuzz:

func Fuzz(data []byte) int {
	//call the code you want to try to get to panic
	r, _ := http.NewRequest("POST", "/process", bytes.NewBuffer(data))
	w := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(w, r)

	if w.Code != 200 {
		// Report the data that produced this error as not interesting
		return 0
	}

	// Report the data that did not cause an error as interesting
	return 1
}

To use fuzz, get the fuzz tool

go get github.com/dvyukov/go-fuzz/go-fuzz
go get github.com/dvyukov/go-fuzz/go-fuzz-build

Create the directories ./workdir/corpus in your package directory. Add a file input.txt with sample input:

ADMxxBill,ADM42Lisa,DEV35John,USR46Eduardo

Then you have to build the fuzz support for your package

go-fuzz-build path/to/your/gopath/package

It will build an api-fuzz.zip file. Run the fuzz

go-fuzz -bin=./api-fuzz.zip -dup -workdir=workdir/corpus

If it crashes, it will add a few files under ./workdir/corpus/crashers: input that caused the function to crash, the output, and quoted. Use the contents to identify the input that caused the panic, and rerun your tests to cover the case.

Remove the old crashers folder, and the suppressions folder if desired. Also delete the old api-fuzz.zip file. Then rerun fuzzing.

Debugging

The GODEBUG environment variable allows us to enable various debugging options. For example, to emit a Go scheduler trace every second (the value is in milliseconds):

export GODEBUG=schedtrace=1000

Stack traces are important. Surprise.
