@faiface
Created October 2, 2018 19:14

Problem: the "context" package

More than a year ago, I wrote a blog post titled Context Should Go Away For Go 2, which received a fair amount of support and response. In that post, I argued that the "context" package is a bad idea because it's too infectious.

As explained in the blog post, "context" spreads so much, and in such an unhealthy fashion, because it solves the problem of cancelling long-running procedures.

I promised to follow the blog post (which only complained about the problem) with a solution. Considering the recent progress around Go 2, I decided it's the right time for the follow-up. So, here it is!

Solution: bake cancellation into Go 2

My proposed solution is to bake cancellation into the language and thus avoid the need to pass a context around just to be able to cancel long-running procedures. The "context" package could still be kept for the purpose of goroutine-local data; that use does not cause it to spread, so that's fine.

In the following sections I'll explain how exactly the baked-in cancellation would work.

One quick point before we start: this proposal does not make it possible to "kill" a goroutine - the cancellation is always cooperative.

Examples to get the idea

I'll explain the proposal in a series of short, very contrived examples.

We start a goroutine:

go longRunningThing()

In Go 1, the go keyword starts a goroutine but doesn't return anything. I propose that it return a function which, when called, cancels the spawned goroutine.

cancel := go longRunningThing()
cancel()

We started a goroutine and then cancelled it immediately.

Now, as I've said, cancellation must be a cooperative operation. The longRunningThing function needs to handle its own cancellation on request. What could that look like?

func longRunningThing() {
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    }
}

This longRunningThing function does not cooperate. It takes 5 seconds no matter what. That's the first takeaway: cancellation is optional - if a goroutine does not support cancellation, it remains unaffected by it. Here's how we add the support:

func longRunningThing() {
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    cancelling:
        fmt.Println("cancelled")
    }
}

I propose the select statement gets an additional branch called cancelling (a new keyword) which gets triggered whenever the goroutine is scheduled for cancellation, i.e. when the function returned from the go statement gets called.

The above program would therefore print:

cancelled

What if the long-running thing spawns some goroutines itself? Does it have to handle their cancellation explicitly? No, it doesn't. All goroutines spawned inside a cancelled goroutine get cancelled first and the originally cancelled goroutine starts its cancellation only after all its 'child' goroutines finish.

For example:

func longRunningThing() {
    go anotherLongRunningThing()
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    cancelling:
        fmt.Println("cancelled")
    }
}

func anotherLongRunningThing() {
    select {
    case <-time.After(3 * time.Second):
        fmt.Println("child finished")
    cancelling:
        fmt.Println("child cancelled")
    }
}

This time, running:

cancel := go longRunningThing()
cancel()

prints out:

child cancelled
cancelled

This feature exists because child goroutines usually communicate with the parent goroutine. It's good for the parent goroutine to stay fully intact until the child goroutines finish.
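To make the ordering concrete, here is a rough emulation in today's Go. It is only a sketch under my own assumptions: the parent creates a separate context for its child, cancels it by hand, and waits on a channel before handling its own cancellation. The names run, record, and childDone are made up for the illustration; under the proposal the runtime would do all of this bookkeeping.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// run cancels a parent goroutine that has spawned one child and returns the
// messages in the order they were recorded.
func run() []string {
	var (
		mu     sync.Mutex
		events []string
	)
	record := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		events = append(events, s)
	}

	child := func(ctx context.Context) {
		select {
		case <-time.After(3 * time.Second):
			record("child finished")
		case <-ctx.Done():
			record("child cancelled")
		}
	}

	parent := func(ctx context.Context) {
		// The child gets its own context so the parent decides when it is cancelled.
		childCtx, cancelChild := context.WithCancel(context.Background())
		defer cancelChild()
		childDone := make(chan struct{})
		go func() {
			defer close(childDone)
			child(childCtx)
		}()

		select {
		case <-time.After(5 * time.Second):
			record("finished")
		case <-ctx.Done():
			cancelChild() // stage 1: cancel the child...
			<-childDone   // ...and wait for it to finish
			record("cancelled")
		}
	}

	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		parent(ctx)
	}()
	cancel()
	<-done
	return events
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
	// prints "child cancelled", then "cancelled"
}
```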

Now, let's say that instead of spawning another goroutine, longRunningThing needs to call anotherLongRunningThing three times sequentially, like this (anotherLongRunningThing remains unchanged):

func longRunningThing() {
    anotherLongRunningThing()
    anotherLongRunningThing()
    anotherLongRunningThing()
}

This time, longRunningThing doesn't handle cancellation at all. Still, cancellation propagates to all nested calls. Cancelling this longRunningThing would print:

child cancelled
child cancelled
child cancelled

All anotherLongRunningThing calls got cancelled one by one.

What if anotherLongRunningThing can fail, or just wants to signal it was cancelled instead of finishing successfully? We can make it return an error:

func anotherLongRunningThing() error {
    select {
    case <-time.After(3 * time.Second):
        return nil
    cancelling:
        return errors.New("cancelled")
    }
}

Now we update the longRunningThing to handle the error (using the new error handling proposal):

func longRunningThing() error {
    check anotherLongRunningThing()
    check anotherLongRunningThing()
    check anotherLongRunningThing()
    return nil
}

In this version, longRunningThing returns the first error it encounters while executing anotherLongRunningThing three times sequentially. But how do we receive the error? We spawned the function in a goroutine and there's no way to get the return value of a goroutine in Go 1.

Here's the last piece of the proposal: the function returned from the go statement has the same return values as the function that was set to run in the goroutine. So, in our case, the cancel function has type func() error:

cancel := go longRunningThing()
err := cancel()
fmt.Println(err)

This prints:

cancelled

However, if we waited 10 seconds before cancelling the goroutine (longRunningThing takes 9 seconds), we'd get no error, because the function finished successfully:

cancel := go longRunningThing()
time.Sleep(10 * time.Second)
err := cancel()
fmt.Println(err)

Prints out:

<nil>

And lastly, say we have a function called getGoods which contacts some service, gets some goods back, and sends them on a channel. We only want to wait for the goods for 5 seconds, no more. Here's how we implement a timeout:

goods := make(chan Good)
cancel := go getGoods(goods)

select {
case x := <-goods:
    // enjoy the goods
case <-time.After(5 * time.Second):
    err := cancel()
    return errors.Wrap(err, "timed out")
}

And that is the end of this series of short examples. I've shown all of the proposed features. In the next section, I'll describe the features more carefully and explain precisely how they'd work.

Details

I propose to extend the go statement to return a function that takes no arguments and whose return type is the same as the return type of the function called in the go statement, including multiple return values. Secondly, I propose to extend the select statement with an optional cancelling branch.

For example:

var (
    f1 func() float64           = go math.Sqrt(100)
    f2 func() (*os.File, error) = go os.Open("file.txt")
    f3 func() int               = go rand.Intn(20)
)

Calling the function returned from the go statement blocks until the spawned goroutine returns, then returns exactly what the spawned function returned. Calling the returned function multiple times has no additional effect and always returns the same results.

Calling the functions assigned above results in this:

fmt.Println(f1()) // 10
fmt.Println(f2()) // &{0xc4200920f0} <nil>
fmt.Println(f3()) // 17

// we can call them as many times as we want
fmt.Println(f3(), f3(), f3()) // 17 17 17

Furthermore, calling the returned function causes the spawned goroutine to start a cancellation process. The cancellation process has two stages:

  1. Cancelling all child goroutines (goroutines spawned inside the goroutine that is being cancelled). This stage finishes when all child goroutines finish cancellation.
  2. Switching into the cancelling mode. In this mode, all select statements always select the cancelling branch if present. If not present, they function normally.

Eventually the goroutine returns. The call to the function returned from the go statement then resumes and returns the goroutine's return values.
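The semantics of the returned function can be approximated in today's Go (1.18 or later, for generics) with a small helper. Spawn is a hypothetical name of my own, and the sketch makes two simplifying assumptions: the spawned function takes a context.Context explicitly, and the first stage (cancelling children before the parent) is not reproduced, since goroutines sharing the same context are all cancelled at once.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Spawn is a hypothetical helper that runs f in a new goroutine and returns a
// function which requests cancellation, waits for f to return, and hands back
// f's result. Calling it again returns the same result.
func Spawn[T any](f func(ctx context.Context) T) func() T {
	ctx, cancel := context.WithCancel(context.Background())
	var result T
	done := make(chan struct{})
	go func() {
		defer close(done)
		result = f(ctx)
	}()
	return func() T {
		cancel() // request cancellation; harmless if f has already returned
		<-done   // block until the goroutine returns
		return result
	}
}

// longRunningThing handles cancellation the Go 1 way, through ctx.Done().
func longRunningThing(ctx context.Context) error {
	select {
	case <-time.After(5 * time.Second):
		return nil
	case <-ctx.Done():
		return errors.New("cancelled")
	}
}

func main() {
	cancel := Spawn(longRunningThing)
	fmt.Println(cancel()) // prints "cancelled"
	fmt.Println(cancel()) // prints "cancelled" again, without any extra work
}
```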

Other uses of the mechanism

The mechanism can also be used for other purposes. One that comes to my mind is to use the functions returned from the go statement as futures. Indeed, this is a common pattern in Go:

ch := make(chan T)
go func() {
    ch <- function()
}()
// some other code
x := <-ch

This whole boilerplate is here just to execute function concurrently and use its return value later. With my proposal, we could simplify that code like this:

future := go function()
// some other code
x := future()

Of course, this only works if function doesn't support cancellation, since calling future() also requests cancellation; but most functions shouldn't support it, and those that do should document it.
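Until something like this lands in the language, the future pattern can be wrapped in a generic helper (Go 1.18 or later). Future is a hypothetical name, not a standard library function, and the sketch assumes a function with a single return value. It uses sync.Once to remember the result, so the returned function can be called any number of times.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// Future is a hypothetical helper that runs f concurrently and returns a
// function which blocks until f is done and then yields its result.
func Future[T any](f func() T) func() T {
	ch := make(chan T, 1) // buffered so the goroutine can exit even if nobody asks
	go func() { ch <- f() }()

	var (
		once  sync.Once
		value T
	)
	return func() T {
		once.Do(func() { value = <-ch }) // receive exactly once, then reuse
		return value
	}
}

func main() {
	future := Future(func() float64 { return math.Sqrt(100) })
	// some other code
	fmt.Println(future(), future()) // prints "10 10"
}
```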
