@faiface
Created October 2, 2018 11:10

Problem: the "context" package

More than a year ago, I wrote a blog post titled Context Should Go Away For Go 2, which received a fair amount of support and response. In that post, I argued that the "context" package is a bad idea because it's too infectious.

As explained in the blog post, "context" spreads so much, and in such an unhealthy fashion, because it solves the problem of cancelling long-running procedures.

I promised to follow the blog post (which only complained about the problem) with a solution. Considering the recent progress around Go 2, I decided that now is the right time for the follow-up. So, here it is!

Solution: bake the cancellation into Go 2

My proposed solution is to bake cancellation into the language and thus avoid the need to pass the context around just to be able to cancel long-running procedures. The "context" package could still be kept for goroutine-local data; that purpose does not cause it to spread, so that's fine.

In the following sections I'll explain how exactly the baked-in cancellation would work.

One quick point before we start: this proposal does not make it possible to "kill" a goroutine - the cancellation is always cooperative.

Examples to get the idea

I'll explain the proposal in a series of short, very contrived examples.

We start a goroutine:

go longRunningThing()

In Go 1, the go keyword starts a goroutine but doesn't return anything. I propose that it should return a function which, when called, cancels the spawned goroutine.

cancel := go longRunningThing()
cancel()

We started a goroutine and then cancelled it immediately.

Now, as I've said, cancellation must be a cooperative operation. The longRunningThing function needs to carry out its own cancellation on request. What could that look like?

func longRunningThing() {
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    }
}

This longRunningThing function does not cooperate. It takes 5 seconds no matter what. Here's how we can improve it:

func longRunningThing() {
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    cancelling:
        fmt.Println("cancelled")
    }
}

I propose the select statement gets an additional branch called cancelling (a new keyword) which gets triggered whenever the goroutine is scheduled for cancellation, i.e. when the function returned from the go statement gets called.

The above program would therefore print:

cancelled
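
For comparison, here's a minimal sketch of how the same behaviour is usually written in Go 1 with the "context" package. The ctx.Done() case stands in for the proposed cancelling branch. The demo helper is an assumption made purely for illustration; it is not part of the proposal.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// longRunningThing cooperates with cancellation: the ctx.Done() case plays
// the role of the proposed "cancelling" branch.
func longRunningThing(ctx context.Context) string {
	select {
	case <-time.After(5 * time.Second):
		return "finished"
	case <-ctx.Done():
		return "cancelled"
	}
}

// demo (a hypothetical helper) starts the goroutine, cancels it right away
// and waits for its outcome, mirroring "cancel := go f(); cancel()".
func demo() string {
	ctx, cancel := context.WithCancel(context.Background())
	result := make(chan string, 1) // buffered so the goroutine never blocks
	go func() { result <- longRunningThing(ctx) }()
	cancel()
	return <-result
}

func main() {
	fmt.Println(demo())
}
```

Note that the context has to appear in the signature of longRunningThing, which is exactly the spreading this proposal wants to avoid.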

What if the long-running thing spawns some goroutines itself? Does it have to handle their cancellation explicitly? No, it doesn't. All goroutines spawned inside a cancelled goroutine get cancelled first and the originally cancelled goroutine starts its cancellation only after all its 'child' goroutines finish.

For example:

func longRunningThing() {
    go anotherLongRunningThing()
    select {
    case <-time.After(5 * time.Second):
        fmt.Println("finished")
    cancelling:
        fmt.Println("cancelled")
    }
}

func anotherLongRunningThing() {
    select {
    case <-time.After(3 * time.Second):
        fmt.Println("child finished")
    cancelling:
        fmt.Println("child cancelled")
    }
}

This time, running:

cancel := go longRunningThing()
cancel()

prints out:

child cancelled
cancelled

This feature exists because child goroutines usually communicate with the parent goroutine. It's good for the parent goroutine to stay fully intact until the child goroutines finish.
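
To make the ordering concrete, here's a rough Go 1 sketch of the "children first" rule, assuming a shared context and an explicit childDone channel. Under the proposal, the runtime would do this waiting implicitly. The events channel and the run helper are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// anotherLongRunningThing reports its outcome on the events channel.
func anotherLongRunningThing(ctx context.Context, events chan<- string) {
	select {
	case <-time.After(3 * time.Second):
		events <- "child finished"
	case <-ctx.Done():
		events <- "child cancelled"
	}
}

// longRunningThing starts a child and, once cancelled, waits for the child
// to finish before handling its own cancellation.
func longRunningThing(ctx context.Context, events chan<- string) {
	childDone := make(chan struct{})
	go func() {
		defer close(childDone)
		anotherLongRunningThing(ctx, events)
	}()
	select {
	case <-time.After(5 * time.Second):
		events <- "finished"
	case <-ctx.Done():
		<-childDone // the child finishes first
		events <- "cancelled"
	}
}

// run (a hypothetical helper) cancels the parent immediately and collects
// the events in the order they were reported.
func run() []string {
	ctx, cancel := context.WithCancel(context.Background())
	events := make(chan string, 2) // buffered so senders never block
	done := make(chan struct{})
	go func() {
		defer close(done)
		longRunningThing(ctx, events)
	}()
	cancel()
	<-done
	close(events)
	var out []string
	for e := range events {
		out = append(out, e)
	}
	return out
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
}
```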

Now, let's say that longRunningThing, instead of spawning another goroutine, needs to execute anotherLongRunningThing three times sequentially, like this (anotherLongRunningThing remains unchanged):

func longRunningThing() {
    anotherLongRunningThing()
    anotherLongRunningThing()
    anotherLongRunningThing()
}

This time, longRunningThing doesn't handle the cancellation at all. Still, cancellation propagates to all nested calls. Cancelling this longRunningThing would print:

child cancelled
child cancelled
child cancelled

All anotherLongRunningThing calls got cancelled one by one.
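
In Go 1, getting the same propagation means threading a context through longRunningThing even though it never inspects it. The following sketch assumes the functions return their outcome instead of printing it, and the run helper cancels up front so the result is deterministic; both are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

func anotherLongRunningThing(ctx context.Context) string {
	select {
	case <-time.After(3 * time.Second):
		return "child finished"
	case <-ctx.Done():
		return "child cancelled"
	}
}

// longRunningThing never looks at ctx itself; it only passes it along so
// that each nested call can observe the cancellation.
func longRunningThing(ctx context.Context) []string {
	var out []string
	for i := 0; i < 3; i++ {
		out = append(out, anotherLongRunningThing(ctx))
	}
	return out
}

// run (a hypothetical helper) cancels before calling, so every nested call
// sees an already cancelled context.
func run() []string {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return longRunningThing(ctx)
}

func main() {
	for _, line := range run() {
		fmt.Println(line)
	}
}
```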

What if anotherLongRunningThing can fail, or just wants to signal it was cancelled instead of finishing successfully? We can make it return an error:

func anotherLongRunningThing() error {
    select {
    case <-time.After(3 * time.Second):
        return nil
    cancelling:
        return errors.New("cancelled")
    }
}

Now we update the longRunningThing to handle the error (using the new error handling proposal):

func longRunningThing() error {
    check anotherLongRunningThing()
    check anotherLongRunningThing()
    check anotherLongRunningThing()
    return nil
}

In this version, longRunningThing returns the first error it encounters while executing anotherLongRunningThing three times sequentially. But how do we receive the error? We spawned the function in a goroutine and there's no way to get the return value of a goroutine in Go 1.

Here comes the last thing I propose: the function returned from the go statement has the same return values as the function that was set to run in the goroutine. So, in our case, the cancel function has type func() error:

cancel := go longRunningThing()
err := cancel()
fmt.Println(err)

This prints:

cancelled

However, if we waited 10 seconds before cancelling the goroutine (longRunningThing takes 9 seconds), we'd get no error, because the function finished successfully:

cancel := go longRunningThing()
time.Sleep(10 * time.Second)
err := cancel()
fmt.Println(err)

Prints out:

<nil>
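
The returned function can be approximated in Go 1 with a small helper. This is only a sketch: spawn and cancelAfter are hypothetical names, the explicit error checks replace the check keyword, and the step duration is scaled down from 3 seconds so the demo runs quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// step is scaled down from the 3 seconds used in the text.
const step = 20 * time.Millisecond

func anotherLongRunningThing(ctx context.Context) error {
	select {
	case <-time.After(step):
		return nil
	case <-ctx.Done():
		return errors.New("cancelled")
	}
}

// longRunningThing returns the first error from three sequential calls.
func longRunningThing(ctx context.Context) error {
	for i := 0; i < 3; i++ {
		if err := anotherLongRunningThing(ctx); err != nil {
			return err
		}
	}
	return nil
}

// spawn runs f in a goroutine and returns a function that requests
// cancellation and then hands back f's return value. Unlike the proposed
// built-in, this sketch only supports calling the returned function once.
func spawn(f func(context.Context) error) func() error {
	ctx, stop := context.WithCancel(context.Background())
	result := make(chan error, 1) // buffered so f can finish before cancel
	go func() { result <- f(ctx) }()
	return func() error {
		stop()
		return <-result
	}
}

// cancelAfter spawns longRunningThing, waits, then cancels it.
func cancelAfter(wait time.Duration) error {
	cancel := spawn(longRunningThing)
	time.Sleep(wait)
	return cancel()
}

func main() {
	fmt.Println(cancelAfter(0))         // cancel right away
	fmt.Println(cancelAfter(10 * step)) // cancel well after all steps are done
}
```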

And lastly, say we have a function called getGoods which contacts some service, gets some goods back, and sends them on a channel. We only want to wait for the goods for 5 seconds, no more. Here's how we implement a timeout:

goods := make(chan Good)
cancel := go getGoods(goods)

select {
case x := <-goods:
    // enjoy the goods
case <-time.After(5 * time.Second):
    err := cancel()
    return errors.Wrap(err, "timed out")
}
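
For reference, a comparable timeout in Go 1 might look like the sketch below. The Good type, the simulated getGoods, the fetch wrapper and the tick constant are all assumptions for illustration, and fmt.Errorf with %w from the standard library is swapped in for errors.Wrap.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// tick is an arbitrary unit used to keep the demo short.
const tick = 10 * time.Millisecond

// Good is a placeholder for whatever the service returns.
type Good struct{ Name string }

// getGoods simulates a service call that takes delay to answer and gives up
// as soon as ctx is cancelled.
func getGoods(ctx context.Context, delay time.Duration, goods chan<- Good) error {
	select {
	case <-time.After(delay):
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case goods <- Good{Name: "apples"}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// fetch waits at most timeout for the goods, then cancels the goroutine and
// wraps the error it returned.
func fetch(timeout, delay time.Duration) (Good, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	goods := make(chan Good)
	errc := make(chan error, 1) // buffered so the goroutine never leaks
	go func() { errc <- getGoods(ctx, delay, goods) }()

	select {
	case x := <-goods:
		return x, nil
	case <-time.After(timeout):
		cancel()
		err := <-errc
		return Good{}, fmt.Errorf("timed out: %w", err)
	}
}

func main() {
	g, err := fetch(100*tick, tick) // the service answers in time
	fmt.Println(g.Name, err)
	g, err = fetch(tick, 100*tick) // the service is too slow
	fmt.Println(g.Name, err)
}
```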

And that is the end of this series of short examples. I've shown all of the proposed features. In the next section, I'll describe the features more carefully and explain precisely how they'd work.

Details

Other uses of the mechanism
