Understanding Communicating Sequential Processes, (Directed) Acyclic Graphs, and the select Operator
After much discussion with Roger Peppe (@rogpeppe) following my speech at gophercon, I thought it might be best to clarify my thoughts on and understanding of how to avoid deadlocks in go (using CSP, DAG's and select).
Damian Gryski (@dgryski) provided a link to a nice summary of the conditions necessary for a deadlock:
- mutual exclusion
- hold and wait or partial allocation
- no pre-emption
- resource waiting or circular wait
We are going to use the three rules below to prevent condition 4, thus removing the potential for deadlock.
First I'm going to provide some conditions which assure a program can not deadlock:
- You must read from your inputs
- The connection of your processes forms an acyclic graph.
Hoare discusses this in the case of livelock :
A simple method to prove (P>>Q ) is free of livelock is to show that P is left-guarded in the sense that it can never output an infinite series of messages to the right without interspersing inputs from the left.
That is if we only ever send, and never stop to receive, any processes sending to us will be forever blocked. We are forcing upstream processes to wait on a resource which will never become available. Any sending processes will be in deadlock, and the entire system will be in livelock.
// neverRead is an example of a process in CSP which can
// produce a partial deadlock.
func neverRead(input <-chan int, output chan<- int) {
<-input
for {
output <- 1
}
}
Avoiding this is pretty straightforward: don't send and never recieve on your inputs. This is a pretty rare problem and possibly a non issue in your system.
When two concurrent processes communicate with each other by output and input only on a single channel, they cannot deadlock (compare 2.7 L2). As a result, any network of nonstopping processes which is free of cycles cannot deadlock, since an acyclic graph can be decomposed into subgraphs connected only by a single arrow. (Hoare, 125)
This is necessary to eliminate the circular wait problem. If a processes a is sending to another b and vice versa a deadlock occurs. In Hoare's notation that is (a->b||b->a)
.
// circularWait demonstrates a trivial circular wait, where two
// goroutines both attempt to send to each other and no progress can be made.
func circularWait() {
c1 := make(chan int)
c2 := make(chan int)
go func() { c1 <- 2; fmt.Println(<-c2) }()
c2 <- 1
fmt.Println(<-c1)
}
The solution to this is in the title, if the communication of goroutines forms no cycles, no circular wait is possible.
The connection of our goroutines forms a directed graph. However, undirected cycles can still cause a deadlock. Consider the function below.
// orderdWait demonstrates a deadlock wait caused by ordering, in this case
// a goroutine sends on c2 while the other waits on c1. If we where
// acquiring locks, a similar thing might occur when locks are acquired out
// of order.
func orderdWait() {
c1 := make(chan int)
c2 := make(chan int)
go func() { c2 <- 2; c1 <- 1 }()
fmt.Println(<-c1)
fmt.Println(<-c2)
}
The above example itself is an example of processes forming an undirected cycle. Undirected cycles (as demonstrated above) can deadlock. The connection diagram formed is this (notice this is a DAG, but not an AG):
Don't worry though, we can break these directed cycles in three ways:
- Using the select operator
- Using a common output channel
- Using a fan in strategy
If we have multiple inputs and use select we can effectively merge them. If we select against all inputs we satisfy Hoare's note that output and input must be only on a single channel. Consider this code snippet:
// This is an example of a program which uses select to break an
// undirected cycle
func main() {
c1 := make(chan int)
c2 := make(chan int)
c3 := make(chan int)
// Source our inputs
go func() { c2 <- 2; c1 <- 1; close(c1) }()
// Do some operation on multiple inputs
go func() {
for {
select {
case x, ok := <-c1:
if !ok {
close(c3)
return
}
c3 <- x * 2
case x := <-c2:
c3 <- x * 3
}
}
}()
// Sink our outputs
for x := range c3 {
fmt.Println(x)
}
}
This codes connection diagram looks like this:
Another option is just to make sure any sending goroutines "communicate with each other by output and input only on a single channel". If your producers API lets you specify the output channel this is pretty straightforward:
// This is an example of a program which uses a common input
// channel to enforce lack of cycles.
func main() {
c1 := make(chan int)
c2 := make(chan int)
// Source our inputs
go func() { c1 <- 1 }()
go func() { c1 <- 1; close(c1) }()
// Do some operation on our inputs
go func() {
for {
x, ok := <-c1
if !ok {
close(c2)
return
}
c2 <- x * 2
}
}()
// Sink our outputs
for x := range c2 {
fmt.Println(x)
}
}
The connection diagram of this program looks like this:
If our API doesn't allow us to select an output channel, we can use "mergers" and a fan in strategy to produce the desired effect. In this case we start a separate goroutine to receive the output of each channel and repeat it onto a common channel:
func merger(input <-chan int, output chan<- int) {
for x := range input {
output <- x
}
}
// This is an example of a program which uses a fan in strategy
func main() {
c1 := make(chan int)
c2 := make(chan int)
c3 := make(chan int)
c4 := make(chan int)
wg := &sync.WaitGroup{}
// Source our inputs
go func() { c2 <- 2; c1 <- 1; close(c1); close(c2) }()
// Mergers for each input
go func() { wg.Add(1); merger(c1, c3); wg.Done() }()
go func() { wg.Add(1); merger(c2, c3); wg.Done() }()
// Closer for c3
go func() { wg.Wait(); close(c3) }()
// Do some operation on our inputs
go func() {
for x := range c3 {
c4 <- x * 2
}
close(c4)
}()
// Sink our outputs
for x := range c4 {
fmt.Println(x)
}
}
The connection diagram of that looks something like this:
So what strategy should we use? Merging inputs onto a single channel is the best way to do things from a performance perspective if our input types are common. However if our input types are disparate judicious use of select and reflect.Select makes code simpler.
And those two constructs make our programs form acylic graphs, when they would otherwise contain undirected cycles. And thus they remove deadlocks. So if you can try not to do this:
// Avoid the devil and he will flee
aval := <-a
bval := <-b
And if you can't avoid requiring order on your inputs document it in the function header! That way upstream producers know they must generate on separate goroutines or at least respect order.
Superspecial thanks to @rogpeppe for help and comments along the way with this. Also take a look at his awesome use of mergers in this CrossProduct example. Also thanks to Damian Gryski for the dead lock info above.
The (probably) abundant grammar and spelling mistakes are all my own.
Two things:
your definition of livelock doesn't seem right to me. AFAIK (and this seems to be backed by your Hoare reference), livelock is when two parties communicate continually without any progress being made by the system. That's not what you describe here.
Your example http://play.golang.org/p/tQfKDlKPEl is not sufficient by your own criteria because it does not always select on all inputs. When it has read from one input, it stops selecting on all inputs and only selects on the remaining ones. This is sufficient to cause deadlock in some circumstances when the inputs originates from the same goroutine.
I would be inclined to remove axiom 3 and make axiom 2 stronger. Something like:
Note the absence of the word "directed" there.