How can one understand and predict the performance of multicore algorithms?
As a first step, I'd recommend acquiring a basic understanding of the MSI protocol. Modern systems use somewhat more complex protocols, but a basic understanding of MSI should be enough to predict the most important phenomena.
Below is an even further simplified model of shared memory based on the MSI protocol. While the model is not precise, and makes several simplifications not based on actual hardware, it should be accurate enough to predict many