@jhaberstro
Created May 22, 2016 19:23
"N4455 No Sane Compiler Would Optimize Atomics" notes

The paper showcases some interesting, non-obvious optimizations that compilers can make on and around atomics. I particularly liked this example. The following code

#include <atomic>

int x = 0;
std::atomic<int> y;
int dso() {
  x = 0;
  int z = y.load(std::memory_order_seq_cst);
  y.store(0, std::memory_order_seq_cst);
  x = 1;
  return z;
}

can be optimized to the following:

#include <atomic>

int x = 0;
std::atomic<int> y;
int dso() {
  // Dead store eliminated.
  int z = y.load(std::memory_order_seq_cst);
  y.store(0, std::memory_order_seq_cst);
  x = 1;
  return z;
}

The first store to x can be eliminated as a dead store because another thread could only observe it through a data race. Suppose the other thread reads y and sees 0. Even if that value came from the store in dso(), nothing orders the thread's subsequent read of x before the later store of 1 to x, so it cannot know whether the initial 0 has already been overwritten. Any read of x that could tell the two versions apart therefore races with x = 1, which is undefined behavior. Since no race-free program can observe the first store, and it is always overwritten by x = 1, the compiler is free to remove it.
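To make the argument concrete, here is a minimal sketch of a race-free observer. It is not from the paper: the helper name run_and_observe and the starting value 42 for y are assumptions chosen for illustration. The observer reads x only after joining the thread that ran dso(), so the read is ordered after x = 1 and can only ever see 1, whether or not the compiler kept the first store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

int x = 0;
std::atomic<int> y{42};  // assumed non-zero start so the loaded value is recognizable

int dso() {
  x = 0;  // the store the compiler is allowed to drop
  int z = y.load(std::memory_order_seq_cst);
  y.store(0, std::memory_order_seq_cst);
  x = 1;
  return z;
}

// Hypothetical helper: runs dso() on a worker thread and reads x only after
// join(). Thread completion synchronizes with join(), so this read of x
// happens after x = 1 and involves no data race.
std::pair<int, int> run_and_observe() {
  int z = 0;
  std::thread worker([&z] { z = dso(); });
  worker.join();
  assert(x == 1);  // the intermediate x = 0 is never visible here
  return {z, x};
}
```

Calling run_and_observe() once from main yields the value loaded from y and the final value of x. An observer that instead read x after merely seeing y == 0, with no join, would race with x = 1, which is exactly the case the optimization is allowed to ignore.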
