Showcases some interesting and non-obvious optimizations that compilers can make on and around atomics. In particular, I liked this example: the following code
#include <atomic>

int x = 0;
std::atomic<int> y{0};

int dso() {
    x = 0;
    int z = y.load(std::memory_order_seq_cst);
    y.store(0, std::memory_order_seq_cst);
    x = 1;
    return z;
}
can be optimized to the following:
#include <atomic>

int x = 0;
std::atomic<int> y{0};

int dso() {
    // Dead store eliminated.
    int z = y.load(std::memory_order_seq_cst);
    y.store(0, std::memory_order_seq_cst);
    x = 1;
    return z;
}
The first store to x can be dead-store eliminated because the only way another thread could observe it is by racing: x is a plain non-atomic variable, so any concurrent read of it is a data race, and the compiler may assume no such read exists. Even a thread that synchronizes through y cannot tell the difference. If it reads y and sees 0, it has no way of knowing whether the subsequent store of 1 to x has happened yet, and therefore no way of knowing whether the initial store of 0 to x has already been overwritten. Since no data-race-free execution can distinguish the two versions, and the 0 is unconditionally overwritten by 1, the first store is dead and can be eliminated.