Skip to content

Instantly share code, notes, and snippets.

@jeremyong
Last active May 21, 2021 20:23
Show Gist options
  • Save jeremyong/6f8c37ef130babf862c8ac5cd444125a to your computer and use it in GitHub Desktop.
Save jeremyong/6f8c37ef130babf862c8ac5cd444125a to your computer and use it in GitHub Desktop.
Crowdsourcing review of lock free initialization/deinitialization routine
// Suppose we have two functions load() and unload() that take a finite amount of time to complete and are not threadsafe.
// They also cannot run at the same time. It is invalid to call load before a pending unload is finished, and it is invalid
// to call unload before a previous call to load has finished. Basically, we are either in a "fully loaded" state, or a
// "fully unloaded" state.
void load();
void unload();
// Furthermore, suppose we have two functions, start() and finish(). The first time we call start, load must
// run to completion before entering the body of start(). After the final call to finish, unload should be called unless
// another thread happens to call load early.
void start();
void finish();
// The question is, how can and implement the prologue of the start function and the epilogue of the finish function to
// perform load and unload properly in a lockfree manner? Here's an initial stab at the solution. Can we prove it's correct?
#include <atomic>
#include <thread>
// Bit layout of the init state atomic
struct load_state_t
{
uint32_t unloading : 1;
uint32_t loaded : 1;
uint32_t ref_count : 30;
};
static std::atomic<uint32_t> load_state = 0;
bool loaded(uint32_t state) { return (state & (1ul << 30)) != 0; }
bool unloading(uint32_t state) { return (state & (1ul << 31)) != 0; }
uint32_t ref_count(uint32_t state) { return (state & ((1ul << 30) - 1)); }
void start() {
uint32_t prior = load_state++;
if (ref_count(prior) == 0) {
// We're responsible for ensuring we're in a "loaded" state
// First, spin until we are no longer unloading if we are
while (unloading(prior)) {
// yield to let threads needed in the unload function to make progress
std::this_thread::yield();
prior = load_state.load(std::memory_order_acquire);
}
// If loaded is false, proceed with loading. The loaded bit could still be set because the finish function may not have called "unload"
// if the ref count incremented immediately after the finish function decremented the ref count to 0.
if (!loaded(prior))
{
load();
// Atomically set the loaded state (preserving current ref count)
// NOTE: notice it's impossible for the ref count here to dip to 0 since this function's caller is still in the body of "start"
load_state |= (1 << 30);
}
} else {
// The prior ref count was > 0 so we just need to wait until the state is loaded
while (!loaded(prior)) {
// yield to let threads needed in the load function to make progress
std::this_thread::yield();
prior = load_state.load(std::memory_order_acquire);
}
}
// Rest of the start function
}
void finish() {
// Body of the finish function
uint32_t prior = load_state--;
if (ref_count(prior) == 1) {
// We're responsible for unloading IFF another thread didn't increment the ref count in the meantime
// Attempt ONCE to set the load_state to the unloading state with a ref count of zero. Failure to do
// so means someone else called start in the meantime
if (load_state.compare_exchange_strong(prior - 1, (1 << 31))) {
unload();
// Unset the unloading state
load_state &= ~(1 << 31);
}
}
}
@jeremyong
Copy link
Author

I believe for itanium or non-memory coherent architectures, lines 57 and 84 can be replaced with std::memory_order_release-ordered fetch_or and fetch_and respectively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment