dherman/shared-array-buffer.md

## shared-array-buffer.md

      
Goal

Typed arrays can be copied or transferred between workers, but it's not possible for multiple workers to work with a buffer in parallel without copies. This document describes two ways to improve this without introducing data races: transferring read-write access to disjoint regions of buffers, and transferring read-only access to shared buffers/regions.
This variant of the API enables fine-grained borrowing and sharing, where a single ArrayBuffer can have multiple disjoint regions parceled out. This way individual workers can work with their regions at their original indices. This makes the API more amenable to being a compilation target.
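
For reference, the copy-or-transfer limitation described above can be observed with the standard structured clone algorithm. The sketch below is not part of the proposal. It uses `structuredClone` as a stand-in for `postMessage` (both follow the same cloning and transfer rules) and assumes a runtime that provides it, such as a recent browser or Node.js 17+.

```javascript
// A plain ArrayBuffer holding four bytes.
const original = new Uint8Array([1, 2, 3, 4]).buffer;

// Copying: the clone gets its own memory, so writes are not shared.
const copy = structuredClone(original);
new Uint8Array(copy)[0] = 99;
const firstByteOfOriginal = new Uint8Array(original)[0]; // still 1

// Transferring: the memory moves, and the sender's buffer is detached.
const moved = structuredClone(original, { transfer: [original] });
const sizeAfterTransfer = original.byteLength; // 0, the sender lost access
```

Either the receiver works on a private copy, or the sender gives up the whole buffer. Neither option lets two workers operate on the same memory at once.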
Example

Here is an example that demonstrates sharing a read-only segment and multiple read-write segments with four separate workers.
Main thread

The main thread allocates the shared buffer, splits out some regions, and shares them with different workers.
// will be divided into a shared half and four writable regions
var buffer = new SharedArrayBuffer(32768);

var workers = Array.from({ length: 4 }, function() {
  return new Worker('work.js');
});

// borrowed regions are single-ownership
var borrowed = [buffer.borrow(16384, 20480),
                buffer.borrow(20480, 24576),
                buffer.borrow(24576, 28672),
                buffer.borrow(28672, 32768)];

workers.forEach(function(worker, i) {
  // request another read-only access token
  var readOnly = buffer.freeze(0, 16384);

  // copy the buffer handle (which is atomically refcounted) and transfer the access tokens
  worker.postMessage([buffer, readOnly, borrowed[i]], [readOnly, borrowed[i]]);

  try {
    // error: can't use transferred access token
    readOnly.attach(buffer);
  } catch (e) { }
});

workers.forEach(function(worker) {
  worker.onmessage = function(event) {
    // receive the access tokens sent back from the worker
    var [readOnly, borrowed] = event.data;

    // absorb the region tokens back into the owning buffer
    buffer.release(readOnly);
    buffer.release(borrowed);

    try {
      // error: can't use released access token
      readOnly.attach(buffer);
    } catch (e) { }
  };
});
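
The four hard-coded ranges in the main-thread example could also be computed. The sketch below uses a hypothetical helper, `partition`, that is not part of the proposal. It assumes that `borrow(start, end)` takes a half-open range `[start, end)`, which the adjacent, non-overlapping ranges in the example suggest but the document does not state.

```javascript
// Hypothetical helper: split [start, end) into `count` equal, disjoint ranges.
function partition(start, end, count) {
  const size = (end - start) / count;
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("range does not divide evenly into " + count + " parts");
  }
  const ranges = [];
  for (let i = 0; i < count; i++) {
    ranges.push([start + i * size, start + (i + 1) * size]);
  }
  return ranges;
}

// The upper half of a 32768-byte buffer, divided among four workers.
const ranges = partition(16384, 32768, 4);
// [[16384, 20480], [20480, 24576], [24576, 28672], [28672, 32768]]
```

Each pair could then be handed to the proposed `buffer.borrow`, one region per worker.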
Workers

Each worker receives the buffer and some region access tokens, attaches those regions to the buffer, and does some work. When it's done, it detaches the region access tokens and returns them to the main thread.
function doStuffWith(buffer) {
  // read from read-only range
  // read/write to borrowed range
  // ...
}

self.onmessage = function(event) {
  var [buffer, readOnly, borrowed] = event.data;

  try {
    // error: can't borrow from range that is not currently owned
    buffer.borrow(0, 4096);
  } catch (e) { }

  try {
    // attach the regions to the buffer
    borrowed.attach(buffer);
    readOnly.attach(buffer);

    doStuffWith(buffer);
  } finally {
    // detach the regions from the buffer
    borrowed.detach();
    readOnly.detach();

    // return the regions back to the main thread
    postMessage([readOnly, borrowed], [readOnly, borrowed]);
  }
};
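
The attach, work, detach sequence in the worker can be factored into a small helper. This is only a sketch: `withAttached` and `fakeRegion` are hypothetical names, and the stand-in regions merely record calls so the control flow can be run without the proposed API. Unlike the worker code above, the helper detaches only the regions that were actually attached, which matters if an `attach` call throws partway through.

```javascript
// Hypothetical helper: attach every region, run the callback, always detach.
function withAttached(buffer, regions, work) {
  const attached = [];
  try {
    for (const region of regions) {
      region.attach(buffer);
      attached.push(region);
    }
    return work(buffer);
  } finally {
    // Detach only what was actually attached, in reverse order.
    for (const region of attached.reverse()) {
      region.detach();
    }
  }
}

// Stand-in for a region access token; it only logs what happens to it.
function fakeRegion(name, log) {
  return {
    attach() { log.push("attach " + name); },
    detach() { log.push("detach " + name); }
  };
}

const log = [];
const regions = [fakeRegion("borrowed", log), fakeRegion("readOnly", log)];
let caught = null;
try {
  withAttached({}, regions, function() { throw new Error("work failed"); });
} catch (e) {
  caught = e.message; // the error still propagates after the regions are detached
}
```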
Design Overview

Some of the high points:

- Every byte cell in a shared buffer has either read-write access, read-only access, or no access.
- Borrowing a range removes access from the buffer and instills that access in a separate Region object.
- Sharing a range (freeze in the example above) shares read-only access from the buffer with a separate Region object.
- "Attaching" a Region object bestows its access onto the target buffer.
- "Detaching" a Region object revokes its access from the target buffer.
- Region objects must be communicated by transfer so that their access cannot be duplicated.
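
The rules above can be made concrete with a toy, single-threaded model. This is a sketch for reasoning about the access states only, not the proposed implementation. `ModelBuffer`, `handle`, `check`, `read`, and `write` are hypothetical names; per-byte bookkeeping is used purely for simplicity; and the choice that the owner keeps read access to a frozen range is an assumption the document does not spell out.

```javascript
// Toy model of the access rules; only borrow/freeze/release/attach/detach
// come from the example, everything else is hypothetical.
const NONE = 0, READ = 1, WRITE = 2;

class Region {
  constructor(start, end, mode) {
    this.start = start;
    this.end = end;
    this.mode = mode;    // READ or WRITE
    this.target = null;  // handle this region is attached to, if any
  }
  attach(handle) {
    if (this.target) throw new Error("region is already attached");
    handle.access.fill(this.mode, this.start, this.end);
    this.target = handle;
  }
  detach() {
    if (!this.target) throw new Error("region is not attached");
    this.target.access.fill(NONE, this.start, this.end);
    this.target = null;
  }
}

class ModelBuffer {
  constructor(bytes, access) {
    this.bytes = bytes;                           // memory shared by all handles
    this.access = access;                         // per-byte access of this handle
    this.shares = new Uint16Array(bytes.length);  // outstanding read-only tokens
  }
  static create(size) {
    return new ModelBuffer(new Uint8Array(size), new Uint8Array(size).fill(WRITE));
  }
  // What a worker would receive: the same memory, with no access yet.
  handle() {
    return new ModelBuffer(this.bytes, new Uint8Array(this.bytes.length));
  }
  check(mode, start, end) {
    for (let i = start; i < end; i++) {
      if (this.access[i] < mode) throw new Error("insufficient access at byte " + i);
    }
  }
  borrow(start, end) {
    this.check(WRITE, start, end);
    this.access.fill(NONE, start, end);  // the owner gives up all access
    return new Region(start, end, WRITE);
  }
  freeze(start, end) {
    this.check(READ, start, end);
    for (let i = start; i < end; i++) this.shares[i]++;
    this.access.fill(READ, start, end);  // the owner is downgraded to read-only
    return new Region(start, end, READ);
  }
  release(region) {
    if (region.target) throw new Error("detach the region before releasing it");
    for (let i = region.start; i < region.end; i++) {
      if (region.mode === READ && --this.shares[i] > 0) continue; // still shared
      this.access[i] = WRITE;
    }
  }
  read(i) { this.check(READ, i, i + 1); return this.bytes[i]; }
  write(i, value) { this.check(WRITE, i, i + 1); this.bytes[i] = value; }
}

const main = ModelBuffer.create(32768);
const worker = main.handle();
const borrowed = main.borrow(16384, 20480);
const readOnly = main.freeze(0, 16384);

borrowed.attach(worker);
readOnly.attach(worker);
worker.write(16384, 7);       // allowed: the worker holds read-write access
const seen = worker.read(0);  // allowed: read-only access

borrowed.detach();
readOnly.detach();
main.release(borrowed);
main.release(readOnly);
```

In this model a write through `main` to a borrowed range, or a write through `worker` to the frozen range, throws, which is the property that rules out data races.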

Recursive sub-division

As long as a buffer currently has access to a range, it can borrow or share regions within any sub-range of it. While it has regions checked out, it can't relinquish its own access to the ranges those regions cover.
This means that it's possible to do recursive subdivision simply by checking out sub-regions of a buffer and sending them to subsequent workers.
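
The bookkeeping this implies can be sketched with a small tree of checked-out ranges. `Lease` is a hypothetical name used only for this illustration. It models the rule that a holder cannot hand its range back while sub-regions are still outstanding, and it assumes half-open `[start, end)` ranges.

```javascript
// Hypothetical model: a lease on [start, end) that can be subdivided further.
class Lease {
  constructor(start, end, parent = null) {
    this.start = start;
    this.end = end;
    this.parent = parent;
    this.outstanding = new Set(); // sub-leases currently checked out
  }
  // Check out a sub-range; it must lie inside this lease and not overlap
  // any sub-range that is already checked out.
  borrow(start, end) {
    if (start < this.start || end > this.end || start >= end) {
      throw new RangeError("sub-range lies outside this lease");
    }
    for (const child of this.outstanding) {
      if (start < child.end && child.start < end) {
        throw new Error("sub-range overlaps an outstanding sub-region");
      }
    }
    const child = new Lease(start, end, this);
    this.outstanding.add(child);
    return child;
  }
  // Hand the range back to the parent; refused while sub-regions are out.
  release() {
    if (this.outstanding.size > 0) {
      throw new Error("sub-regions are still checked out");
    }
    if (this.parent) this.parent.outstanding.delete(this);
  }
}

const root = new Lease(0, 32768);
const half = root.borrow(16384, 32768);     // first worker gets the upper half
const quarter = half.borrow(16384, 24576);  // and passes part of it on

let blocked = false;
try {
  half.release();  // refused: `quarter` is still outstanding
} catch (e) {
  blocked = true;
}
quarter.release();
half.release();    // now succeeds
```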
Other Considerations


- Regions should be restricted to being allocated in sizes and on boundaries of some reasonably conservative multiple, probably at least 4 KB. We need to figure out what a good number is here.
- The separate SharedArrayBuffer type allows for the additional methods not to pollute non-shared buffers, and allows for a different performance model and implementation strategy than sequential buffers. However, it's conceivable that we could make regular ArrayBuffers shareable.
- We may need a better name than "region."
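
The granularity restriction mentioned above might be checked along these lines. This is a sketch only: `GRANULE`, `isAligned`, and `alignOutward` are hypothetical names, and 4096 is simply the lower bound floated above, since the exact value is left open.

```javascript
// Assumed granularity; the document leaves the exact value undecided.
const GRANULE = 4096;

// True if [start, end) is non-empty and both bounds fall on granule boundaries.
function isAligned(start, end, granule = GRANULE) {
  return start < end && start % granule === 0 && end % granule === 0;
}

// Widen [start, end) to the nearest enclosing granule boundaries.
function alignOutward(start, end, granule = GRANULE) {
  return [Math.floor(start / granule) * granule, Math.ceil(end / granule) * granule];
}

const ok = isAligned(16384, 20480);       // true: both are multiples of 4096
const widened = alignOutward(100, 5000);  // [0, 8192]
```

All of the ranges in the example above are multiples of 4096, so they would pass such a check.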