domenic/README.md

## README.md

      
### Generic zero-copy ArrayBuffer usage

Most APIs which accept binary data need to ensure that the data is not modified while they read from it. (Without loss of generality, let's only analyze ArrayBuffer instances for now.) Modifications can come about due to the API processing the data asynchronously, or due to the API processing the data on some other thread which runs in parallel to the main thread. (E.g., an OS API which reads from the provided ArrayBuffer and writes it to a file.)
On the web platform, APIs generally solve this by immediately making a copy of the incoming data. The code is essentially:
```javascript
function someAPI(arrayBuffer) {
  arrayBuffer = arrayBuffer.slice(); // make a copy

  // Now we can use arrayBuffer, async or in another thread,
  // with a guarantee nobody will modify its contents.
}
```
But this is slower than it could be, and uses twice as much memory. Can we do a zero-copy version?
One solution is for such APIs to transfer the input:
```javascript
function someAPI(arrayBuffer) {
  arrayBuffer = arrayBuffer.transfer(); // take ownership of the backing memory

  // Now we can use arrayBuffer, async or in another thread,
  // with a guarantee nobody will modify its contents.
}
```
But this can be frustrating for callers, who don't know which APIs will do this, and thus don't know whether passing in an ArrayBuffer to an API will give up their own ownership of it.
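The cost to the caller is easy to observe. The following is a minimal sketch comparing the two strategies from the caller's side. `copyingAPI`, `transferringAPI` and `transferBuffer` are hypothetical names, and the `structuredClone` fallback is only there so the snippet also runs on engines without `ArrayBuffer.prototype.transfer`.

```javascript
// Hypothetical helper: move the backing memory into a new ArrayBuffer,
// detaching the original.
function transferBuffer(ab) {
  return typeof ab.transfer === "function"
    ? ab.transfer()
    : structuredClone(ab, { transfer: [ab] });
}

// Stand-in for an API that copies its input.
function copyingAPI(arrayBuffer) {
  return arrayBuffer.slice();
}

// Stand-in for an API that silently takes ownership of its input.
function transferringAPI(arrayBuffer) {
  return transferBuffer(arrayBuffer);
}

const kept = new ArrayBuffer(1024);
copyingAPI(kept);
console.log(kept.byteLength); // 1024: the caller can keep using it

const lost = new ArrayBuffer(1024);
const owned = transferringAPI(lost);
console.log(lost.byteLength); // 0: the caller's buffer is now detached
console.log(owned.byteLength); // 1024: the API owns the memory
```

Nothing at the call site distinguishes the two calls, which is exactly the problem for callers.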
This gist explores a solution which has the following properties:

- It requires the caller to do a one-time transfer of the ArrayBuffer to the callee, via explicit call-site opt-in.
- Callees do need to do a small amount of work to take advantage of this, but the code to do that work is generic and could be generated automatically. (E.g. by Web IDL bindings, on the web.)

In this world, the default path, where you just call someAPI(arrayBuffer), still does a copy. This means the caller doesn't have to worry about whether they're allowed to continue using arrayBuffer or not. I think this is the right default given how the ecosystem has grown so far.
#### What it looks like in practice

```javascript
function someAPI(arrayBuffer) {
  // This line could be code-generated generically for all ArrayBuffer-taking APIs.
  arrayBuffer = ArrayBufferTaker.takeOrCopy(arrayBuffer);

  // Nobody else can modify arrayBuffer. Do stuff with it, possibly asynchronously
  // or in native code that reads from it in other threads.
}

const arrayBuffer = new ArrayBuffer(1024);
someAPI(arrayBuffer); // copies

const arrayBuffer2 = new ArrayBuffer(1024);
someAPI(new ArrayBufferTaker(arrayBuffer2)); // transfers
```
The implementation of ArrayBufferTaker can be done today, and is in the attached file.
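From the caller's side, the difference between the two calls shows up in the state of the buffer afterwards. The sketch below inlines a compact version of `ArrayBufferTaker` so that it is self-contained. It assumes the `structuredClone` variant of the transfer step, and `someAPI` returns its buffer purely so the result can be inspected; a real API would not need to.

```javascript
class ArrayBufferTaker {
  #ab;
  constructor(ab) {
    // Assumption: structuredClone-based transfer, so this runs without ab.transfer().
    this.#ab = structuredClone(ab, { transfer: [ab] });
  }
  take() {
    const ab = this.#ab;
    if (!ab) throw new TypeError("Cannot take twice");
    this.#ab = null;
    return ab;
  }
  static takeOrCopy(abOrTaker) {
    return #ab in abOrTaker ? abOrTaker.take() : abOrTaker.slice();
  }
}

function someAPI(arrayBuffer) {
  arrayBuffer = ArrayBufferTaker.takeOrCopy(arrayBuffer);
  return arrayBuffer; // returned only so the example can inspect it
}

// Default path: the API works on a copy, so later writes by the caller are invisible to it.
const arrayBuffer = new ArrayBuffer(1024);
new Uint8Array(arrayBuffer)[0] = 42;
const copied = someAPI(arrayBuffer);
new Uint8Array(arrayBuffer)[0] = 7;
console.log(arrayBuffer.byteLength); // 1024
console.log(new Uint8Array(copied)[0]); // 42

// Opt-in path: the memory moves to the API and the caller's buffer is detached.
const arrayBuffer2 = new ArrayBuffer(1024);
const moved = someAPI(new ArrayBufferTaker(arrayBuffer2));
console.log(arrayBuffer2.byteLength); // 0
console.log(moved.byteLength); // 1024
```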
#### Open questions


- How to make this work ergonomically for cases where someAPI takes a typed array or DataView?
- Probably arrayBuffer2.take() or some better-named method would be more ergonomic than new ArrayBufferTaker(arrayBuffer2).
- Probably in general we should come up with better names. This is an important paradigm and using the right names and analogies is key.
- Can we let someAPI release the memory back to the caller? That would require language support.
- How does this interact with SharedArrayBuffers, resizable ArrayBuffers, and growable SharedArrayBuffers?
  - Probably this is just not applicable to SharedArrayBuffer cases. Those are explicitly racy.
  - Maybe it just works for resizable ArrayBuffers?
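
As one possible direction for the typed array and DataView question, the same two-step pattern could wrap a view instead of a buffer. This is a speculative sketch, not part of the attached file: `ViewTaker`, `transferBuffer` and `viewLength` are hypothetical names, and the design choice of copying only the bytes visible through the view is an assumption.

```javascript
// Hypothetical helper: transfer with a structuredClone fallback.
function transferBuffer(ab) {
  return typeof ab.transfer === "function"
    ? ab.transfer()
    : structuredClone(ab, { transfer: [ab] });
}

// DataView is sized in bytes, typed arrays in elements.
function viewLength(view) {
  return view instanceof DataView ? view.byteLength : view.length;
}

class ViewTaker {
  #view;

  constructor(view) {
    // Read the geometry first: detaching the buffer zeroes it out on `view`.
    const { byteOffset } = view;
    const length = viewLength(view);
    const buffer = transferBuffer(view.buffer);
    this.#view = new view.constructor(buffer, byteOffset, length);
  }

  take() {
    const view = this.#view;
    if (!view) throw new TypeError("Cannot take twice");
    this.#view = null;
    return view;
  }

  static takeOrCopy(viewOrTaker) {
    if (#view in viewOrTaker) return viewOrTaker.take();
    // Copy only the bytes visible through the view.
    const { buffer, byteOffset, byteLength } = viewOrTaker;
    const copy = buffer.slice(byteOffset, byteOffset + byteLength);
    return new viewOrTaker.constructor(copy, 0, viewLength(viewOrTaker));
  }
}

const bytes = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]);
const sub = new Uint16Array(bytes.buffer, 2, 2); // covers the bytes 3, 4, 5, 6

const copied = ViewTaker.takeOrCopy(sub); // copies 4 bytes
const taken = ViewTaker.takeOrCopy(new ViewTaker(sub)); // transfers all 8 bytes
```

Note that transferring detaches the whole underlying buffer, so every other view onto it (here `bytes`) becomes unusable as well. That side effect may be part of why this case needs more thought.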


#### Acknowledgments

Thanks to @jasnell for inspiring this line of thought via whatwg/fetch#1560. Thanks to the members of the "TC39 General" Matrix channel for a conversation that spawned this idea, especially @mhofman who provided the key insight: a two-step create-taker then take procedure, instead of attempting to do this in one step.

  
## zero-copy.mjs
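```javascript
class ArrayBufferTaker {
  // Holds the transferred buffer until take() hands it out; null afterwards.
  #ab;

  constructor(ab) {
    // Using https://github.com/tc39/proposal-arraybuffer-transfer
    // This detaches the caller's `ab` immediately and keeps the new owner private.
    this.#ab = ab.transfer();

    // Or if you want something that works today:
    // this.#ab = structuredClone(ab, { transfer: [ab] });
  }

  // Hands the buffer to exactly one consumer.
  take() {
    const ab = this.#ab;
    if (!ab) {
      throw new TypeError("Cannot take twice");
    }
    this.#ab = null;
    return ab;
  }

  // Generic entry point for APIs: takes from a taker (zero-copy),
  // otherwise copies a plain ArrayBuffer.
  static takeOrCopy(abOrTaker) {
    // `#ab in` is a brand check: true only for real ArrayBufferTaker instances.
    if (#ab in abOrTaker) {
      return abOrTaker.take();
    }
    return abOrTaker.slice();
  }
}
```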
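
Two properties of this class are worth seeing in action: the caller's buffer is detached when the taker is constructed, not when the API takes it, and a taker can be consumed only once. The following sketch assumes the `structuredClone` variant of the constructor so that it runs without `ArrayBuffer.prototype.transfer`.

```javascript
class ArrayBufferTaker {
  #ab;
  constructor(ab) {
    // Assumption: the structuredClone variant of the transfer step.
    this.#ab = structuredClone(ab, { transfer: [ab] });
  }
  take() {
    const ab = this.#ab;
    if (!ab) throw new TypeError("Cannot take twice");
    this.#ab = null;
    return ab;
  }
  static takeOrCopy(abOrTaker) {
    return #ab in abOrTaker ? abOrTaker.take() : abOrTaker.slice();
  }
}

const ab = new ArrayBuffer(1024);
const taker = new ArrayBufferTaker(ab);
console.log(ab.byteLength); // 0: detached before any API has seen the taker

const first = ArrayBufferTaker.takeOrCopy(taker);
console.log(first.byteLength); // 1024

// Passing the same taker to a second API fails loudly instead of sharing memory.
let secondError = null;
try {
  ArrayBufferTaker.takeOrCopy(taker);
} catch (e) {
  secondError = e;
}
console.log(secondError instanceof TypeError); // true
```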