@zgotsch
Created September 11, 2023 20:04
User-space serializable sequential computations in JS/TS

I was thinking some today about long-running computations on function-as-a-service (hereafter FaaS) platforms. FaaS platforms have relatively strict runtime constraints (e.g. 10s on Vercel's unpaid plan, but up to 15m on AWS Lambda, with others usually falling somewhere in between). When a computation is either interruptible (a sequence of less expensive operations) or takes place remotely (e.g. expensive API calls to something like an AI service), the ability to suspend and resume it may make it possible to run on a FaaS platform.

A quick web search told me that what I was looking for is called "serializable first-class continuations" and, unfortunately, JS doesn't have support for first-class continuations, much less serializable ones. However, since I was primarily interested in interrupting and serializing computations at await-points, I thought I might be able to get somewhere by leveraging generators. Unfortunately, generators aren't serializable either, but functions are (mostly! The source can be serialized, but the closure will be lost). So here's what I came up with:

function anExpensiveComputation(): Promise<number> {
  return new Promise((resolve) => setTimeout(() => resolve(42), 1000));
}

function anotherExpensiveComputation(input: number): Promise<number> {
  return new Promise((resolve) => setTimeout(() => resolve(input + 1), 1000));
}

function* sequencedComputation(): Generator<() => Promise<number>, number, number> {
  const a = yield () => anExpensiveComputation();
  const b = yield () => anotherExpensiveComputation(a);
  return b;
}

// Now we can create the computation and run it to completion
let snapshotable = SnapshotableComputation.fromGeneratorFn(sequencedComputation);
let result = await snapshotable.run();
console.log(result); // 43

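// Or we can run it to a certain point, snapshot it, and then resume it later.
// (`delay` and `persist` are assumed helpers, not shown here: sleep for N ms and
// write the snapshot string to durable storage.)
snapshotable = SnapshotableComputation.fromGeneratorFn(sequencedComputation);
const resultP = snapshotable.run(); // started but not awaited here
await delay(1500); // long enough for the first step (1000 ms) but not the second
const snapshot = snapshotable.snapshot();
persist(snapshot);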

// ... later

// the first expensive computation has already finished, so it will not be run again
snapshotable = SnapshotableComputation.deserialize(snapshot);
result = await snapshotable.run();
console.log(result); // 43

This approach has a few limitations:

  1. The closure for the generator function is lost when the snapshot of the computation is created. This means that all referenced functions must be in the global scope. As far as I know, this limitation can't be overcome without codegen (like https://github.com/nokia/ts-serialize-closures).
  2. Values returned by the computations must also be serializable. This means you can't get resources which have some external state, such as file handles. (This is inevitable and desirable, since those resources shouldn't live long enough to be available to a resumed computation.)
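
To make the first limitation concrete, here is a minimal sketch of what happens to a captured variable when a function is round-tripped through its source text. The names `makeAdder`, `tryCall`, and `indirectEval` are made up for this illustration; the indirect `eval` is used so the source is evaluated in the global scope, roughly as it would be in a fresh process.

```typescript
// Hypothetical factory: `step` exists only in the closure of the returned function.
function makeAdder(step: number): (x: number) => number {
  return (x) => x + step;
}

const original = makeAdder(5);

// Function.prototype.toString() captures the source text only, not the captured variables.
const source = original.toString();

// Calling eval through an alias makes it an indirect eval, which runs in the global scope.
const indirectEval = eval;
const revived = indirectEval(`(${source})`) as (x: number) => number;

// Call `fn` and report either its result or the name of the error it threw.
function tryCall(fn: (x: number) => number, x: number): string {
  try {
    return String(fn(x));
  } catch (e) {
    return (e as Error).name;
  }
}

console.log(tryCall(original, 1)); // 6
console.log(tryCall(revived, 1)); // ReferenceError (assuming no global named `step`)
```

The revived function still parses and can be called, but the binding for `step` is gone, which is why everything the generator function references has to be reachable from the global scope.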

Additionally, for the use case I described (running computations on FaaS), you'd need some additional machinery: a way to uniquely identify computations, and a runtime which can dispatch the result of external async computations either to already-running or suspended computations which are awaiting that result.
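
As a rough sketch of what that machinery might look like, the snippet below keys recorded results by a computation id and replays them whenever a new external result is delivered. Everything here is an assumption made for illustration: the in-memory `store` stands in for durable storage, `advance` and `deliver` are invented names, and the generator yields a label for the work it is waiting on rather than a thunk, which is a simplification of the approach above.

```typescript
// Hypothetical: each step yields a label describing the external work it is waiting on.
function* pipeline(): Generator<string, number, number> {
  const a = yield "fetch-a";
  const b = yield "fetch-b";
  return a + b;
}

// Stand-in for durable storage: computation id -> results delivered so far.
const store = new Map<string, number[]>();

type JobStatus = { done: true; value: number } | { done: false; waitingOn: string };

// Replay the recorded results into a fresh generator, then report where it stands.
function advance(id: string): JobStatus {
  const results = store.get(id) ?? [];
  const gen = pipeline();
  let step = gen.next();
  for (const r of results) {
    if (step.done) break;
    step = gen.next(r);
  }
  return step.done
    ? { done: true, value: step.value }
    : { done: false, waitingOn: step.value };
}

// Called (for example, by a webhook handler) when an external result arrives.
function deliver(id: string, value: number): JobStatus {
  store.set(id, [...(store.get(id) ?? []), value]);
  return advance(id);
}

const jobId = "job-1"; // in practice a UUID or similar
console.log(advance(jobId)); // { done: false, waitingOn: 'fetch-a' }
console.log(deliver(jobId, 40)); // { done: false, waitingOn: 'fetch-b' }
console.log(deliver(jobId, 2)); // { done: true, value: 42 }
```

Each call to `deliver` could run in a separate function invocation, since the only state carried between calls is the stored list of results. A real runtime would also have to handle concurrent deliveries and results that arrive out of order, which this sketch ignores.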

The serializability and closure-loss limitations might preclude this approach's usefulness, but it might be possible to find workarounds or bounded applications. Further ideas for exploration/improvement mostly center on codegen/macros, adding idempotent, retriable operations (to enable external resource acquisition in an idempotent way), and thinking about what it would take to get serializable generators or closures in JS runtime environments.

Further reading:

// `invariant` is not defined in the original listing; this minimal stand-in is an assumption.
function invariant(condition: unknown, message: string): asserts condition {
  if (!condition) {
    throw new Error(message);
  }
}

export default class SnapshotableComputation<TArgs extends unknown[], TState, TReturn> {
  private readonly generatorFn: (
    ...args: TArgs
  ) => Generator<() => Promise<TState>, TReturn, TState>;
  // Results of the thunks that have completed so far, in order.
  private readonly results: TState[];
  private halted: boolean = false;

  // Rejection value of run() when the computation was halted.
  public static readonly HALTED: unique symbol = Symbol("HALTED");

  private constructor(
    generatorFn: (...args: TArgs) => Generator<() => Promise<TState>, TReturn, TState>,
    results: ReadonlyArray<TState>
  ) {
    this.generatorFn = generatorFn;
    this.results = [...results];
  }

  static fromGeneratorFn<TArgs extends unknown[], TState, TReturn>(
    generatorFn: (...args: TArgs) => Generator<() => Promise<TState>, TReturn, TState>
  ): SnapshotableComputation<TArgs, TState, TReturn> {
    return new SnapshotableComputation(generatorFn, []);
  }

  static deserialize<TArgs extends unknown[], TState, TReturn>(
    serialized: string
  ): SnapshotableComputation<TArgs, TState, TReturn> {
    const {generatorFn: generatorFnSource, results} = JSON.parse(serialized);
    // eval executes whatever source the snapshot contains, so only deserialize trusted
    // snapshots. The generator's original closure is not restored.
    const generatorFn = eval(`(${generatorFnSource})`);
    return new SnapshotableComputation(generatorFn, results);
  }

  snapshot(): string {
    return JSON.stringify({
      generatorFn: this.generatorFn.toString(),
      results: this.results,
    });
  }

  // Ask a running computation to stop after the thunk it is currently awaiting.
  halt(): void {
    this.halted = true;
  }

  async run(...args: TArgs): Promise<TReturn> {
    const generator = this.generatorFn(...args);
    let result = generator.next();
    // Replay recorded results so the generator skips work that already finished.
    for (const input of this.results) {
      invariant(!result.done, "generator should not be done while there are still inputs");
      result = generator.next(input);
    }
    // Run the remaining thunks one at a time, recording each result.
    while (!result.done) {
      const thunk = result.value;
      const value = await thunk();
      if (this.halted) {
        // The value from the in-flight thunk is discarded, so that step reruns on resume.
        return Promise.reject(SnapshotableComputation.HALTED);
      }
      this.results.push(value);
      result = generator.next(value);
    }
    return result.value;
  }
}