
@nestarz
Last active December 5, 2023 19:35
NDJSON File Streaming using the browser Streams API + Loading Progress
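// fetchNDJSON.js

// Parses each line as JSON, skipping blank lines (e.g. after a trailing newline)
const parseJSON = () =>
  new TransformStream({
    transform(chunk, controller) {
      if (chunk.trim()) controller.enqueue(JSON.parse(chunk));
    }
  });

// Re-chunks a text stream so that each output chunk is one `splitOn`-delimited part
const splitStream = splitOn => {
  let buffer = "";
  return new TransformStream({
    transform(chunk, controller) {
      buffer += chunk;
      const parts = buffer.split(splitOn);
      parts.slice(0, -1).forEach(part => controller.enqueue(part));
      // Keep the last part: it may be an incomplete line
      buffer = parts[parts.length - 1];
    },
    flush(controller) {
      if (buffer) controller.enqueue(buffer);
    }
  });
};

// Passes raw bytes through unchanged and reports the size of each chunk
const countBytes = onBytes =>
  new TransformStream({
    transform(chunk, controller) {
      onBytes(chunk.byteLength);
      controller.enqueue(chunk);
    }
  });

// `onBytes` is optional; it receives the byte size of every network chunk
const fetchNDJSON = (url, onBytes = () => {}) =>
  fetch(url).then(response => ({
    response,
    reader: response.body
      .pipeThrough(countBytes(onBytes))
      .pipeThrough(new TextDecoderStream())
      // Needed to stream by line and then JSON parse the line
      .pipeThrough(splitStream("\n"))
      .pipeThrough(parseJSON())
      .getReader()
  }));

export { fetchNDJSON };

// Usage
(async () => {
  const { fetchNDJSON } = await import("/fetchNDJSON.js");
  const results = [];
  let progress = 0;
  let received = 0;
  // `file` is the URL of the NDJSON resource, defined elsewhere
  const { response, reader } = await fetchNDJSON(file, bytes => {
    received += bytes;
  });
  // Content-Length can be missing, and for compressed responses it is the
  // compressed size, so treat `progress` as an estimate
  const length = Number(response.headers.get("Content-Length"));
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // `value` is one parsed JSON object (one line of the file)
    results.push(value);
    if (length) progress = received / length;
  }
})();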
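
The usage above derives progress by dividing received bytes by the Content-Length header. That header can be absent or unparsable, so a guarded version of the division may help. A minimal sketch, where `progressFraction` is a hypothetical helper that is not part of the gist:

```javascript
// Hypothetical helper: turns a byte count and a Content-Length header value
// into a fraction between 0 and 1, or null when no usable total is known.
function progressFraction(receivedBytes, contentLengthHeader) {
  const total = Number(contentLengthHeader);
  // A missing header gives Number(null) === 0; a malformed one gives NaN
  if (!Number.isFinite(total) || total <= 0) return null;
  // Clamp: decompressed bytes can exceed a compressed Content-Length
  return Math.min(receivedBytes / total, 1);
}
```

Returning null rather than NaN lets the caller fall back to an indeterminate progress indicator.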

slevy85 commented Sep 5, 2022

Note: this does not work in Firefox, because Firefox does not support TextDecoderStream: https://developer.mozilla.org/en-US/docs/Web/API/TextDecoderStream#browser_compatibility
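
Where TextDecoderStream may be missing, one option is to detect it at runtime and otherwise wrap a plain TextDecoder in a TransformStream. A minimal sketch, assuming TransformStream itself is available; `makeTextDecoderStream` is a hypothetical name, not something the gist defines:

```javascript
// Hypothetical helper: returns a stream that decodes UTF-8 bytes into strings.
function makeTextDecoderStream() {
  if (typeof TextDecoderStream === "function") {
    return new TextDecoderStream();
  }
  // Fallback: wrap a plain TextDecoder in a TransformStream
  const decoder = new TextDecoder();
  return new TransformStream({
    transform(chunk, controller) {
      // `stream: true` holds back an incomplete multi-byte character
      const text = decoder.decode(chunk, { stream: true });
      if (text) controller.enqueue(text);
    },
    flush(controller) {
      // Emit whatever the decoder is still holding at end of input
      const rest = decoder.decode();
      if (rest) controller.enqueue(rest);
    }
  });
}
```

The result could stand in for `new TextDecoderStream()` in a `pipeThrough` call.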


nestarz commented Sep 5, 2022


slevy85 commented Sep 5, 2022

Here is a version using only TextDecoder:

const splitOn = "\n";

const newNdJsonStream = () => {
  let buffer = "";
  const decoder = new TextDecoder();

  return new TransformStream<Uint8Array, object>({
    transform(chunk, controller) {
      // we use `stream: true` in case the chunk ends in the middle of a character
      buffer += decoder.decode(chunk, { stream: true });

      const parts = buffer.split(splitOn);

      // Save the last part because its line may not be complete yet
      buffer = parts.pop() ?? "";

      for (const part of parts) {
        // Skip blank lines, which JSON.parse would reject
        if (part.trim()) controller.enqueue(JSON.parse(part) as object);
      }
    },

    flush(controller) {
      // Flush any bytes the decoder is still holding
      buffer += decoder.decode();
      if (buffer.trim()) controller.enqueue(JSON.parse(buffer) as object);
    },
  });
};

export const fetchNdJson = (url: string) =>
  fetch(url).then((response) => {
    // `body` is typed as nullable, so check it before piping
    if (!response.body) throw new Error("Response has no body");
    return {
      response,
      reader: response.body.pipeThrough(newNdJsonStream()).getReader(),
    };
  });
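
To illustrate why the last part is held back between chunks, the same buffering logic can be run on plain strings, without streams or a network. A minimal sketch, where `parseNdjsonChunks` is a hypothetical helper and the sample records are made up:

```javascript
// Hypothetical helper: feeds text chunks through the same buffering logic
// and returns the parsed objects.
function parseNdjsonChunks(chunks) {
  const results = [];
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    const parts = buffer.split("\n");
    // The last part may be an incomplete line, so hold it back
    buffer = parts.pop();
    for (const part of parts) {
      if (part.trim()) results.push(JSON.parse(part));
    }
  }
  // End of input: whatever remains is the final line
  if (buffer.trim()) results.push(JSON.parse(buffer));
  return results;
}

// The second record is split across two chunks but is still parsed as a whole
const rows = parseNdjsonChunks(['{"id":1}\n{"i', 'd":2}\n', '{"id":3}']);
console.log(rows.map(row => row.id)); // logs the ids 1, 2, 3
```

Parsing each chunk on its own would throw on the fragment `{"i`, which is the case the buffer exists to handle.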


slevy85 commented Sep 5, 2022

Overall, thank you very much for this, @nestarz. It is very useful!
