Rough script to extract images from HTTP Archive (HAR) files
const fs = require('fs');
// Parse the HAR file; the captured requests live under the top-level "log" object.
const file = JSON.parse(fs.readFileSync('./dump.har', 'utf8')).log;
const targetMimeType = 'image/jpeg';
let count = 0;
for (const entry of file.entries) {
  if (entry.response.content.mimeType === targetMimeType) {
    // ensure output directory exists before running!
    count++;
    // Buffer.from replaces the deprecated `new Buffer(...)`; .jpg matches the JPEG MIME type.
    fs.writeFileSync(`output/${count}.jpg`, Buffer.from(entry.response.content.text, 'base64'));
  }
}
console.log(`Grabbed ${count} files`);

@ntcho commented Jun 23, 2020

Thanks for the script.

However, it failed with TypeError: First argument must be a string, Buffer, ArrayBuffer, Array, or array-like object. thrown from Buffer, so I edited the script to make it work. Since 'base64' is a valid encoding for writeFileSync when the data is a string (from this SO answer), we can pass the text with 'base64' directly, without creating a Buffer object.

const fs = require('fs');
const file = JSON.parse(fs.readFileSync('./dump.har', 'utf8')).log;
const targetMimeType = 'image/jpeg';

let count = 0;
for (const entry of file.entries) {
  const content = entry.response.content;
  // skip entries whose body was not captured in the HAR
  if (content.mimeType === targetMimeType && typeof content.text === 'string') {
    // ensure output directory exists before running!
    // writeFileSync is synchronous and takes no callback; it throws on failure
    fs.writeFileSync(`output/${count}.jpg`, content.text, 'base64');
    count++;
  }
}

console.log(`Grabbed ${count} files`);

To execute it, I ran the following in the PowerShell console.

node .\har-extract.js
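
Both scripts above hardcode a single MIME type and assume every matching body is base64-encoded. Below is a minimal sketch of a more defensive variant, assuming the optional content.encoding field of the HAR format is what marks a base64 body. The EXTENSIONS table and the collectImages and writeImages names are hypothetical helpers for illustration, not part of the original script.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical mapping from MIME type to file extension; extend as needed.
const EXTENSIONS = {
  'image/jpeg': 'jpg',
  'image/png': 'png',
  'image/gif': 'gif',
  'image/webp': 'webp',
};

// Returns [{ name, data }] for every entry whose MIME type is listed in EXTENSIONS.
function collectImages(har) {
  const images = [];
  for (const entry of har.log.entries) {
    const content = entry.response && entry.response.content;
    // Skip entries whose body was not captured in the HAR.
    if (!content || typeof content.text !== 'string') continue;
    // Drop parameters such as "; charset=..." before the lookup.
    const mimeType = (content.mimeType || '').split(';')[0].trim().toLowerCase();
    const ext = EXTENSIONS[mimeType];
    if (!ext) continue;
    // Assumption: bodies are base64 only when content.encoding says so.
    const data = content.encoding === 'base64'
      ? Buffer.from(content.text, 'base64')
      : Buffer.from(content.text, 'utf8');
    images.push({ name: `${images.length + 1}.${ext}`, data });
  }
  return images;
}

// Writes the collected images into outDir, creating it if necessary.
function writeImages(images, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  for (const { name, data } of images) {
    fs.writeFileSync(path.join(outDir, name), data);
  }
  return images.length;
}

// Example usage (assumes ./dump.har exists):
// const har = JSON.parse(fs.readFileSync('./dump.har', 'utf8'));
// console.log(`Grabbed ${writeImages(collectImages(har), 'output')} files`);
```

Separating the decoding step from the file writes keeps the parsing logic testable without touching the disk, and creating the directory with the recursive option removes the need to make the output folder by hand.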