@nathansmith
Last active April 23, 2023 14:19
Node script to extract unique domains from a `*.har` file.

How to use

To use this Node script:

  1. Save the contents of the script in a file named har-domain-parser.cjs.

  2. Place your trace.har file in the same directory as the script file.

  3. Then run the script using the command line.

node har-domain-parser.cjs

When the script finishes, it writes the sorted list of unique domains, one per line, to a file named domains.txt in the same directory.
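For orientation, here is a minimal sketch of the input shape the script reads. The object is hypothetical and shows only the fields the script touches (`log.entries`, `request.url`, and the `_initiator` field with its nested `stack`); the example.com and example.net URLs are placeholders, and a real trace.har contains many more properties.

```javascript
// Hypothetical minimal HAR object: only the fields the script reads.
const sampleHar = {
  log: {
    entries: [
      {
        // URL of the request itself.
        request: { url: 'https://cdn.example.com/app.js' },
        // Custom field describing what triggered the request.
        _initiator: {
          url: 'https://www.example.com/index.html',
          stack: {
            callFrames: [{ url: 'https://www.example.com/main.js' }],
            // Stacks can nest through `parent`; the script follows them recursively.
            parent: {
              callFrames: [{ url: 'https://static.example.net/loader.js' }],
            },
          },
        },
      },
    ],
  },
};

// Serialize it the way a saved trace would be stored on disk.
const sampleHarText = JSON.stringify(sampleHar, null, 2);
```

Saving `sampleHarText` as trace.har and running the script should produce a domains.txt containing cdn.example.com, static.example.net, and www.example.com.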

const { join } = require('path');
const { readFileSync, writeFileSync } = require('fs');
// ==========
// Constants.
// ==========
const FILE_NAME_TEXT = 'domains.txt';
const FILE_NAME_TRACE = 'trace.har';
// =========
// Encoding.
// =========
const encoding = 'utf8';
const options = { encoding };
// ===============
// Get file paths.
// ===============
const pathForFileInput = join(__dirname, FILE_NAME_TRACE);
const pathForFileOutput = join(__dirname, FILE_NAME_TEXT);
// ===============
// Get input text.
// ===============
const textForFileInput = readFileSync(pathForFileInput, options);
const jsonForFileInput = JSON.parse(textForFileInput);
// ===================
// Helper: add domain.
// ===================
const domainSet = new Set();
const addDomain = (value = '') => {
  // Get domain: drop the scheme, then keep everything before the first slash.
  let domain = value.split('://')[1] || '';
  domain = domain.split('/')[0] || '';
  // Domain exists?
  if (domain) {
    // Add to set.
    domainSet.add(domain);
  }
};
// ================================================
// Helper: recursively parse `*.parent.callFrames`.
// ================================================
const parseCallFrames = (obj = {}) => {
  // Loop through call frames.
  obj.callFrames?.forEach(({ url }) => {
    // Add domains.
    addDomain(url);
  });
  // Parent exists?
  if (obj.parent) {
    // Recursion.
    parseCallFrames(obj.parent);
  }
};
// =====================
// Loop through entries.
// =====================
jsonForFileInput.log.entries.forEach((entry) => {
  // `_initiator` is a custom (underscore-prefixed) field, as exported by
  // Chrome DevTools. Guard against entries that lack it.
  const initiator = entry._initiator;
  // Add domains.
  addDomain(initiator?.url);
  addDomain(entry.request.url);
  // Parse call frames.
  parseCallFrames(initiator?.stack);
});
// ================
// Get output text.
// ================
const domainArray = Array.from(domainSet).sort();
const textForFileOutput = domainArray.join('\n');
// ==============
// Write to file.
// ==============
writeFileSync(pathForFileOutput, textForFileOutput, options);
// ===============
// Log completion.
// ===============
global.console.clear();
global.console.log(`Parsing complete. Domains written to file:\n\n${pathForFileOutput}\n`);
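The string splitting in `addDomain` keeps any port, credentials, or query string that sits before the first slash. As a possible alternative, here is a minimal sketch that uses the built-in WHATWG `URL` class instead. The `getHost` name is hypothetical and not part of the original script.

```javascript
// Sketch: extract the host name from a URL string with the built-in URL class.
// `getHost` is a hypothetical helper, not part of the original script.
const getHost = (value = '') => {
  try {
    // `hostname` excludes port and credentials; use `host` to keep the port.
    return new URL(value).hostname;
  } catch {
    // Empty, relative, or malformed URLs cannot be parsed; report no domain.
    return '';
  }
};
```

Inside `addDomain`, the two `split` calls could then be replaced by `const domain = getHost(value);`. Whether dropping the port is desirable depends on how the domain list will be used.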