@wswoodruff
Last active October 3, 2019 14:19
'use strict';

const Fs = require('fs');
const Util = require('util');
const Xml2js = require('xml2js');
const XmlNodes = require('xml-nodes');
const Miss = require('mississippi');

const internals = {};

// Splits a large XML file into smaller files, each holding up to
// splitOnNumber targetNode elements wrapped in rootNode.
// Resolves with the number of files written.
module.exports = async ({
    filePath,
    fileExtension,
    targetNode,
    rootNode = 'root',
    splitOnNumber = 400,
    nodeFilter
}) => {

    if (!filePath || !fileExtension) {
        throw new Error('Invalid arguments: filePath, fileExtension are required');
    }

    const { filterNodes } = internals;

    const fileStream = Fs.createReadStream(`${filePath}.${fileExtension}`);
    const xmlHead = '<?xml version="1.0" encoding="UTF-8"?>';

    let currentFile = '';
    let fileCount = 0;
    let currentFileTargetNodeCount = 0;

    const passThrough = (nodeAsStr, enc, next) => next(null, nodeAsStr);

    // Accumulates nodes and writes a file once splitOnNumber nodes are collected
    const processNode = async (nodeAsStr, enc, next) => {

        currentFile += nodeAsStr;
        currentFileTargetNodeCount++;

        if (currentFileTargetNodeCount < splitOnNumber) {
            return next();
        }

        currentFileTargetNodeCount = 0;

        const newXmlFile = `${xmlHead}\n<${rootNode}>\n${currentFile}\n</${rootNode}>`;
        currentFile = '';

        try {
            await Util.promisify(Fs.writeFile)(`${filePath}-${fileCount++}.${fileExtension}`, newXmlFile);
            next();
        }
        catch (err) {
            next(err);
        }
    };

    await Util.promisify(Miss.pipe)(
        fileStream,
        XmlNodes(targetNode),
        nodeFilter ? Miss.through(filterNodes(nodeFilter)) : Miss.through(passThrough),
        Miss.through(processNode)
    );

    // Create file for remainder
    if (currentFileTargetNodeCount > 0) {
        const newXmlFile = `${xmlHead}\n<${rootNode}>\n${currentFile}\n</${rootNode}>`;
        await Util.promisify(Fs.writeFile)(`${filePath}-${fileCount++}.${fileExtension}`, newXmlFile);
    }

    return fileCount;
};

// Builds a through-stream handler that parses each node and keeps it
// only when filterFunc returns a truthy value
internals.filterNodes = (filterFunc) => {

    return async (nodeAsStr, enc, next) => {

        const xmlParser = new Xml2js.Parser();

        try {
            const keep = await filterFunc(await xmlParser.parseStringPromise(nodeAsStr));
            // Let it pass on thru or not
            keep ? next(null, nodeAsStr) : next();
        }
        catch (err) {
            next(err);
        }
    };
};
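
The batching that `processNode` does can be looked at in isolation. Below is a minimal in-memory sketch, assuming the nodes arrive as plain strings; `chunkNodes` is a hypothetical helper that is not part of the gist, and it collects chunks in an array instead of writing files.

```javascript
'use strict';

// Hypothetical helper (not in the gist): groups node strings the way
// processNode does, but returns the chunks instead of writing files.
const chunkNodes = (nodes, splitOnNumber = 400) => {

    const chunks = [];
    let current = '';
    let count = 0;

    for (const nodeAsStr of nodes) {
        current += nodeAsStr;
        count++;

        // Keep accumulating until the chunk is full
        if (count < splitOnNumber) {
            continue;
        }

        chunks.push(current);
        current = '';
        count = 0;
    }

    // Remainder, mirroring the final write after the pipeline finishes
    if (count > 0) {
        chunks.push(current);
    }

    return chunks;
};

console.log(chunkNodes(['<a/>', '<b/>', '<c/>', '<d/>', '<e/>'], 2));
// → [ '<a/><b/>', '<c/><d/>', '<e/>' ]
```

With five nodes and `splitOnNumber` set to 2, the first two chunks are full and the last one holds the leftover node, which is the case the "Create file for remainder" block handles.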
@wswoodruff
Author

Yoooo thanks for comments!

  • Yeah, we should probably validate that `targetNode` exists. I think `XmlNodes` will choke on weird or no input
  • I was trying to only create a parser if `filterNodes` was going to be used since it's optional, but yeah, turns out I'm creating one per node. lol oops! That doesn't seem right, I should fix that
  • The nested await in `await filterFunc(await xmlParser.parseStringPromise(nodeAsStr))`: this behaves the same as creating an intermediate var for `await xmlParser.parseStringPromise(nodeAsStr)` on the line above and passing it to `filterFunc`
  • Yeah, seems like I could break out a write-file func that could be shared between the final file write and `processNode`. The file write line still needs to mutate `fileCount`, which lives in the main exported scope, so for now I think it should be declared in that scope. Dunno how I feel about passing a var to a func that then mutates it, but maybe that could work too if there was a comment. Not sure yet /shrug
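
One possible shape for that shared write func, as a rough sketch and not a settled design: a small factory that keeps the counter in its own closure, so nothing outside has to pass in a var and have it mutated. The names `wrapChunk` and `makeChunkWriter` are made up for illustration and don't exist in the gist.

```javascript
'use strict';

const Fs = require('fs');
const Util = require('util');

const writeFile = Util.promisify(Fs.writeFile);
const xmlHead = '<?xml version="1.0" encoding="UTF-8"?>';

// Hypothetical helper: wraps accumulated node strings in the XML head and root node
const wrapChunk = (rootNode, nodes) => `${xmlHead}\n<${rootNode}>\n${nodes}\n</${rootNode}>`;

// Hypothetical factory: owns the file counter so callers never touch it directly
const makeChunkWriter = ({ filePath, fileExtension, rootNode = 'root' }) => {

    let fileCount = 0;

    return {
        // Writes one numbered file, same naming scheme as the gist
        write: (nodes) => writeFile(`${filePath}-${fileCount++}.${fileExtension}`, wrapChunk(rootNode, nodes)),
        // Read-only view of how many files have been started
        count: () => fileCount
    };
};
```

Under this sketch, both `processNode` and the remainder block would call `writer.write(currentFile)`, and the export would end with `return writer.count()`.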

@zemccartney

werd, all makes sense. thanks for the explanations 🙏 🍷
