@mshakhomirov
Created November 28, 2021 15:42
// Count processed rows and batches and compare them with totalRecords (which comes from the getSize() function, called beforehand)
let recordsProcessed = 0;
let batchNumber = 1;
let batchRecords = [];
...
// save in batch mode:
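else if (output === 's3') {
  s1.on('result', (row) => {
    ++recordsProcessed;
    batchRecords.push(row);
    // Flush when a full batch has accumulated or the last record has arrived.
    if (recordsProcessed === (BATCH_SIZE * batchNumber) || recordsProcessed === totalRecords) {
      if (!dryRun) {
        // Pause the stream so rows do not accumulate while the upload is in flight.
        connection.pause();
        pr(` batch ${batchNumber}, pushing to s3, ${batchRecords.length}, totalRecordsProcessed = ${recordsProcessed}`);
        // Capture the key now: batchNumber is incremented before the upload finishes.
        const key = `${s3key}${batchNumber}`;
        // Each batch is saved as its own object, splitting the output into smaller files.
        const params = { Bucket: bucket, Key: key, Body: JSON.stringify(batchRecords) };
        s3.upload(params).promise()
          .then(() => {
            // Log only once the upload has actually completed.
            pr(`saved ${key} to aws s3 cp s3://${bucket}/${key} ./tmp/${key}.csv`);
            connection.resume();
          })
          .catch((err) => {
            // Without a handler, a failed upload would leave the connection paused forever.
            // The error is only logged here; propagate it if a lost batch must fail the export.
            pr(`upload of ${key} failed: ${err.message}`);
            connection.resume();
          });
      }
      batchRecords = [];
      ++batchNumber;
    }
  });
  s1.on('end', () => { resolve(recordsProcessed); });
}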
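The batch-boundary condition in the snippet above can be tried out on its own, without MySQL or S3. The following is a minimal sketch under stated assumptions: `createBatcher` and `onFlush` are hypothetical names introduced only for illustration and are not part of the original gist. It reuses the same check (`recordsProcessed === batchSize * batchNumber || recordsProcessed === totalRecords`) and reports each batch to a callback instead of uploading it.

```javascript
// Hypothetical helper (not in the original gist): groups rows into batches and
// calls onFlush(batch, batchNumber) when a batch is full or the last record arrives.
function createBatcher(batchSize, totalRecords, onFlush) {
  let recordsProcessed = 0;
  let batchNumber = 1;
  let batchRecords = [];
  return function push(row) {
    ++recordsProcessed;
    batchRecords.push(row);
    // Same boundary check as in the streaming handler.
    if (recordsProcessed === batchSize * batchNumber || recordsProcessed === totalRecords) {
      onFlush(batchRecords, batchNumber);
      batchRecords = [];
      ++batchNumber;
    }
    return recordsProcessed;
  };
}

// Usage: 5 rows with a batch size of 2 produce batches of 2, 2 and 1 rows.
const flushed = [];
const push = createBatcher(2, 5, (batch, n) => flushed.push({ n, size: batch.length }));
for (let i = 0; i < 5; i++) push({ id: i });
console.log(flushed);
```

The second half of the condition is what flushes the final, partially filled batch. This is why the total row count has to be known before streaming starts.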