@twolfson
Last active July 28, 2021 22:52
Quick and dirty database dump to S3 via Node.js

We are implementing database dumps, which are straightforward but can be tedious to set up. Here's our setup:

  1. Create AWS user for db backups (e.g. db-backups-{{app}})
    • Save credentials in a secure location
    • If adding db scrubbing, use a separate user (e.g. db-scrubs-{{app}})
  2. Create bucket for S3 access logging (e.g. s3-access-log-{{app}})
  3. Create consistently named bucket for db dumps (e.g. db-backups-{{app}})
    • Enable logging to s3-access-log-{{app}} with prefix of db-backups-{{app}}
  4. Add IAM policy for bucket access (a minimal policy sketch follows this list)
  5. Upload a dump to S3 via our script
    • node backup-local-db.js
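
For step 4, the backup user only needs permission to write objects into the dump bucket. Here's a minimal sketch of such a policy, assuming the bucket is named db-backups-myapp (an illustrative name, not the exact policy from our setup; substitute your own bucket):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:PutObject"],
          "Resource": "arn:aws:s3:::db-backups-myapp/*"
        }
      ]
    }

The backup script itself (backup-local-db.js):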
// Based on: https://gist.github.com/twolfson/f5d8adead6def0b55663
// Load in our dependencies
var fs = require('fs');
var AWS = require('aws-sdk');
var spawn = require('child_process').spawn;

// Define our constants upfront
var dbName = 'dbName';
var S3_BUCKET = 'S3_BUCKET';
var s3AccessKeyId = 'S3_ACCESS_KEY_ID';
var s3SecretAccessKey = 'S3_SECRET_ACCESS_KEY';

// Determine our filename (e.g. 20170312.011924.307000000.sql.gz)
var timestamp = (new Date()).toISOString()
  .replace(/^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\.(\d{3})Z$/, '$1$2$3.$4$5$6.$7000000');
var filepath = timestamp + '.sql.gz';

// Configure AWS credentials
// http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-node-credentials-environment.html
// DEV: There's likely a better non-environment way to do this but it's not well documented
process.env.AWS_ACCESS_KEY_ID = s3AccessKeyId;
process.env.AWS_SECRET_ACCESS_KEY = s3SecretAccessKey;

// Define our S3 connection
// https://aws.amazon.com/sdk-for-node-js/
// http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
var s3 = new AWS.S3();

// Dump our database to a file so we can collect its length
// DEV: We output `stderr` to `process.stderr`
// DEV: We write to disk so S3 client can calculate `Content-Length` of final result before uploading
console.log('Dumping `pg_dump` into `gzip`');
var pgDumpChild = spawn('pg_dump', [dbName], {stdio: ['ignore', 'pipe', 'inherit']});
pgDumpChild.on('exit', function (code) {
  if (code !== 0) {
    throw new Error('pg_dump: Bad exit code (' + code + ')');
  }
});
var gzipChild = spawn('gzip', {stdio: ['pipe', 'pipe', 'inherit']});
gzipChild.on('exit', function (code) {
  if (code !== 0) {
    throw new Error('gzip: Bad exit code (' + code + ')');
  }
});
var writeStream = fs.createWriteStream(filepath);
pgDumpChild.stdout.pipe(gzipChild.stdin);
gzipChild.stdout.pipe(writeStream);

// When our write stream is completed, upload our gzipped dump to S3
// http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
writeStream.on('finish', function handleFinish () {
  console.log('Uploading "' + filepath + '" to S3');
  s3.putObject({
    Bucket: S3_BUCKET,
    Key: filepath,
    ACL: 'private',
    ContentType: 'text/plain',
    ContentEncoding: 'gzip',
    Body: fs.createReadStream(filepath)
  }, function handlePutObject (err, data) {
    // If there was an error, throw it
    if (err) {
      throw err;
    // Otherwise, log success
    } else {
      console.log('Successfully uploaded "' + filepath + '"');
    }
  });
});
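
To take a backup and sanity-check the upload (assuming the aws CLI is configured; db-backups-myapp is an illustrative bucket name):

    node backup-local-db.js
    aws s3 ls s3://db-backups-myapp/

Restoring is the reverse pipeline, streaming the object through gunzip into psql (the filename shown is the example from the script's comments):

    aws s3 cp s3://db-backups-myapp/20170312.011924.307000000.sql.gz - | gunzip | psql dbName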
t3h2mas commented Nov 20, 2018

Is there a reason why this isn't using the zlib library?
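
For reference, Node's built-in zlib module could replace the gzip child process with an in-process stream. A minimal sketch of that alternative (not part of the original gist):

    var zlib = require('zlib');
    // Pipe `pg_dump` output through an in-process gzip Transform stream
    // instead of spawning a `gzip` child process
    pgDumpChild.stdout.pipe(zlib.createGzip()).pipe(writeStream);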
