Automatically backup a MongoDB database to S3 using mongodump, tar, and awscli (Ubuntu 14.04 LTS)
#!/bin/sh
# Make sure to:
# 1) Name this file `backup.sh` and place it in /home/ubuntu
# 2) Run sudo apt-get install awscli to install the AWSCLI
# 3) Run aws configure (enter s3-authorized IAM user and specify region)
# 4) Fill in DB host + name
# 5) Create S3 bucket for the backups and fill it in below (set a lifecycle rule to expire files older than X days in the bucket)
# 6) Run chmod +x backup.sh
# 7) Test it out via ./backup.sh
# 8) Set up a daily backup at midnight via `crontab -e`:
# 0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log
# DB host (secondary preferred so as to avoid impacting primary performance)
HOST=db.example.com
# DB name
DBNAME=my-db
# S3 bucket name
BUCKET=s3-bucket-name
# Linux user account
USER=ubuntu
# Current time
TIME=`/bin/date +%d-%m-%Y-%T`
# Backup directory
DEST=/home/$USER/tmp
# Tar file of backup directory
TAR=$DEST/../$TIME.tar
# Create backup dir (-p to avoid warning if already exists)
/bin/mkdir -p $DEST
# Log
echo "Backing up $HOST/$DBNAME to s3://$BUCKET/ on $TIME";
# Dump from mongodb host into backup directory
/usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST
# Create tar of backup directory
/bin/tar cvf $TAR -C $DEST .
# Upload tar to s3
/usr/bin/aws s3 cp $TAR s3://$BUCKET/
# Remove tar file locally
/bin/rm -f $TAR
# Remove backup directory
/bin/rm -rf $DEST
# All done
echo "Backup available at https://s3.amazonaws.com/$BUCKET/$TIME.tar"
Owner

eladnava commented Jun 7, 2016 (edited)

Restore from Backup

Download the .tar backup to the server from the S3 bucket via wget or curl:

wget -O backup.tar https://s3.amazonaws.com/my-bucket/xx-xx-xxxx-xx:xx:xx.tar

Alternatively, use the awscli to download it securely.
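For example, something like this (reusing the placeholder bucket and file name from above):

aws s3 cp s3://my-bucket/xx-xx-xxxx-xx:xx:xx.tar backup.tar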

Then, extract the tar archive:

tar xvf backup.tar

Finally, import the backup into a MongoDB host:

mongorestore --host {db.example.com} --db {my-db} {my-db}/

Thanks @eladnava, this worked great for me. That said, I noticed that the script only uploaded the backup file to s3 if I ran it manually, not if it ran through the cron job. When it ran through the cron job, it was creating the backup file correctly but was not able to upload it to s3.

The reason is that the cron job's home folder is different from your user's home folder (~/), so it could not find the AWS config with the S3 bucket information. To fix this, do the following:
1. cd ~/ and note the home folder's path
2. Specify the home folder's path from step 1 at the beginning of the file, as follows: export HOME=/your/home/folder

For instance, in my case, I had to add the following line at the top of my file for the cron job to upload the backup file successfully: export HOME=/home/ubuntu
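With that change, the top of backup.sh would look roughly like this (the path below is just the example from this thread; use whatever cd ~/ showed for you):

#!/bin/sh
# Point cron at the right home folder so awscli can find ~/.aws/config and ~/.aws/credentials
export HOME=/home/ubuntu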

Great, thanks!

Add a username and password for authentication. Also, please note it is bad practice to have a database without authentication.

/usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST --PASSWORD $PASSWORD --USERNAME $USERNAME

You don't have to create the 2 files, you can stream it: https://gist.github.com/caraboides/7679bb73f4f13e36fc2b9dbded3c24c0

Owner

eladnava commented Jan 23, 2017

Interesting, thanks @wabirached! Not sure why it didn't happen to me.

@nesbtesh, good point!
@caraboides, cool, didn't know that was possible!

ir-fuel commented Jan 26, 2017

@nesbtesh --PASSWORD and --USERNAME need to be lowercase
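So an authenticated dump line would look something like this (you may also need --authenticationDatabase, depending on where the user is defined):

/usr/bin/mongodump -h $HOST -d $DBNAME --username $USERNAME --password $PASSWORD -o $DEST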

This works well for small databases, but mongodump doesn't do well once you have 5 GB or more of data to dump.

A better approach for large backups is to use filesystem snapshots (I didn't manage to create a script for that yet, but will need to).

Owner

eladnava commented Feb 6, 2017

@alexrada I use the script to back up a 7GB database daily and it's working pretty well for me.

Are you referring to LVM snapshots?

calvinh8 commented Feb 6, 2017

@eladnava Thank you for the script!
I got it working but have a few questions:

  1. Do I need to write a separate script for each database on the same MongoDB server?
  2. Do I only need to run this script on the primary (d1.example.com) replica, or on all of them?

I got my MongoDB setup on AWS EC2 thanks to your article!! Helps a lot!

Hello, I'm using an instance with several databases totaling at least 300 GB. Is it viable to use mongodump and back up to S3?

You probably want to add set -e as the second line of the script to avoid it happily continuing on errors.
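For instance, at the top of backup.sh (a minimal sketch of the suggestion above):

#!/bin/sh
# Abort on the first failed command so a bad dump never gets uploaded or the backup dir removed
set -e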

and a simpler way to do all this is: mongodump --archive --gzip | aws s3 cp - s3://my-bucket/some-file
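The matching restore can then stream straight back from S3 (a sketch, reusing the same placeholder bucket and key): aws s3 cp s3://my-bucket/some-file - | mongorestore --archive --gzip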

muhzi4u commented Jun 3, 2017

I am getting an error.
A client error (InvalidRequest) occurred when calling the CreateMultipartUpload operation: The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.

Small mods to make this work with an array of DB names:

#!/bin/bash

# Make sure to:
# 1) Name this file `backup.sh` and place it in /home/ubuntu
# 2) Run sudo apt-get install awscli to install the AWSCLI
# 3) Run aws configure (enter s3-authorized IAM user and specify region)
# 4) Fill in DB host + name
# 5) Create S3 bucket for the backups and fill it in below (set a lifecycle rule to expire files older than X days in the bucket)
# 6) Run chmod +x backup.sh
# 7) Test it out via ./backup.sh
# 8) Set up a daily backup at midnight via `crontab -e`:
#    0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log

# DB host (secondary preferred so as to avoid impacting primary performance)
HOST=localhost

# DB names
DBNAMES=("db1" "db2" "db3")

# S3 bucket name
BUCKET=bucket

# Linux user account
USER=ubuntu

# Current time
TIME=`/bin/date +%d-%m-%Y-%T`

# Backup directory
DEST=/home/$USER/tmp

# Tar file of backup directory
TAR=$DEST/../$TIME.tar

# Create backup dir (-p to avoid warning if already exists)
/bin/mkdir -p $DEST

# Log
echo "Backing up $HOST/$DBNAME to s3://$BUCKET/ on $TIME";

# Dump from mongodb host into backup directory
for DBNAME in "${DBNAMES[@]}"
do
   /usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST
done

# Create tar of backup directory
/bin/tar cvf $TAR -C $DEST .

# Upload tar to s3
/usr/bin/aws s3 cp $TAR s3://$BUCKET/

# Remove tar file locally
/bin/rm -f $TAR

# Remove backup directory
/bin/rm -rf $DEST

# All done
echo "Backup available at https://s3.amazonaws.com/$BUCKET/$TIME.tar"

I'd propose /bin/date -u +"%Y-%m-%dT%H%M%SZ" as a time format. It's sortable and doesn't contain the : character.
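In the script that would just mean swapping the TIME line, e.g.:

# Current time in UTC, sortable and without characters that are awkward in filenames
TIME=`/bin/date -u +"%Y-%m-%dT%H%M%SZ"`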

Owner

eladnava commented Jul 22, 2017

@calvinh8 Yes, this script only backs up a single database. Create multiple scripts for multiple databases, or check out @cjgordon's answer. And no, your replica set members should contain the same data, so it is only necessary to back up from one member. By default, only the primary accepts reads, so specify the primary in the script.

@andfilipe1 Please give it a try and let us know.

@francisdb Thanks for the tips 👍

@muhzi4u You probably have an outdated awscli package, update it using apt-get.
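For example, depending on how the awscli was installed: sudo apt-get update && sudo apt-get install --only-upgrade awscli (for the apt package), or pip install --upgrade awscli (if it came from pip).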

@cjgordon Nicely done. 💯
