Automatically back up a MongoDB database to S3 using mongodump, tar, and awscli (Ubuntu 14.04 LTS)

#!/bin/sh
# Make sure to:
# 1) Name this file `backup.sh` and place it in /home/ubuntu
# 2) Run sudo apt-get install awscli to install the AWSCLI
# 3) Run aws configure (enter s3-authorized IAM user and specify region)
# 4) Fill in DB host + name
# 5) Create S3 bucket for the backups and fill it in below (set a lifecycle rule to expire files older than X days in the bucket)
# 6) Run chmod +x backup.sh
# 7) Test it out via ./backup.sh
# 8) Set up a daily backup at midnight via `crontab -e`:
# 0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log
# DB host (secondary preferred so as to avoid impacting primary performance)
HOST=db.example.com
# DB name
DBNAME=my-db
# S3 bucket name
BUCKET=s3-bucket-name
# Linux user account
USER=ubuntu
# Current time
TIME=`/bin/date +%d-%m-%Y-%T`
# Backup directory
DEST=/home/$USER/tmp
# Tar file of backup directory
TAR=$DEST/../$TIME.tar
# Create backup dir (-p to avoid warning if already exists)
/bin/mkdir -p $DEST
# Log
echo "Backing up $HOST/$DBNAME to s3://$BUCKET/ on $TIME";
# Dump from mongodb host into backup directory
/usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST
# Create tar of backup directory
/bin/tar cvf $TAR -C $DEST .
# Upload tar to s3
/usr/bin/aws s3 cp $TAR s3://$BUCKET/
# Remove tar file locally
/bin/rm -f $TAR
# Remove backup directory
/bin/rm -rf $DEST
# All done
echo "Backup available at https://s3.amazonaws.com/$BUCKET/$TIME.tar"
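
For reference, the one-time setup from the numbered steps above might look like this on a fresh Ubuntu 14.04 box (run as the ubuntu user; region and credentials are whatever you enter in aws configure):

# Install and configure the AWS CLI
sudo apt-get install awscli
aws configure

# Make the script executable and test it once
chmod +x /home/ubuntu/backup.sh
./backup.sh

# Schedule it daily at midnight via crontab -e:
# 0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log
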
eladnava (Owner) commented Jun 7, 2016

Restore from Backup

Download the .tar backup to the server from the S3 bucket via wget or curl:

wget -O backup.tar https://s3.amazonaws.com/my-bucket/xx-xx-xxxx-xx:xx:xx.tar

Alternatively, use the awscli to download it securely.

Then, extract the tar archive:

tar xvf backup.tar

Finally, import the backup into a MongoDB host:

mongorestore --host {db.example.com} --db {my-db} {db-name}/
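
For reference, the same restore done entirely with the awscli might look like this (bucket, archive name, host, and db name are placeholders; the dump folder inside the tar is named after the database):

aws s3 cp s3://my-bucket/xx-xx-xxxx-xx:xx:xx.tar backup.tar
tar xvf backup.tar
mongorestore --host db.example.com --db my-db my-db/
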
wabirached commented Jan 11, 2017

Thanks @eladnava, this worked great for me. That said, I noticed that the script only uploaded the backup file to S3 when I ran it manually, not when it ran through the cron job. When it ran via cron, it created the backup file correctly but could not upload it to S3.

The reason is that the cron job's home folder is different from your user's home folder (~/), so it could not find the aws config with the S3 bucket information. To fix this, do the following (see the sketch below):
1. cd ~/ and note the home folder's path
2. Specify the home folder's path from step 1 at the beginning of the script, as follows: export HOME=/your/home/folder

For instance, in my case, I had to add the following line at the top of my script for the cron job to upload the backup file successfully: export HOME=/home/ubuntu
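
A minimal sketch of that fix at the top of backup.sh (assuming the ubuntu user's home folder, as in the original script):

#!/bin/sh
# Cron runs with a minimal environment, so point HOME at the account whose
# ~/.aws configuration (credentials and region) the awscli should use.
export HOME=/home/ubuntu

# ... rest of backup.sh unchanged ...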

nesbtesh commented Jan 11, 2017

Great, thanks.

Add a username and password for authentication. Also, please note it is bad practice to run a database without authentication.

/usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST --PASSWORD $PASSWORD --USERNAME $USERNAME

caraboides commented Jan 22, 2017

You don't have to create the 2 files, you can stream it: https://gist.github.com/caraboides/7679bb73f4f13e36fc2b9dbded3c24c0

eladnava (Owner) commented Jan 23, 2017

Interesting, thanks @wabirached! Not sure why it didn't happen to me.

@nesbtesh, good point!
@caraboides, cool, didn't know that was possible!

ir-fuel commented Jan 26, 2017

@nesbtesh --PASSWORD and --USERNAME need to be lowercase
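
For example, an authenticated dump line along those lines might look like this (the --authenticationDatabase value is an assumption; point it at whichever database holds the user):

/usr/bin/mongodump -h $HOST -d $DBNAME -u $USERNAME -p $PASSWORD --authenticationDatabase admin -o $DEST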

alexrada commented Jan 28, 2017

This works well for small databases, but mongodump doesn't actually do well when you have 5 GB or more of data to dump.

A better approach for backups is to use the filesystem (I didn't manage to create a script for that yet, but I will need to).

eladnava (Owner) commented Feb 6, 2017

@alexrada I use the script to back up a 7GB database daily and it's working pretty well for me.

Are you referring to LVM snapshots?

calvinh8 commented Feb 6, 2017

@eladnava Thank you for the script!
I got it working but have a few questions:

  1. Do I need to write a separate script for each database on the same MongoDB server?
  2. Do I only need to run this script on the primary replica (d1.example.com), or on all members?

I got my MongoDB setup on AWS EC2 thanks to your article! It helps a lot!

andfilipe1 commented Mar 24, 2017

Hello, I'm using an instance with several databases and at least 300 GB of data. Is it viable to use the dump and back it up to S3?

francisdb commented Apr 19, 2017

You probably want to add set -e as the second line of the script to avoid it happily continuing on errors.
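
Concretely, that would be the following at the top of backup.sh:

#!/bin/sh
set -e  # abort the whole backup run as soon as any command fails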

francisdb commented Apr 19, 2017

and a simpler way to do all this is: mongodump --archive --gzip | aws s3 cp - s3://my-bucket/some-file
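
A sketch of that streaming approach as a full replacement for backup.sh (host, db, and bucket names are placeholders; the timestamped key name is an assumption, not part of the one-liner above):

#!/bin/sh
set -e

HOST=db.example.com
DBNAME=my-db
BUCKET=s3-bucket-name
TIME=`/bin/date -u +%Y-%m-%dT%H%M%SZ`

# Stream a gzipped archive dump straight to S3: no temp directory or tar file on local disk
/usr/bin/mongodump -h $HOST -d $DBNAME --archive --gzip | /usr/bin/aws s3 cp - s3://$BUCKET/$DBNAME-$TIME.archive.gz

echo "Backup available at s3://$BUCKET/$DBNAME-$TIME.archive.gz"

To restore, stream it back the other way:

aws s3 cp s3://$BUCKET/$DBNAME-$TIME.archive.gz - | mongorestore --host $HOST --archive --gzip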

muhzi4u commented Jun 3, 2017

I am getting an error.
A client error (InvalidRequest) occurred when calling the CreateMultipartUpload operation: The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.

cjgordon commented Jun 23, 2017

Small mods to make this work with an array of db names:

#!/bin/bash

# Make sure to:
# 1) Name this file `backup.sh` and place it in /home/ubuntu
# 2) Run sudo apt-get install awscli to install the AWSCLI
# 3) Run aws configure (enter s3-authorized IAM user and specify region)
# 4) Fill in DB host + names
# 5) Create S3 bucket for the backups and fill it in below (set a lifecycle rule to expire files older than X days in the bucket)
# 6) Run chmod +x backup.sh
# 7) Test it out via ./backup.sh
# 8) Set up a daily backup at midnight via `crontab -e`:
#    0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log

# DB host (secondary preferred so as to avoid impacting primary performance)
HOST=localhost

# DB names
DBNAMES=("db1" "db2" "db3")

# S3 bucket name
BUCKET=bucket

# Linux user account
USER=ubuntu

# Current time
TIME=`/bin/date +%d-%m-%Y-%T`

# Backup directory
DEST=/home/$USER/tmp

# Tar file of backup directory
TAR=$DEST/../$TIME.tar

# Create backup dir (-p to avoid warning if already exists)
/bin/mkdir -p $DEST

# Log
echo "Backing up $HOST (${DBNAMES[*]}) to s3://$BUCKET/ on $TIME";

# Dump each database from the mongodb host into the backup directory
for DBNAME in "${DBNAMES[@]}"
do
   /usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST
done

# Create tar of backup directory
/bin/tar cvf $TAR -C $DEST .

# Upload tar to s3
/usr/bin/aws s3 cp $TAR s3://$BUCKET/

# Remove tar file locally
/bin/rm -f $TAR

# Remove backup directory
/bin/rm -rf $DEST

# All done
echo "Backup available at https://s3.amazonaws.com/$BUCKET/$TIME.tar"

iamtankist commented Jul 17, 2017

I'd propose /bin/date -u +"%Y-%m-%dT%H%M%SZ" as a time format. It's sortable and doesn't contain the : character.
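
As a drop-in replacement for the TIME line in backup.sh, that would be:

TIME=`/bin/date -u +"%Y-%m-%dT%H%M%SZ"`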

eladnava (Owner) commented Jul 22, 2017

@calvinh8 Yes, this script only backs up a single database. Create multiple scripts for multiple databases, or check out @cjgordon's answer. And no, your replica set members should contain the same data, so it is only necessary to back up from one member. By default, only the primary accepts reads, so specify the primary in the script.

@andfilipe1 Please give it a try and let us know.

@francisdb Thanks for the tips 👍

@muhzi4u You probably have an outdated awscli package, update it using apt-get.

@cjgordon Nicely done. 💯

roysG commented Aug 16, 2017

@andfilipe1, this will answer your question better :).

According to the Mongo site:

"mongodump reads data from a MongoDB database and creates high fidelity BSON files which the mongorestore tool can use to populate a MongoDB database. mongodump and mongorestore are simple and efficient tools for backing up and restoring small MongoDB deployments, but are not ideal for capturing backups of larger systems."

https://docs.mongodb.com/manual/core/backups/

sreekanth1990 commented May 6, 2018

Hi, how do I solve this error: Invalid endpoint: https://s3.US East.amazonaws.com

eladnava (Owner) commented May 6, 2018

@sreekanth1990 Please double check your S3 URL for storing the backup and make sure your DNS server is accessible and working properly.

ps-34 commented May 6, 2018

@eladnava How do I back up only part of a collection, where the condition is that only documents older than 3 months need to be backed up? How do I do that?

eladnava (Owner) commented May 6, 2018

@ps-34 You would need to write a custom script for that. Query the database in the script and save the result set to local disk or S3. You would then need a restore script as well.
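
As an aside, mongodump itself can filter a single collection with --query, which may be enough for this case. A rough sketch (collection and field names are made up, and depending on your mongodump version the query may need to be strict extended JSON):

/usr/bin/mongodump -h $HOST -d $DBNAME -c events -q '{ "createdAt": { "$lt": { "$date": "2018-02-06T00:00:00.000Z" } } }' -o $DEST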

stanvarlamov commented May 8, 2018

I think that the one-liner suggested by @francisdb a year ago is the way to go. The "versioning" and lifecycle should be auto-managed by S3: use a pre-set, hard-coded archive name in the script. The restore script would take the object's version as the argument (vs. the timestamp). There can also be a simple parsing step coded to get the version by date from the list.
This approach should also take care of accidental multiple or parallel processes, as each invocation should technically create a separate version of the S3 object. Multi-part upload "emulation" can be achieved by splitting the dump by DB name, as suggested above.
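
A sketch of that setup with the awscli (bucket and key names are placeholders):

# One-time: turn on versioning for the backup bucket
aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled

# Each backup run overwrites the same key; S3 keeps every version
mongodump --archive --gzip | aws s3 cp - s3://my-bucket/mongo-backup.archive.gz

# Restore: list the versions, pick one by date, and fetch it by version id
aws s3api list-object-versions --bucket my-bucket --prefix mongo-backup.archive.gz
aws s3api get-object --bucket my-bucket --key mongo-backup.archive.gz --version-id <version-id> backup.archive.gz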

adityaachaubey commented May 30, 2018

Works fine on Ubuntu 16.04 as well! Thanks

piggydoughnut commented Jul 6, 2018

works like a charm 😸 ❤️ Thank you