#!/bin/sh

# Make sure to:
# 1) Name this file `backup.sh` and place it in /home/ubuntu
# 2) Run `sudo apt-get install awscli` to install the AWS CLI
# 3) Run `aws configure` (enter an S3-authorized IAM user and specify the region)
# 4) Fill in the DB host + name
# 5) Create an S3 bucket for the backups and fill it in below (set a lifecycle rule to expire files older than X days in the bucket)
# 6) Run `chmod +x backup.sh`
# 7) Test it out via ./backup.sh
# 8) Set up a daily backup at midnight via `crontab -e`:
#    0 0 * * * /home/ubuntu/backup.sh > /home/ubuntu/backup.log
# DB host (secondary preferred, so as to avoid impacting primary performance)
HOST=localhost

# DB name
DBNAME=mydb

# S3 bucket name
BUCKET=my-mongodb-backups

# Linux user account
USER=ubuntu

# Current time
TIME=`/bin/date +%d-%m-%Y-%T`

# Backup directory
DEST=/home/$USER/tmp

# Tar file of backup directory
TAR=$DEST/../$TIME.tar
# Create backup dir (-p to avoid warning if already exists)
/bin/mkdir -p $DEST

echo "Backing up $HOST/$DBNAME to s3://$BUCKET/ on $TIME";

# Dump from mongodb host into backup directory
/usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST

# Create tar of backup directory
/bin/tar cvf $TAR -C $DEST .

# Upload tar to s3
/usr/bin/aws s3 cp $TAR s3://$BUCKET/

# Remove tar file locally
/bin/rm -f $TAR

# Remove backup directory
/bin/rm -rf $DEST

# All done
echo "Backup available at https://s3.amazonaws.com/$BUCKET/$TIME.tar"
Restore from Backup
First, download the backup .tar from the S3 bucket, e.g. via the S3 Management Console. Alternatively, use the AWS CLI to download it:
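A sketch of the download command (the bucket name and file name are placeholders; the file name follows the $TIME.tar naming used by the script):

    aws s3 cp s3://my-mongodb-backups/<timestamp>.tar .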
Then, extract the tar archive:
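For example, assuming the file downloaded in the previous step and a ./restore working directory (both placeholders):

    mkdir -p ./restore
    tar xvf <timestamp>.tar -C ./restore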
Finally, import the backup into a MongoDB host:
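A sketch, assuming the dump was produced with mongodump -d as in the script above (host, database name, and path are placeholders):

    mongorestore -h localhost -d mydb ./restore/mydb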
Thanks @eladnava, this worked great for me. That said, I noticed that the script only uploaded the backup file to S3 when I ran it manually, not when it ran through the cron job. When it ran via cron, it created the backup file correctly but failed to upload it to S3.
The reason for that is that the home folder for the cron job is different from your user's home folder (so the AWS CLI can't find its ~/.aws credentials).
For instance, in my case, I had to add the following line at the top of my file for the cron job to upload the backup file successfully:
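Presumably something along these lines (the path is a guess; it should be the home directory of the user whose ~/.aws config the script relies on):

    export HOME=/home/ubuntu  # so the AWS CLI can find ~/.aws/credentials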
You don't have to create the two intermediate files; you can stream it: https://gist.github.com/caraboides/7679bb73f4f13e36fc2b9dbded3c24c0
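Roughly, the streaming idea looks like this (a sketch only, not necessarily identical to the linked gist; the variables reuse the ones from the script above):

    /usr/bin/mongodump -h $HOST -d $DBNAME --archive --gzip | /usr/bin/aws s3 cp - s3://$BUCKET/$TIME.gz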
@eladnava Thank you for the script!
I got my MongoDB setup on AWS EC2 thanks to your article!! Helps a lot!
Small mods to make this work with array of db names:
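Not necessarily @cjgordon's exact changes, but a minimal sketch of the idea (requires bash for the array; the DBNAMES values are placeholders):

    DBNAMES=("db1" "db2" "db3")  # fill in your database names
    for DBNAME in "${DBNAMES[@]}"; do
      /usr/bin/mongodump -h $HOST -d $DBNAME -o $DEST/$DBNAME
    done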
@calvinh8 Yes, this script only backs up a single database. Create multiple scripts for multiple databases, or check out @cjgordon's answer. And no, your replica set members should contain the same data, so it is only necessary to back up from one member. By default, only the primary will accept reads, so specify the primary in the script.
@andfilipe1 Please give it a try and let us know.
@francisdb Thanks for the tips
@muhzi4u You probably have an outdated
@cjgordon Nicely done.
@andfilipe1, this will answer your question better :).
According to the MongoDB site:
"mongodump reads data from a MongoDB database and creates high fidelity BSON files which the mongorestore tool can use to populate a MongoDB database. mongodump and mongorestore are simple and efficient tools for backing up and restoring small MongoDB deployments, but are not ideal for capturing backups of larger systems."
I think the one-liner suggested by @francisdb a year ago is the way to go. Versioning and lifecycle should be managed automatically by S3: use a pre-set, hard-coded archive name in the script. The restore script would then take the object's version as its argument (instead of the timestamp). A simple parsing step could also be added to look up the version by date from the version list.
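For illustration, the versioned approach could look roughly like this with the AWS CLI (bucket and key names are placeholders; versioning must be enabled on the bucket):

    # always upload to the same key; S3 versioning keeps the history
    aws s3 cp backup.tar s3://my-mongodb-backups/backup.tar
    # list available versions with their timestamps
    aws s3api list-object-versions --bucket my-mongodb-backups --prefix backup.tar
    # download a specific version by its version id
    aws s3api get-object --bucket my-mongodb-backups --key backup.tar --version-id <VERSION_ID> backup.tar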
The script is great, but I had a problem uploading to S3: the backup is nearly 80 MB and the connection kept breaking, so I switched from the AWS CLI to s3cmd and it's working perfectly fine.
In the shell file:
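Presumably the aws upload line was replaced with something like this (the exact s3cmd invocation may differ):

    s3cmd put $TAR s3://$BUCKET/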
Backup and restore work great for me. I just have one question: I have many databases, and I was wondering whether there is a way to back up all of them without defining a list of databases? Or is it recommended to back them up separately for restore purposes? What's the best practice?
thanks @eladnava, yep it worked out really well.
On the 2nd part of the question: do you generally prefer a separate tar for each database, or one tar bundling all the databases? I know the script can do it either way, but I'm thinking it may be better to save each database backup separately in S3?