@agungf
Created February 7, 2014 04:50
backing up mongodb to rackspace file
Backing up your server (MongoDB) to the cloud
Submitted by admin on Wed, 01/02/2013 - 19:59
In my previous blog post I detailed how to set up a new server to host a small Play2 application securely using Nginx and MongoDB.
Now that our application is up and running, the next step is to configure backups. A lot of marketing suggests that cloud-based VPSs and servers are a silver bullet: massive uptime, no risk of data loss, easy scalability and so on. With the configuration I detailed previously we are using a single server on Rackspace's cloud service. Our server instance, as is often the case, is hosted on a single physical machine with no redundancy. This means that if the physical host running our virtualised server goes down unexpectedly, we can potentially lose all our data. This does happen: although not frequently, I have had a Rackspace cloud server go down. Luckily it was migrated to a new machine with no data loss within 3 hours, but if the issue had been with the host's drives the outcome could have been very different.
Some providers will store your data on a Storage Area Network (SAN) instead of on the physical host's hard disk. This offers further redundancy against data loss but comes at an additional cost. With this in mind, you should back up your data outside of your server and away from any other server on the same physical host.
The backup strategy I have taken is to take nightly dumps of my Mongo database and sync them to Rackspace Cloud Files. This is the most basic form of backup, but it is more than suitable for a small application which isn't business critical. Even in the worst-case scenario, an unrecoverable fault on my server's physical host:
I have my source code on GitHub. Each release is tagged so I can get a particular version to deploy
My nginx configuration is under my application project on GitHub
I have nightly database backups on Rackspace Cloud Files. My application has few writes, so losing 24 hours' worth of data, although not desirable, isn't a huge issue.
With the above I can fully restore my application: I have my server configuration and I have the required application files and data. The biggest pain is manually setting up a new server, installing the packages, creating users and so on. This could be automated with tools such as Puppet, which I recommend, but I will not cover that here.
Step 1, create a mongo backup script
Mongo comes with a utility called mongodump. For large databases this may not be the best solution, but for smaller databases it is ideal.
We want our backups to be placed under /var/backups/mongodb , so let's create that directory:
$ sudo mkdir -p /var/backups/mongodb
To keep this simple we will have the backup task run as the root user, so we don't need to set the owner/group on the backup directory we just created.
Let's create our backup script. We will place it in the root user's home directory, in a hidden folder so it's not accidentally deleted; it lives there because the root user will be running it via cron.
$ sudo mkdir /root/.scripts
$ sudo touch /root/.scripts/backupMongo.sh
$ sudo chmod +x /root/.scripts/backupMongo.sh
Open the backup script with vi or your favourite text editor and paste in the below, replacing mydb with your MongoDB database name.
$ vi /root/.scripts/backupMongo.sh
#!/bin/bash
# Database to dump and where to keep the archives
DB=mydb
BACKUP_DIR=/var/backups/mongodb
DATE=$(date +"%F")

# Make sure the backup directory exists
if [ ! -d "$BACKUP_DIR" ]; then
  mkdir -p "$BACKUP_DIR"
fi

# Dump the database to a temporary directory, archive it, then clean up
mongodump --db "$DB" --out "/tmp/$DB-$DATE"
tar czf "$BACKUP_DIR/$DB-$DATE.tgz" -C /tmp "$DB-$DATE"
rm -rf "/tmp/$DB-$DATE"

# Delete backups older than 14 days
find "$BACKUP_DIR" -type f -mtime +14 -exec rm {} \;
When the above is done we can run the script, which will create a dated backup named DBNAME-YEAR-MONTH-DAY.tgz under the backup directory specified. Because of this date format, an ls in the terminal lists the archives in chronological order. Also note the script deletes backups older than 14 days.
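The 14-day pruning at the end of the script can be sanity-checked on its own. A minimal sketch using a scratch directory (the paths here are throwaway and nothing touches the real backup directory):

```shell
# Create a scratch directory with one "old" and one "recent" file
mkdir -p /tmp/prune-demo
touch -d "20 days ago" /tmp/prune-demo/mydb-old.tgz
touch /tmp/prune-demo/mydb-new.tgz

# Same pruning command as the backup script: remove files older than 14 days
find /tmp/prune-demo -type f -mtime +14 -exec rm {} \;

ls /tmp/prune-demo   # only mydb-new.tgz should remain
```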
$ sudo /root/.scripts/backupMongo.sh
Check your backup directory and you should see the backup we just created.
$ ls /var/backups/mongodb/
mydb-2013-01-02.tgz
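Restoring from one of these archives is essentially the reverse of the backup. A sketch using stand-in data (the archive built below mirrors the layout mongodump produces, so the steps can be exercised without a live server; the final mongorestore is shown as a comment since it needs a running MongoDB):

```shell
# Recreate the archive layout backupMongo.sh produces, with stub data
mkdir -p /tmp/mydb-2013-01-02/mydb
echo stub > /tmp/mydb-2013-01-02/mydb/users.bson
tar czf /tmp/mydb-2013-01-02.tgz -C /tmp mydb-2013-01-02
rm -rf /tmp/mydb-2013-01-02

# Restore step 1: unpack the archive
tar xzf /tmp/mydb-2013-01-02.tgz -C /tmp

# Restore step 2 (needs a running MongoDB):
# mongorestore --db mydb /tmp/mydb-2013-01-02/mydb
```

On a real restore you would of course extract the archive from /var/backups/mongodb (or from Cloud Files) rather than building one.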
Now we have our script, let's get it to run every night. Open up crontab as root:
$ sudo crontab -e
Paste in the below to the bottom of your crontab file, save and exit.
@daily /root/.scripts/backupMongo.sh > /dev/null
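If you prefer an explicit schedule, @daily is crontab shorthand for a midnight run; the five-field equivalent (or, say, a 3am run instead) looks like this:

```shell
# @daily is equivalent to:
0 0 * * * /root/.scripts/backupMongo.sh > /dev/null
# or, to back up at 3am instead:
0 3 * * * /root/.scripts/backupMongo.sh > /dev/null
```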
When the above is done, nightly database snapshots will be taken with 14 days' retention and stored under /var/backups/mongodb . Let's now synchronise these backups into the cloud.
Step 2, Synchronising to Rackspace cloud files.
Rackspace Cloud Files is managed through a RESTful API. Using FUSE (a framework for filesystems in userspace) and the open-source project Cloudfuse, we can mount Rackspace Cloud Files as a local disk. First we need to build and install Cloudfuse.
To build Cloudfuse we have some dependencies. Let's install them first:
$ sudo apt-get install build-essential libcurl4-openssl-dev libxml2-dev \
    libssl-dev libfuse-dev git
When the above is done, let's download the Cloudfuse source code:
$ cd ~
$ git clone https://github.com/redbo/cloudfuse.git
$ cd cloudfuse
When Cloudfuse is downloaded, run the below commands from within the cloudfuse directory:
$ ./configure
$ make
$ sudo make install
That should be it; hopefully you get no errors. You should also add yourself to the fuse user group. Run the below, replacing username with your username:
$ sudo usermod -a -G fuse username
We want to mount our cloud files at the path /mnt/rackspacefiles , so let's create it:
$ sudo mkdir /mnt/rackspacefiles
We are now ready to write the script which will mount the remote file system, sync our backups across and delete any remote backups over 14 days old. Let's create the script /root/.scripts/syncToRackspace.sh
$ sudo touch /root/.scripts/syncToRackspace.sh
$ sudo chmod +x /root/.scripts/syncToRackspace.sh
With the file created, let's open it and paste in the below:
$ sudo vi /root/.scripts/syncToRackspace.sh
#!/bin/bash
# Rackspace Cloud Files credentials (from the Rackspace control panel)
RACKSPACE_USERNAME=username
RACKSPACE_APIKEY=....
RACKSPACE_AUTHURL=https://lon.auth.api.rackspacecloud.com/v1.0

# Mount Cloud Files locally via cloudfuse
cloudfuse -o username=$RACKSPACE_USERNAME,api_key=$RACKSPACE_APIKEY,authurl=$RACKSPACE_AUTHURL /mnt/rackspacefiles

# Create the remote backup directory if it doesn't exist
if [ ! -d "/mnt/rackspacefiles/MongoBackups" ]; then
  mkdir /mnt/rackspacefiles/MongoBackups
fi

# Sync the local backups across; --checksum because the fuse file
# system does not support timestamps, --delete so the 14-day
# retention is mirrored on the remote side
rsync -r --ignore-times --checksum --delete /var/backups/mongodb /mnt/rackspacefiles/MongoBackups

# Unmount when done
fusermount -u /mnt/rackspacefiles
In the above script replace RACKSPACE_USERNAME and RACKSPACE_APIKEY with your username and API key, which you get from the Rackspace control panel. If you are using a US server rather than a UK one, replace RACKSPACE_AUTHURL with https://identity.api.rackspacecloud.com/v1.0 .
The above script mounts the remote file system, creates a MongoBackups directory on it if one doesn't already exist, and then rsyncs across our local backups directory. The fuse file system does not support timestamps, so the rsync command compares checksums instead of modification times. After the sync completes we unmount the remote file system.
If you have backups in your /var/backups/mongodb directory you should now be able to run the script; afterwards the files should be visible from the Rackspace control panel.
$ sudo /root/.scripts/syncToRackspace.sh
Finally, let's add a cron job to automate the syncing. For our original backup job we specified @daily, which runs at midnight every day. We will run the sync at 6am every day to ensure it runs after the backup job has completed. As before, open crontab and add the below to the bottom.
$ sudo crontab -e
0 6 * * * /root/.scripts/syncToRackspace.sh
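Putting the two jobs together, the bottom of root's crontab now contains both entries:

```shell
@daily /root/.scripts/backupMongo.sh > /dev/null
0 6 * * * /root/.scripts/syncToRackspace.sh
```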
That's it, we are done. Obviously this is a trivial example and there are more elegant solutions out there. If you wish to have a full system backup for bare-metal restore I can highly recommend Idera (formerly R1Soft) CDP, which I have used in production to efficiently back up 50+ remote servers.
source: http://agileand.me/content/backing-your-server-mongodb-cloud