Automated backup management.
Most of the software listed below can and should be installed via a package manager such as yum or aptitude.
Similarly, Python modules can and should be installed via easy_install or, preferably, pip.
- Python (version >= 2.6)
- Django (version >= 1.4)
  https://www.djangoproject.com/download/
    tar xzvf Django-1.4.5.tar.gz
    cd Django-1.4.5
    sudo python setup.py install
- Apache (version >= 2.2.14)
Any one of the following databases that Django supports (MySQL recommended):
- MySQL
- SQLite
- PostgreSQL
- Oracle
Apache Modules
- mod_wsgi
Python Modules
- MySQLdb (or the module corresponding to the database chosen)
- pexpect
Download and extract the files into a directory to which the apache user has read, write and execute permissions or, preferably, simply run git clone git://github.com/ltiao/autobackup.git in such a directory.
This directory is the root of the Django project and shall be referred to as such throughout this document.
For example:
cd /home
git clone git://github.com/ltiao/autobackup.git
This will create a directory named autobackup in /home (the root) with all the data from this repository.
Create a database. E.g. in MySQL:
CREATE DATABASE autobackup CHARACTER SET utf8 COLLATE utf8_general_ci;
Now we need to edit the database settings in <project root>/autobackup/settings/production_settings.py
E.g.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',  # Add 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'NAME': 'autobackup',                  # Or path to database file if using sqlite3.
        'USER': '',                            # Not used with sqlite3.
        'PASSWORD': '',                        # Not used with sqlite3.
        'HOST': '',                            # Set to empty string for localhost. Not used with sqlite3.
        'PORT': '',                            # Set to empty string for default. Not used with sqlite3.
    },
}
Here, USER and PASSWORD are all that is missing.
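One way to fill in the two missing values without committing credentials to the repository is to read them from the environment. This is only a sketch: the variable names AUTOBACKUP_DB_USER and AUTOBACKUP_DB_PASSWORD are hypothetical and are not defined anywhere in this project.

```python
import os

# AUTOBACKUP_DB_USER and AUTOBACKUP_DB_PASSWORD are hypothetical names
# chosen for this example; export them before starting Apache.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'autobackup',
        'USER': os.environ.get('AUTOBACKUP_DB_USER', ''),
        'PASSWORD': os.environ.get('AUTOBACKUP_DB_PASSWORD', ''),
        'HOST': '',  # empty string means localhost
        'PORT': '',  # empty string means the default port
    },
}
```

Entering the values directly in production_settings.py works just as well if the file is kept readable only by the apache user.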
Once configured, we can proceed to create the model tables by simply running
python manage.py syncdb
from the project root.
This will run all the SQL commands shown by python manage.py sqlall task.
Lastly, we just need to set up Apache and mod_wsgi by adding the following lines to httpd.conf:
Alias /static/ /<project-root>/static/
<Directory /<project-root>/static>
    Order deny,allow
    Allow from all
</Directory>
WSGIScriptAlias /autobackup /<project-root>/autobackup/wsgi.py
<Directory /<project-root>/autobackup>
    <Files wsgi.py>
        Order deny,allow
        Allow from all
    </Files>
</Directory>
Restart Apache. If the installation was successful, you should now be able to log in to the autobackup web interface at http://localhost/autobackup/.
The core autobackup script is located in /<project-root>/script/backup.py
Usage: backup.py [options] [GROUP]
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -g GROUP, --backup-group=GROUP
                        Execute backup tasks under groups associated with the
                        supplied group abbreviation name
  -c, --clean           Delete old backup files
  -m, --send-mail       Send details of completed backup tasks by email
The supplied backup-group argument will be used to query Groups against the abbreviation field (autobackup.task_group.abbreviation) and execute the enabled backup Tasks under the matching Group(s).
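The selection described above can be illustrated without a database. The sketch below uses plain dictionaries in place of the Group and Task models; the helper enabled_tasks and the sample data are assumptions made for the example, not part of the project.

```python
def enabled_tasks(groups, abbreviation):
    """Return the enabled tasks of every group whose abbreviation matches."""
    return [
        task
        for group in groups
        if group["abbreviation"] == abbreviation
        for task in group["tasks"]
        if task["enabled"]
    ]


# Hypothetical sample data standing in for rows of the Group and Task tables.
groups = [
    {"abbreviation": "net", "tasks": [
        {"name": "core-switch", "enabled": True},
        {"name": "old-switch", "enabled": False},
    ]},
    {"abbreviation": "db", "tasks": [{"name": "mysql-dump", "enabled": True}]},
]
selected = enabled_tasks(groups, "net")  # only the enabled task in group "net"
```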
Tasks are executed by first creating an instance of a concrete subclass in functions.py, as specified by the function_name of the Function related to the Task.
By way of example, a function with function_name=A would cause an object of class A to be instantiated if such a class exists in functions.py. If not, an AttributeError is raised.
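A lookup of this kind is typically done with getattr. The following is a minimal sketch, not the project's actual code: the stand-in functions module and the helper resolve_backup_class are assumptions made for the example.

```python
import types

# Stand-in for functions.py: a module object holding one backup class.
functions = types.ModuleType("functions")


class A(object):
    """Hypothetical backup class named after a Function's function_name."""


functions.A = A


def resolve_backup_class(function_name, module=functions):
    """Return the class called `function_name` defined in `module`.

    getattr raises AttributeError when the module has no such attribute,
    which matches the behaviour described above.
    """
    return getattr(module, function_name)


backup = resolve_backup_class("A")()  # instantiates A
```

Calling resolve_backup_class with a name that is not defined in the module raises AttributeError.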
The actual execution of the backup task is then done by calling the execute method of the subclass instance, which performs the backup task in a separate thread. In this way, many backup tasks can be performed simultaneously.
The rationale behind wrapping the actual functions that perform the backup in classes is the utilization of the Template Method Pattern.
Obviously, many devices will have the same backup function, but some will have subtle differences.
Fundamentally, all backup tasks are the same. We do some sort of setup step, which may involve authentication, directory creation, etc., and then we do the real work. During the process, we'd like to be able to report the progress and outcome of the backup and, if successful, store the path of the backed-up file.
These actions are all encapsulated in the AbstractBackup class. With this, it becomes extremely easy to extend the behaviour of existing functions and create new ones.
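To make the pattern concrete, here is a simplified sketch of what such a base class could look like. It is not the project's AbstractBackup: the run method and the EchoBackup subclass are invented for the example, and the real class also records an Event in the database, which is left out here.

```python
import threading


class AbstractBackup(object):
    """Sketch of a Template Method base class (not the project's real code)."""

    def __init__(self):
        self.messages = []                # progress and error notes
        self.successful = False           # set to True by a successful backup
        self.result_file_abs_path = None  # path of the backed-up file, if any

    # Steps that subclasses fill in.
    def setup(self):
        raise NotImplementedError

    def perform_backup(self):
        raise NotImplementedError

    def cleanup(self):
        pass

    def run(self):
        """The template method: the order of steps is fixed here."""
        try:
            self.setup()
            self.perform_backup()
        except Exception as exc:
            self.messages.append("Backup failed: {0}".format(exc))
        finally:
            self.cleanup()

    def execute(self):
        """Run the backup in its own thread and return the thread."""
        thread = threading.Thread(target=self.run)
        thread.start()
        return thread


class EchoBackup(AbstractBackup):
    """Hypothetical subclass used only to exercise the skeleton."""

    def setup(self):
        self.messages.append("setup done")

    def perform_backup(self):
        self.successful = True
        self.result_file_abs_path = "/tmp/echo.txt"


task = EchoBackup()
task.execute().join()  # wait for the worker thread to finish
```

The base class owns the order of the steps and the bookkeeping, so a subclass only has to supply setup and perform_backup.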
By way of example, the commands needed to back up the running configuration of a Cisco switch are very similar to those for a Cisco firewall; the only difference lies in the sequence of commands for pushing the configuration file to the TFTP server. So we can simply have the Cisco firewall backup class subclass the Cisco switch backup class (in this case, we really should have an abstract Cisco backup class that is subclassed by both the switch and firewall backups).
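The abstract Cisco class suggested above might be laid out roughly as follows. All class and method names are hypothetical, and the command strings are placeholders rather than real device syntax; only the structure is the point.

```python
class AbstractCiscoBackup(object):
    """Hypothetical shared base: everything except the TFTP push step."""

    def login_commands(self):
        return ["placeholder-login"]

    def push_commands(self, tftp_host):
        raise NotImplementedError  # the one step that differs per device

    def commands(self, tftp_host):
        return self.login_commands() + self.push_commands(tftp_host)


class CiscoSwitchBackup(AbstractCiscoBackup):
    def push_commands(self, tftp_host):
        # Placeholder, not a real switch command.
        return ["placeholder-switch-push {0}".format(tftp_host)]


class CiscoFirewallBackup(AbstractCiscoBackup):
    def push_commands(self, tftp_host):
        # Placeholder, not a real firewall command.
        return ["placeholder-firewall-push {0}".format(tftp_host)]


switch_cmds = CiscoSwitchBackup().commands("10.0.0.1")
firewall_cmds = CiscoFirewallBackup().commands("10.0.0.1")
```

Both device classes share the login step and differ only in the push step they override.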
The following is an example of a new backup task that backs up Yahoo! Finance news headlines to a text file.
# os, datetime, logger and BACKUP_BASE_DIRECTORY are assumed to be
# available at module level in functions.py.
class YahooBackup(AbstractBackup):

    def cleanup(self):
        pass  # It is usually not necessary to clean up afterwards.

    def setup(self):
        # Import the modules needed to fetch and decode the data.
        import urllib2, json
        # Prepare the data
        self.result = json.loads(urllib2.urlopen("http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Ffinance.yahoo.com%2Fq%3Fs%3Dyhoo%22%20and%20xpath%3D'%2F%2Fdiv%5B%40id%3D%22yfi_headlines%22%5D%2Fdiv%5B2%5D%2Ful%2Fli%2Fa'&diagnostics=true&format=json").read())
        # Create the destination directory
        self.path = os.path.join(BACKUP_BASE_DIRECTORY, 'news', datetime.datetime.now().strftime("%Y/%m/%d"))
        if not os.path.exists(self.path):
            logger.info('Daily backup directory [{0}] does not exist. Creating now...'.format(self.path))
            os.makedirs(self.path)

    def perform_backup(self):
        # Parse the data to get the headlines
        data = u''
        for headline in self.result['query']['results']['a']:
            data += u'*\tHeadline: {content}\n\tLink: {href}\n'.format(**headline)
        # Save to the destination directory
        dump_filename = os.path.join(self.path, 'yahoo_finance_news.txt')
        try:
            # The with statement closes the file even if the write fails.
            with open(dump_filename, 'w') as f:
                f.write(data.encode('utf8'))
        except Exception as e:
            # We can save any exceptions that occurred so they can later be read from the web interface
            self.messages.append('Could not write to {0}: {1}'.format(dump_filename, e))
            logger.exception('Could not write to {0}'.format(dump_filename))
        else:
            # If no exceptions occurred, we set the success flag to True and save the path to the resulting backup file.
            self.successful = True
            self.result_file_abs_path = dump_filename
Lines such as these
    self.successful = True
    self.result_file_abs_path = dump_filename
are necessary for saving the result of the backup to the database for the web interface. If they are omitted, a backup task that succeeded will still be reported as unsuccessful in the web interface.
An Event is created after a backup task has finished executing:
# Write the backup event to database. This must occur regardless of the backup's outcome.
Event.objects.create(name=os.getpid(), messages='\n'.join(self.messages), backup_successful=self.successful, backup_file_path=self.result_file_abs_path, task=self.task)
The Event model has a timestamp field called created, which is automatically given the current time, so all the web interface has to do is display all events for a given day.
The backup.py --clean command retrieves all events older than 6 months (or whatever value settings.BACKUP_TTL is set to), that is, all backups that were performed and completed more than 6 months ago. For each one, it deletes the backup file specified by backup_file_path, if it still exists, and then deletes the event itself.
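The cleaning logic can be sketched without Django as follows. This is an illustration only: events are plain dictionaries standing in for Event rows, BACKUP_TTL is assumed here to be a timedelta of 180 days (the real type and value of settings.BACKUP_TTL may differ), and the clean helper is invented for the example.

```python
import datetime
import os
import tempfile

# Assumed stand-in for settings.BACKUP_TTL: roughly six months.
BACKUP_TTL = datetime.timedelta(days=180)


def clean(events, now, ttl=BACKUP_TTL):
    """Delete the files of expired events and return the events to keep."""
    cutoff = now - ttl
    kept = []
    for event in events:
        if event["created"] >= cutoff:
            kept.append(event)
            continue
        path = event["backup_file_path"]
        if path and os.path.exists(path):
            os.remove(path)  # the file may already have been removed by hand
        # Leaving the event out of `kept` stands in for event.delete().
    return kept


# Usage: one expired event with a real file, one recent event without a file.
now = datetime.datetime(2013, 7, 1)
handle, old_path = tempfile.mkstemp()
os.close(handle)
events = [
    {"created": now - datetime.timedelta(days=200), "backup_file_path": old_path},
    {"created": now - datetime.timedelta(days=1), "backup_file_path": None},
]
remaining = clean(events, now)
```

Checking that the file exists before removing it keeps the command safe to run repeatedly.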
Users belonging to the user group autobackup will be subscribed to the mailing list and notified of the outcome of the backup tasks run so far that day. Once notified, the task's mailed flag will be set to True and the task will not be shown in future notifications on that day.
This way, mail notifications can be scheduled to run several times a day.
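The effect of the mailed flag across several runs in one day can be sketched like this. The records are plain dictionaries and the collect_unmailed helper is hypothetical; the project stores the flag in the database instead.

```python
import datetime


def collect_unmailed(records, day):
    """Return the records for `day` not yet mailed, and mark them as mailed."""
    pending = [r for r in records if r["day"] == day and not r["mailed"]]
    for record in pending:
        record["mailed"] = True
    return pending


today = datetime.date(2013, 7, 1)
records = [
    {"name": "switch-1", "day": today, "mailed": False},
    {"name": "fw-1", "day": today, "mailed": True},
]
first_run = collect_unmailed(records, today)   # picks up switch-1 only
second_run = collect_unmailed(records, today)  # nothing new to report
```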
To add a recipient to the email notifications, simply add a user to the group autobackup.
Settings for the SMTP server can be modified in //autobackup/settings/production_settings.py
See https://docs.djangoproject.com/en/dev/ref/settings/#email-backend
Copyright © 2013 Louis Tiao