Created January 24, 2013 09:49
Archiver for directories with years of accumulated files you want to clean up. I had a directory with 15 years of projects, many with lots of revisions and helpfully named 'important.old' folders, totaling about 20GB, and I wanted to start backing it up offsite. This script pulls out directories, tars and compresses them in a sensible way to a …
#!/usr/bin/env python
'''
This script will archive a directory and leave a text file
in its place with info about the files that have been removed. Allows you to
be fairly ruthless in your cleanup, safe in the knowledge you can get it back later
if required. For example:

    python archive.py data/projects/old_project -a data/archive

will result in two files (the old_project directory itself is deleted unless -k is given):

    data/archive/data_projects_old_project.201301240949.tar.bz
        BZIPed data

    data/projects/old_project.archive
        Archived to data/archive/data_projects_old_project.201301240949.tar.bz

        old_project/
        old_project/file1
        ...
'''

import sys, re, subprocess, datetime, os, argparse

# use argparse 'cause it makes it so easy
parser = argparse.ArgumentParser(description='Archive directories')
parser.add_argument('target', help='Directory or file to archive')
parser.add_argument('-a', '--archive', default='~/archive', help='Directory to store archives in')
parser.add_argument('-n', action='store_true', dest='noact', help='Just output path information')
parser.add_argument('-k', action='store_true', dest='keep', help='Keep target after archiving')
options = parser.parse_args()
# expand '~' ourselves: the shell never sees the default value
options.archive = os.path.expanduser(options.archive)

# check our paths
if not os.path.isdir(options.archive) or not os.access(options.archive, os.W_OK | os.X_OK):
    print("Unable to write to %s" % options.archive)
    sys.exit(1)

target = options.target.rstrip('/')
if not os.access(target, os.R_OK):
    print("Unable to read from %s" % target)
    sys.exit(1)

# tar options: create (c), verbose (v), preserve permissions (p), bzip2 compression (j)
opts = '-cvpj'

# construct an archive filename <archive>/<safe_target_name>.<timestamp>.tar.bz
safe = re.sub(r'[^a-z0-9\._-]', '', target.strip('/').lower().replace('/', '_').replace(' ', '-'))
archive = "%s/%s.%s.tar.bz" % (
    options.archive.rstrip('/'),
    safe,
    datetime.datetime.now().strftime('%Y%m%d%H%M')
)

# split the target so we don't store the path
(working, d) = os.path.split(target)
working = working or '.'

print("\nTarget: %s" % target)
print("Archive: %s\n" % archive)

cmd = ["tar", opts, '-f', archive, '-C', working, d]

if options.noact:
    print("Command: %s" % ' '.join(cmd))
    print("Taking no action")
    sys.exit()

# run the archive; tar's verbose file listing comes back on stdout
# (universal_newlines=True returns text rather than bytes on Python 3)
files = subprocess.check_output(cmd, universal_newlines=True)

print("Writing info to '%s.archive'" % target)
with open("%s.archive" % target, 'w') as f:
    f.write("Archived to %s\n\n" % archive)
    f.write(files)

# delete the target if required
if not options.keep:
    print("Deleting target")
    subprocess.call(["rm", "-rf", target])
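
The script only covers the archiving direction. Below is a minimal sketch of the reverse step, assuming the stub file still starts with the `Archived to <path>` line written above and that the recorded path is still reachable from where you run it. `read_stub` and `restore` are hypothetical helper names, not part of the gist, and the sketch targets Python 3 and uses the standard `tarfile` module rather than shelling out to `tar`.

```python
import os
import tarfile


def read_stub(stub_path):
    """Return the archive path recorded on the first line of a .archive stub."""
    prefix = "Archived to "
    with open(stub_path) as f:
        first = f.readline().rstrip("\n")
    if not first.startswith(prefix):
        raise ValueError("%s does not look like an archive stub" % stub_path)
    return first[len(prefix):]


def restore(stub_path):
    """Unpack the archive named in the stub into the stub's own directory.

    Assumption: the archive is trusted. Older Python versions do not
    sandbox member paths in extractall().
    """
    archive = read_stub(stub_path)
    dest = os.path.dirname(stub_path) or "."
    # "r:*" lets tarfile detect the compression (bzip2 here) by itself.
    with tarfile.open(archive, "r:*") as tar:
        tar.extractall(dest)
    return dest
```

Calling `restore("data/projects/old_project.archive")` would recreate `data/projects/old_project/`, because the archive stores paths relative to the parent directory (the `-C working` argument above). The stub file is left in place, so removing it afterwards is up to you.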