@efrecon
Last active May 6, 2020 14:38
date-based tar for backups

Tar-Wrapper for Backups

This is a simple wrapper around tar to manage a number of backups of sources in a destination directory. The destination directory is created automatically, and the wrapper creates files whose names contain the date and time of the backup. The resulting tar files are compressed with gzip for maximal compatibility across architectures. The utility is able to execute a command once the backup operation has finished. One new tar file is created every time tar.sh is run, so it should probably be scheduled on a regular basis from the outside, e.g. with cron.

When the utility is invoked under the name untar (or when the option --untar is given), it looks for the latest compressed tar file and unpacks it into the destination directory instead.

By default, packing and unpacking operations use checksums to protect against data transfer errors.

Options

tar.sh takes a number of single- or double-dashed options, followed by the command to execute once the tar operation has been performed. It is preferable to separate the list of options from the command with a double-dash (--), which marks the definitive end of the options.
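
As an illustration of how the double-dash separates options from the trailing command, here is a minimal sketch of such a parsing loop. It is not the full parser from tar.sh: only two options are handled, and the arguments set at the top (prefix, destination and the echo command) are hypothetical values chosen for the example.

```shell
#!/bin/sh
# Simulate a command line; these arguments are hypothetical examples.
set -- --prefix=site- -d /tmp/backups -- echo "backup done"

PFX=
DESTINATION=.
while [ $# -gt 0 ]; do
    case "$1" in
        -d | --destination)
            DESTINATION=$2; shift 2;;
        --destination=*)
            DESTINATION="${1#*=}"; shift;;
        -p | --prefix)
            PFX=$2; shift 2;;
        --prefix=*)
            PFX="${1#*=}"; shift;;
        --)
            # Definitive end of options: everything left is the command
            shift; break;;
        -*)
            echo "Unknown option: $1" >&2; exit 1;;
        *)
            break;;
    esac
done

echo "prefix=$PFX destination=$DESTINATION"
# Whatever remains in "$@" is the command to run after the tar operation
"$@"
```

Without the double-dash, a command whose first word starts with a dash would be taken for an unknown option, which is why the separator is recommended.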

The following options are recognised.

-d or --destination

The value of this option should be the path to a destination directory that will contain the date-based tar files that are created. The directory will be created if it does not exist.

-s or --source

The value of this option is the path to a source directory. The default is an empty value, meaning the current directory. Whenever this option is not empty, it is passed to tar as the directory to change to before performing the tar operation. In practice, this is a relay to the -C option, present in most tar implementations, including the one from busybox.
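
To make the effect of -C concrete, the following sketch archives a directory relative to a source directory. The temporary layout and file names are made up for the example, and mktemp -d is assumed to be available (it is not strictly POSIX, but is present in most environments, including busybox).

```shell
#!/bin/sh
set -e
# Hypothetical scratch layout, created only for this example
work=$(mktemp -d)
mkdir -p "$work/src/data" "$work/dst"
echo "hello" > "$work/src/data/file.txt"

# tar changes to $work/src first, so "data" is resolved relative to it and
# the archive stores the relative path data/file.txt
tar -C "$work/src" -czf "$work/dst/backup.tgz" data

# List the archive content to check the stored paths
tar -tzf "$work/dst/backup.tgz"
```

Because the archive only records paths relative to the source directory, it can later be unpacked anywhere without recreating the original absolute location.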

-p or --prefix

Prefix string to add at the beginning of the name of each tar file that is created. This defaults to an empty string. The remainder of the name is formed of a timestamp in the format %Y%m%d%H%M%S, followed by the extension for compressed archives, i.e. .tgz.
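
The naming scheme can be sketched as follows. The destination and prefix values are hypothetical; the timestamp format and the .tgz extension are the ones described above.

```shell
#!/bin/sh
# Hypothetical values, chosen for illustration only
DESTINATION=/tmp/backups/
PFX=site-

# Timestamp down to the second, e.g. 20200506143800
now=$(date +%Y%m%d%H%M%S)

# ${DESTINATION%/} strips one trailing slash, avoiding a double slash
tarfile="${DESTINATION%/}/${PFX}${now}.tgz"
echo "$tarfile"
```

Since the timestamp runs from the largest unit (year) to the smallest (second), file names sharing a prefix also sort chronologically when sorted alphabetically.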

-f or --files

The value of this option is a (list of) file(s) or directory/ies to add to the compressed tar file. It cannot be empty, otherwise the utility will end with an error.

-u or --untar

Act in reverse! When this option is given, the utility looks for the latest compressed tar file starting with the prefix from --prefix in the source directory specified by --source and extracts the content of this file, if it exists, to the destination directory. When --files is given, its value restricts extraction to the listed files/directories.
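
A minimal sketch of this reverse operation, leaving checksums aside, could look as follows. The directory layout, archive names and file content are invented for the example, and mktemp -d is assumed to be available. As in tar.sh, the latest file is selected by modification time through ls -t, not by parsing the timestamp in the name.

```shell
#!/bin/sh
set -e
# Hypothetical scratch layout with two generations of the same backup
work=$(mktemp -d)
mkdir -p "$work/backups" "$work/restore" "$work/data"
echo "v1" > "$work/data/file.txt"
tar -C "$work" -czf "$work/backups/site-20200101000000.tgz" data
echo "v2" > "$work/data/file.txt"
tar -C "$work" -czf "$work/backups/site-20200102000000.tgz" data

# Make the modification times unambiguous, since ls -t sorts on them
touch -t 202001010000 "$work/backups/site-20200101000000.tgz"
touch -t 202001020000 "$work/backups/site-20200102000000.tgz"

# Newest first, keep only the first line
latest=$(ls -1 -t "$work/backups"/site-*.tgz | head -n 1)

# Unpack into the restore directory
tar -C "$work/restore" -xzf "$latest"
cat "$work/restore/data/file.txt"   # prints v2
```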

--tar

This is to force the tar and compress "regular" behaviour, as opposed to --untar.

-c or --check or --sum or --checksum

The value of this option should point to an executable to run for the generation of checksums. By default, this is sha256sum. The name of this program (sans the word sum) is used as the extension of the file that contains the checksum, and this file is stored in the same directory as the tar file. When unpacking, the latest file that has a proper checksum is used, starting from the youngest one (but see --youngest). Files without checksums or with mismatched checksums are ignored.
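
The checksum handling can be sketched in a few lines. This is a simplified illustration rather than the code of tar.sh itself: the scratch directory and archive name are hypothetical, and it assumes that sha256sum and mktemp -d are installed.

```shell
#!/bin/sh
set -e
# Hypothetical scratch archive to compute a checksum for
work=$(mktemp -d)
echo "payload" > "$work/file.txt"
tar -C "$work" -czf "$work/backup.tgz" file.txt

SUM=sha256sum
# Derive the extension from the program name: sha256sum -> sha256
ext=$(basename "$SUM" | sed 's/sum//g')

# Store only the digest (first field) next to the archive
"$SUM" "$work/backup.tgz" | awk '{print $1}' > "$work/backup.$ext"

# Later, before unpacking: recompute the digest and compare
stored=$(cat "$work/backup.$ext")
current=$("$SUM" "$work/backup.tgz" | awk '{print $1}')
if [ "$stored" = "$current" ]; then
    echo "checksum ok"
else
    echo "checksum mismatch" >&2
fi
```

Any program printing the digest as its first field (md5sum, sha1sum, etc.) should fit the same pattern.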

--youngest

When this flag is turned on, the behaviour of unpacking operations with non-matching checksums changes slightly. Instead of going back in time until it finds a proper compressed tar file, i.e. one with a matching checksum, the program only tries the youngest file of all and gives up if its checksum does not match.

--single

When this flag is turned on, only a single version of a tar file is kept. In other words, once a new tar file has been created, if it is identical to the previous one that was created, it is automatically removed, together with its checksum.

Notes

This utility is tuned for maximal compatibility. It only requires a basic POSIX shell and uses tar options that are found in most implementations, including the one from busybox. As a result, tar.sh can easily be used from containers or in embedded systems.

BSD 3-Clause License
Copyright (c) 2019, Emmanuel Frecon <efrecon@gmail.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#!/bin/sh
#set -x
# All (good?) defaults
VERBOSE=0
DESTINATION=.
SRC=
PFX=
PATHS=
# default program to generate sums, can also be sha1sum, md5sum, etc.
SUM=sha256sum
YOUNGEST=0
# Dynamic vars
cmdname=$(basename "$(readlink -f "$0")")
exename=$(basename "$0")
appname=${exename%.*}
if [ "$appname" = "untar" ]; then
    UNTAR=1
else
    UNTAR=0
fi
# Print usage on stderr and exit
usage() {
    exitcode="$1"
    cat << USAGE >&2
Description:
  $cmdname will tar files/dirs from a source directory to a destination dir.
  The destination directory will contain date-based tar files. Upon success,
  execute the command formed after all options, (preferably) separated from
  the options via a double-dash.
Usage:
  $cmdname [-option arg --long-option(=)arg] [--] command
  where all dash-led options are as follows (long options can be followed by
  an equal sign):
    -v | --verbose      Be more verbose
    -d | --destination  Path to destination directory (defaults to pwd)
    -s | --source       Path to source directory
    -p | --prefix       Prefix to add to destination tar file names
    -f | --files        (list of) files to pack and compress in tar file
    -u | --untar        Pick latest from source directory and unpack to
                        destination. This is the behaviour when the program
                        is called untar (without ext.)
         --tar          Force tar behaviour (as opposed to untar)
    -c | --checksum     Program to generate checksums (defaults to sha256sum),
                        also available as --check and --sum
         --youngest     When unpacking, only consider the youngest file
    -h | --help         Print this help and exit
USAGE
    exit "$exitcode"
}
# Parse options
while [ $# -gt 0 ]; do
    case "$1" in
        -d | --destination)
            DESTINATION=$2; shift 2;;
        --destination=*)
            DESTINATION="${1#*=}"; shift 1;;
        -s | --source)
            SRC=$2; shift 2;;
        --source=*)
            SRC="${1#*=}"; shift 1;;
        -f | --files)
            PATHS=$2; shift 2;;
        --files=*)
            PATHS="${1#*=}"; shift 1;;
        -p | --prefix)
            PFX=$2; shift 2;;
        --prefix=*)
            PFX="${1#*=}"; shift 1;;
        -c | --check | --sum | --checksum)
            SUM=$2; shift 2;;
        --check=* | --sum=* | --checksum=*)
            SUM="${1#*=}"; shift 1;;
        -u | --untar)
            UNTAR=1; shift;;
        --tar)
            UNTAR=0; shift;;
        --youngest)
            YOUNGEST=1; shift;;
        -v | --verbose)
            VERBOSE=1; shift;;
        -h | --help)
            usage 0;;
        --)
            shift; break;;
        -*)
            echo "Unknown option: $1 !" >&2 ; usage 1;;
        *)
            break;;
    esac
done
if [ -z "${PATHS}" ] && [ "$UNTAR" = "0" ]; then
    echo "You need to specify sources to tar and compress with --files" >&2
    usage 1
fi
# Conditional logging
log() {
    if [ "$VERBOSE" = "1" ]; then
        echo "$1" >&2
    fi
}
# Unconditional logging of errors
errlog() {
    echo "$1" >&2
}
# Print the path to the latest valid file starting with the prefix in $1
latest() {
    # An empty source means the current directory
    srcdir=${SRC:-.}
    log "Looking for latest valid file starting with $1 in $srcdir"
    # Options are placed before operands so that a strict POSIX ls accepts them
    ls -1 -t "${srcdir%/}/${1}"*.tgz | while IFS= read -r latest; do
        if [ -n "$latest" ]; then
            if [ -n "$SUM" ]; then
                # Generate an extension from the sum generating program by
                # removing the word "sum" from its name and guess the name of
                # the checksum file from it.
                ext=$(basename "$SUM" | sed 's/sum//g')
                sumfile="${latest%.tgz}.${ext}"
                # When we are requested to use checksums, there needs to be a
                # checksum file and its content needs to match the current
                # checksum of the file for the file to be taken into account.
                if [ -f "$sumfile" ]; then
                    log "Checking $ext checksum of $latest using $sumfile"
                    chksum=$(cat "$sumfile")
                    nowsum=$("$SUM" "$latest" | awk '{print $1}')
                    if [ "$nowsum" = "$chksum" ]; then
                        log "Found latest valid file matching $1 at $latest"
                        echo "$latest"
                        break
                    else
                        if [ "$YOUNGEST" = "1" ]; then
                            errlog "Checksum mismatch for $latest, aborting!!"
                            break
                        else
                            errlog "Checksum mismatch for $latest, ignoring!!"
                        fi
                    fi
                else
                    log "No checksum for $latest, skipping"
                fi
            else
                log "Found latest file matching $1 at $latest"
                echo "$latest"
                break
            fi
        fi
    done
}
if [ -n "$DESTINATION" ]; then
    if [ ! -d "$DESTINATION" ]; then
        log "Creating destination directory at $DESTINATION"
        mkdir -p "$DESTINATION"
    fi
    if [ "$UNTAR" = "0" ]; then
        log "Archiving and compressing $PATHS in $SRC"
        now=$(date +%Y%m%d%H%M%S)
        tarfile="${DESTINATION%/}/${PFX}${now}.tgz"
        if [ "$VERBOSE" = "1" ]; then
            COPTS="-czvf"
        else
            COPTS="-czf"
        fi
        # $PATHS is left unquoted on purpose: it holds a whitespace-separated
        # list of files and directories.
        if [ -n "$SRC" ]; then
            # Quote the source directory so that paths with spaces survive
            tar -C "$SRC" "$COPTS" "$tarfile" $PATHS
        else
            tar "$COPTS" "$tarfile" $PATHS
        fi
        if [ -n "$SUM" ]; then
            # Generate an extension from the sum generating program by removing
            # the word "sum" from its name
            ext=$(basename "$SUM" | sed 's/sum//g')
            chksum=$("$SUM" "$tarfile" | awk '{print $1}')
            if [ -n "$chksum" ]; then
                sumfile="${DESTINATION%/}/${PFX}${now}.${ext}"
                log "Storing $ext checksum at $sumfile"
                echo "$chksum" > "$sumfile"
            else
                echo "Cannot generate checksum for $tarfile with $SUM!!" >&2
            fi
        fi
    else
        latest=$(latest "$PFX")
        if [ -n "$latest" ]; then
            log "Unpacking content of $latest to $DESTINATION"
            if [ "$VERBOSE" = "1" ]; then
                XOPTS="-xzvf"
            else
                XOPTS="-xzf"
            fi
            tar -C "$DESTINATION" "$XOPTS" "$latest" $PATHS
        fi
    fi
fi
# Execute the command that followed the options, if any
if [ $# -ne "0" ]; then
    log "Executing $*"
    exec "$@"
fi