#!/bin/bash
#
BACKUPDEST="$1"
DOMAIN="$2"
MAXBACKUPS="$3"
if [ -z "$BACKUPDEST" -o -z "$DOMAIN" ]; then
    echo "Usage: ./vm-backup <backup-folder> <domain> [max-backups]"
    exit 1
fi
if [ -z "$MAXBACKUPS" ]; then
    MAXBACKUPS=6
fi
echo "Beginning backup for $DOMAIN"
#
# Generate the backup path
#
BACKUPDATE=`date "+%Y-%m-%d.%H%M%S"`
BACKUPDOMAIN="$BACKUPDEST/$DOMAIN"
BACKUP="$BACKUPDOMAIN/$BACKUPDATE"
mkdir -p "$BACKUP"
#
# Get the list of targets (disks) and the image paths.
#
TARGETS=`virsh domblklist "$DOMAIN" --details | grep ^file | awk '{print $3}'`
IMAGES=`virsh domblklist "$DOMAIN" --details | grep ^file | awk '{print $4}'`
#
# Create the snapshot.
#
DISKSPEC=""
for t in $TARGETS; do
    DISKSPEC="$DISKSPEC --diskspec $t,snapshot=external"
done
virsh snapshot-create-as --domain "$DOMAIN" --name backup --no-metadata \
    --atomic --disk-only $DISKSPEC >/dev/null
if [ $? -ne 0 ]; then
    echo "Failed to create snapshot for $DOMAIN"
    exit 1
fi
#
# Copy disk images
#
for t in $IMAGES; do
    NAME=`basename "$t"`
    cp "$t" "$BACKUP"/"$NAME"
done
#
# Merge changes back.
#
BACKUPIMAGES=`virsh domblklist "$DOMAIN" --details | grep ^file | awk '{print $4}'`
for t in $TARGETS; do
    virsh blockcommit "$DOMAIN" "$t" --active --pivot >/dev/null
    if [ $? -ne 0 ]; then
        echo "Could not merge changes for disk $t of $DOMAIN. VM may be in invalid state."
        exit 1
    fi
done
#
# Cleanup left over backup images.
#
for t in $BACKUPIMAGES; do
    rm -f "$t"
done
#
# Dump the configuration information.
#
virsh dumpxml "$DOMAIN" >"$BACKUP/$DOMAIN.xml"
#
# Cleanup older backups.
#
LIST=`ls -r1 "$BACKUPDOMAIN" | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}\.[0-9]+$'`
i=1
for b in $LIST; do
    if [ $i -gt "$MAXBACKUPS" ]; then
        echo "Removing old backup $b"
        # ls prints bare folder names, so prefix the backup path before removing
        rm -rf "$BACKUPDOMAIN/$b"
    fi
    i=$((i+1))
done
echo "Finished backup"
echo ""
@Ryushin: I have also been using your script in production for a few months now and it works very well (CentOS 8 with libvirt).
I have one question: for performance reasons I have not used compression for the backups yet (to keep the backup as fast as possible), but since we keep one week for every VM, it takes a lot of space...
Can I enable compression now, afterwards? Or do I have to create a new borg repo?
Good to hear my script is working well for someone else. I spent more time on it than I thought I would. But I have it running in four other installations and it's working well.
I would google the Borg compression question. I would think you could enable compression though. From what I remember though, borg is single threaded. So the compression by borg might greatly add to your time to backup a job. I highly recommend using a file system that supports compression such as ZFS. That is what I do for all my backups. Let ZFS handle the compression. You can change the type and level of compression in ZFS. Plus you know your data won't be affected by bit rot. All my standalone servers run ZFS. The VMs will use ext4 for Linux VMs and NTFS for Windows and since they reside on top of ZFS, all is good.
Hello. You may be able to retrofit my compression method https://gist.github.com/cabal95/e36c06e716d3328b512b#gistcomment-3132817 back into the borg backup script. Although borg's differential backup strategy may not be effective when it doesn’t manage compression.
I used streamed compression specifically so that it would not impact backup duration. Most multi-CPU and/or multicore systems will be I/O bound during the backup, not CPU bound, and are therefore able to compress the backup without affecting backup duration. Writing compressed data means you are writing fewer bytes to disk, which may make the backup faster. Also, if you later transfer this data across a network, it is already compressed, saving possible further compression cycles or network bandwidth.
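To illustrate the idea of streamed compression, here is a minimal sketch. It is not necessarily the method from the linked comment: the function name stream_compress and the choice of gzip are assumptions made for the example.

```shell
# Hypothetical helper (not taken from the linked comment): copy a disk image
# into the backup folder while compressing it in a single streamed pass.
# Usage: stream_compress <source-image> <backup-dir>
stream_compress() {
    local src="$1" destdir="$2" name
    name=$(basename "$src")
    # gzip -c writes the compressed stream to stdout, so only compressed
    # bytes reach the backup disk and no uncompressed temporary copy is made.
    gzip -c "$src" > "$destdir/$name.gz"
}
```

In the original script this would stand in for the plain cp inside the "Copy disk images" loop. A faster compressor could be swapped in where CPU time turns out to be the bottleneck.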
Using a ZFS target, as suggested above, is a good solution assuming that your backups do not then need to be transferred across a network to another storage device.
@Ryushin
Do you think it's worth publishing your script as a separate gist?
I would be glad to fork it to enhance it a bit (create a more robust snapshot with support of QEMU guest agent if possible).
I have been thinking of doing that for a long time. There are links on the Internet that point here, though, even though this thread is getting very long and detailed. Almost all the scripts are derivatives of the original. Let me think on it for a couple of days. It would also be nice to hear from cabal95 whether he thinks these separate scripts should be in their own GitHub repo or gist. I also don't want to take away from his work.
By all means, my git knowledge and skills are modest. It has got a lot more mileage than I ever expected. I was lazy in not forking it. Certainly happy to see all the interest. :)
If it is forked, it would be good to add a comment here that points to the fork, as this thread comes up readily via Google search, as Ryushin has alluded to.
I am fine either way. To be honest, I've been amazed at how much work people have done on what I originally posted. I am no longer using the script myself, as I am now using a NAS that has all those VM and backup features built in. I'm okay with either choice. If you want to fork and create a new gist, I can edit the original gist and add a link to the new one so that people don't have to dig down into the comments to find where the latest info is.
I also know QEMU, virsh etc. have come a long way since I originally wrote the script so it might make sense to "start clean" in that sense too.
Yep, sorry for missing some credits, it is a really great example of collaboration.
I spent quite a lot of time trying to find a simple, ready-to-use solution, as it's only for my home NAS and my job and professional skills are in a somewhat different area. And it looks like there's not much on Google except for this gist and thread.
What would be great is to create a repo with a set of scripts based on this approach, that can be more extendable and configurable. For example, VM preparing (snapshot creation etc.) and actual backup (cp, rsync, borg, whatever...) can be implemented as separate modules, with the ability to add alternative ones. Having it as a repo would also mean that we can go forward with proper contribution, which is impossible with gist.
I know how open source works. Unfortunately, my bash skills are weak so most likely I won't be that person. Posting it just in case anyone will think it's a good idea to spend some time on it.
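As a rough illustration of the modular idea above, the copy step could be selected by name from a set of interchangeable backends. This is only a sketch: BACKUP_BACKEND, run_backup and the backup_with_* functions are hypothetical names, and cp --sparse=always assumes GNU coreutils.

```shell
# Hypothetical backends; each one takes <source-image> <dest-dir>.
backup_with_cp()    { cp --sparse=always "$1" "$2/"; }
backup_with_rsync() { rsync -a --inplace "$1" "$2/"; }

# Dispatch to the backend named in BACKUP_BACKEND (default: cp).
# Returns 2 when the requested backend is not known.
run_backup() {
    case "${BACKUP_BACKEND:-cp}" in
        cp)    backup_with_cp "$@" ;;
        rsync) backup_with_rsync "$@" ;;
        *)     echo "Unknown backup backend: $BACKUP_BACKEND" >&2
               return 2 ;;
    esac
}
```

Snapshot creation and blockcommit would stay in the calling script, so adding borg or another tool would only mean adding one more backup_with_* function and one more case branch.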
@Ryushin: noticed that with the options:
# How often to perform full check on borg repositories.
# Day of the week to perform the full check. Use full weekday name (date %A).
CHECK_DOW="Friday"
# Which week in the month to perform the full check. Put any number(s) 1-5.
CHECK_WEEKS="12345"
a check will never be performed... any idea why?
Ok... Weekday name must match language ;)
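One way to make the weekday comparison independent of the system language is to ask date for the name under the C locale. The sketch below is not Ryushin's code: the function name full_check_due, the optional date argument and the week-of-month formula are assumptions, and date -d requires GNU date.

```shell
# Hypothetical helper: succeed when a full check is due on the given date.
# CHECK_DOW and CHECK_WEEKS are the settings quoted above.
# Usage: full_check_due [date]   (default: today)
full_check_due() {
    local when="${1:-today}" dow dom week
    # LC_ALL=C forces English weekday names regardless of the system language.
    dow=$(LC_ALL=C date -d "$when" +%A)
    dom=$(date -d "$when" +%d)
    # Days 1-7 count as week 1, days 8-14 as week 2, and so on.
    # ${dom#0} strips a leading zero so "08" is not read as an octal number.
    week=$(( (${dom#0} - 1) / 7 + 1 ))
    [ "$dow" = "$CHECK_DOW" ] || return 1
    case "$CHECK_WEEKS" in
        *"$week"*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

With this approach CHECK_DOW="Friday" keeps working on a host whose locale prints weekday names in another language.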
On a related note,
SECS=$(printf "%.0f" $(/usr/bin/time -f %e sh -c "$CMD" 2>&1))
ended up creating a bit of manual work on a machine due to language differences:
printf: 891.16: invalid number
which will result in a division by 0 error.
There's probably a smart way to do this, but as a quick fix I'm going with
LC_NUMERIC="en_US.UTF-8" SECS=$(printf "%.0f" $(/usr/bin/time -f %e sh -c "$CMD" 2>&1))
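An alternative that sidesteps decimal parsing altogether is to measure whole seconds with date +%s. This is a sketch only: run_timed is a hypothetical name, and unlike the /usr/bin/time pipeline it leaves the command's own output untouched.

```shell
# Hypothetical replacement for the /usr/bin/time + printf pipeline.
# Runs the command string in $1 and leaves whole elapsed seconds in SECS.
run_timed() {
    local start end rc
    start=$(date +%s)
    sh -c "$1"
    rc=$?
    end=$(date +%s)
    SECS=$(( end - start ))
    # Very fast operations can finish in 0 s; clamp to 1 so that a later
    # division by SECS (for example a throughput figure) cannot fail.
    [ "$SECS" -gt 0 ] || SECS=1
    return "$rc"
}
```

Because only integers are involved, the result does not depend on LC_NUMERIC, and the exit status of the command is passed through to the caller.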
#!/bin/bash
#################### config variables########################
#Uncomment to backup domains myVMa and myVMb
DOMAINS="myVMa myVMb"
#Alternatively list all VMs and back them all up
#DOMAINS=$(virsh list --all | tail -n +3 | awk '{print $2}')
BACKUPROOT=/mnt/remote/backup_vms/$(hostname)
LOG="$BACKUPROOT/logs/qemu-backup.$(date +%Y-%m-%d).log"
SNAPPREFIX=snaptemp-
# Path to XML files for stopped VM...
PATH_LIBVIRT_QEMU="/etc/libvirt/qemu"
#pause before blockcommit
pause_domain_bc=false
#pause after blockcommit fail. if pause_domain_bc is set to true, this option is skipped
pause_domain_after_bc_fail=true
#on fail blockcommit try blockjob
enable_bj_attemts=true
blockjob_retrycount=1
blockjob_delay="30s"
pause_while_bj_attemtps=true
#if blockjob attempts fail, try pause domain? (if pause_while_bj_attemtps is set to true, script ignore this option)
pause_domain_after_bj_fail=true
EMR="email@example.com"
#****************************
#you can create a conf file with variables that override those above
#for example, if your script is named backup_vms.sh, create a file named backup_vms.conf in the same directory as the script
#if you want a different name or path, change the value of the variable confFile below (outside the config variables block)
#this is useful when you work with many hypervisors and make changes to the script
#****************************
#################### end of config variables#################
SHCMD="$(basename -- $0)"
SHCMD_path=$(dirname $(readlink -f $0))
confFile="$SHCMD_path/${SHCMD%.*}.conf"
if [ -f "$confFile" ]; then
source "$confFile"
echo "overwrite vars from config file"
else
echo "no config file, but if you set variables on begining it is not necessary"
fi
[ ! -d "$BACKUPROOT/logs" ] && mkdir -p "$BACKUPROOT/logs"
DATE="$(date +%Y-%m-%d.%H%M%S)"
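ERRORED=0
# host name shown in the notification e-mails (HOST is not set by bash itself)
HOST="${HOST:-$(hostname)}"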
COMMITSNAPTAB[0]=false
BREAK=false
# extract the date coding in FILEFULLNAME (note: filename format must be YYYY-MM-DD)
dtmatch () { sed -n -e 's/.*\(2[0-1][0-9][0-9]-[0-1][0-9]-[0-3][0-9]\).*/\1/p'; }
#check that no previous instance of this script is still running; this is mandatory so that blockcommit can recover after the script was interrupted for some reason
if pidof -x "$SHCMD" -o $$ >/dev/null;then
ERR="Another instance of this script is already running, please clear all sessions of this script before starting a new one.
If you are sure you want to stop the previous instance, try the command: kill -9 \$(pidof -x \"$SHCMD\")"
echo "$ERR" >> $LOG
echo "$ERR"
exit 1
fi
echo "$SHCMD: Starting backups on $(date +'%d-%m-%Y %H:%M:%S')" >> $LOG
for DOMAIN in $DOMAINS; do
BREAK=false
#check if domain exists
VMSRC=$(virsh list --all | grep [[:space:]]$DOMAIN[[:space:]] | awk '{print $3}')
VMSRC=${#VMSRC}
if [[ $VMSRC -eq 0 ]]; then
ERR="Domain $DOMAIN does not exist on this hypervisor"
echo "$ERR" >> $LOG
echo "$ERR"
continue
fi
echo "---- VM Backup start $DOMAIN ---- $(date +'%d-%m-%Y %H:%M:%S')" >> $LOG
VM_RUNNING=1
VMSTATE=$(LC_ALL=C virsh list --all | grep "[[:space:]]$DOMAIN[[:space:]]" | awk '{print $3}')
echo "$VMSTATE"
if [[ $VMSTATE == "running" || $VMSTATE == "paused" ]]; then
MSG="-> VM $DOMAIN running or paused."
echo "$MSG" >> $LOG
echo "$MSG"
else
MSG="-> VM $DOMAIN not running. No snapshot and blockcommit. Only copy."
echo "$MSG" >> $LOG
echo "$MSG"
VM_RUNNING=0
fi
BACKUPFOLDER=$BACKUPROOT/$DOMAIN
[ ! -d $BACKUPFOLDER ] && mkdir -p $BACKUPFOLDER
TARGETS=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $3}')
IMAGES=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $4}')
# check to make sure the VM is running on a standard image, not
# a snapshot that may be from a backup that previously failed
unset COMMITSNAPTAB[*]
i=0
COMMITSNAP=false
for IMAGE in $IMAGES; do
set -o noglob
if [[ $IMAGE == *${SNAPPREFIX}* ]]; then
set +o noglob
if [[ $VM_RUNNING -eq 0 ]]; then
ERR="$SHCMD: Error VM $DOMAIN is not running but is on a snapshot disk image: $IMAGE"
echo $ERR >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Disk Image: $IMAGE
Domain: $DOMAIN
Command: virsh domblklist $DOMAIN --details" | mail -s "$SHCMD snapshot Exception for $DOMAIN" $EMR
BREAK=true
ERRORED=$(($ERRORED+1))
break
else
COMMITSNAPTAB[$i]=true
COMMITSNAP=true
MSG="VM $DOMAIN is running but is on snapshot image $IMAGE, so marking it for blockcommit first"
echo "$MSG" >> $LOG
echo "$MSG"
fi
else
# if not running on a snapshot, still check for a leftover tmp file; if present, it must be deleted before starting a new snapshot
FILEPATH="${IMAGE%/*}"
FILEFULLNAME=`basename "$IMAGE"`
FILENAME="${FILEFULLNAME%.*}"
#FILEEXT="${FILEFULLNAME#*.}
TMPFILE="$FILEPATH/$FILENAME.$SNAPPREFIX$DOMAIN"
if [ -f "$TMPFILE" ]; then
CMD="rm -f $TMPFILE >> $LOG 2>&1"
MSG=" Deleting temporary image $TMPFILE after making snapshot"
echo "$MSG" >>$LOG
echo "$MSG"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
fi
fi
set +o noglob
i=$((i+1))
done
[ $BREAK == true ] && continue
if [[ $VM_RUNNING -eq 1 ]]; then
#if vm running on snap, first i need to merge
#i=0
if [[ $COMMITSNAP == true ]]; then
i=0
for TARGET in $TARGETS; do
if [[ ${COMMITSNAPTAB[$i]} == true ]]; then
if [[ $VMSTATE == "paused" ]]; then
pause_enabled=true
else
pause_enabled=false
fi
if [[ $pause_domain_bc == true && $pause_enabled == false ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
fi
MSG="$TARGET BLOCKCOMMIT previous snapshot "
echo "$MSG" >> $LOG
echo "$MSG"
CMD="virsh blockcommit $DOMAIN $TARGET --active --pivot >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="Could not merge changes for disk $TARGET of $DOMAIN with blockcommit. VM may be in an invalid state."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockcommit Exception for $DOMAIN" $EMR
if [[ $pause_domain_after_bc_fail == true && $pause_enabled == false ]]; then
MSG="Try to pause domain $DOMAIN and retry bc"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
MSG="$TARGET BLOCKCOMMIT previous snapshot "
echo "$MSG" >> $LOG
echo "$MSG"
CMD="virsh blockcommit $DOMAIN $TARGET --active --pivot >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="Could not merge changes for disk $TARGET of $DOMAIN (paused) with blockcommit. VM may be in an invalid state."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockcommit Exception for $DOMAIN" $EMR
else
#if bc succes need to disable bj, so set to false
enable_bj_attemts=false
fi
fi
if [ $enable_bj_attemts == true ]; then
if [[ $pause_while_bj_attemtps == true ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
else
if [[ $pause_enabled == true && $VMSTATE == "running" ]]; then
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
fi
for (( j=1; j<=$blockjob_retrycount; j++ )) do
CMD="virsh blockjob --domain $DOMAIN --pivot $TARGET"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="virsh blockjob attempt no $j --domain $DOMAIN --pivot $TARGET failed"
echo "$ERR" >> $LOG
echo "$ERR"
if [[ $j -eq $blockjob_retrycount ]]; then
ERR="all blockjob attempts --domain $DOMAIN --pivot $TARGET failed."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockjob Exception for $DOMAIN" $EMR
if [[ $pause_domain_after_bj_fail == true && $pause_enabled == false && $VMSTATE == "running" ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN"
echo "$ERR" >> $LOG
echo "$ERR"
else
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="can't blockjob --pivot $TARGET anyway, I'm giving up"
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockjob Exception for $DOMAIN" $EMR
echo "$ERR" >> $LOG
echo "$ERR"
BREAK=true
ERRORED=$(($ERRORED+1))
fi
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
else
BREAK=true
ERRORED=$(($ERRORED+1))
break
fi
fi
sleep $blockjob_delay
else
MSG="virsh blockjob attempt no $j --domain $DOMAIN --pivot $TARGET success"
echo "$MSG" >> $LOG
echo "$MSG"
break
fi
done
[ $BREAK == true ] && break
fi
fi
if [[ $pause_enabled == true && $VMSTATE == "running" ]]; then
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
fi
i=$((i+1))
done
[ $BREAK == true ] && continue
#delete tmp images after blockcommit
for IMAGE in $IMAGES; do
set -o noglob
if [[ $IMAGE == *${SNAPPREFIX}* ]]; then
set +o noglob
CMD="rm -f $IMAGE >> $LOG 2>&1"
MSG=" Deleting temporary image $IMAGE"
echo "$MSG" >> $LOG
echo "$MSG"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
fi
set +o noglob
done
#Reload images after blockcommit
IMAGES=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $4}')
fi
DISKSPEC=""
for TARGET in $TARGETS; do
DISKSPEC="$DISKSPEC --diskspec $TARGET,snapshot=external"
done
# transfer the VM to snapshot disk image(s)
CMD="virsh snapshot-create-as --domain $DOMAIN --name ${SNAPPREFIX}$DOMAIN --no-metadata --atomic --disk-only --quiesce $DISKSPEC >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG 2>&1
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
MSG="Error creating snapshot with --quiesce (it requires qemu-guest-agent), trying without"
echo "$MSG" >> $LOG
echo "$MSG"
CMD="virsh snapshot-create-as --domain $DOMAIN --name ${SNAPPREFIX}$DOMAIN --no-metadata --atomic --disk-only $DISKSPEC >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG 2>&1
echo "Command: $CMD"
eval "$CMD"
fi
if [ $? -ne 0 ]; then
ERR="Failed to create snapshot for $DOMAIN"
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD snapshot Exception for $DOMAIN" $EMR
ERRORED=$(($ERRORED+1))
continue
fi
#i=$i+1
fi
####################################
#Rsync #
####################################
MSG="Rsync"
echo "$MSG" >> $LOG
echo "$MSG"
for IMAGE in $IMAGES; do
FILEFULLNAME=`basename "$IMAGE"`
if test -f "$BACKUPFOLDER/$FILEFULLNAME"; then
MSG="Backup exists, merging only changes to image $BACKUPFOLDER/$FILEFULLNAME"
echo "$MSG" >> $LOG
echo "$MSG"
#CMD="rsync -apvhz --inplace --progress $IMAGE $BACKUPFOLDER/$FILEFULLNAME "
CMD="rsync -apvhz --inplace --progress $IMAGE $BACKUPFOLDER/$FILEFULLNAME "
else
MSG="Backup does not exist, creating a full sparse copy of image $IMAGE"
echo "$MSG" >> $LOG
echo "$MSG"
#CMD="rsync -apvhz --sparse --progress $IMAGE $BACKUPFOLDER/$FILEFULLNAME "
CMD="rsync -apvhz --sparse --progress $IMAGE $BACKUPFOLDER/$FILEFULLNAME "
fi
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
done
########################################
# After sync file merge snapshots back #
########################################
if [[ $VM_RUNNING -eq 1 ]]; then
# Update the VM's disk image(s) with any changes recorded in the snapshot
# while the copy process was running. In qemu lingo this is called a "pivot"
BACKUPIMAGES=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $4}')
for TARGET in $TARGETS; do
#if [[ ${COMMITSNAPTAB[$i]} == true ]]; then
#rsync can take a long time and the state could have changed, so check once again
#if the domain is shut off there will simply be an error, because blockcommit and blockjob can't be done on a shut-down domain.
VMSTATE=$(LC_ALL=C virsh list --all | grep "[[:space:]]$DOMAIN[[:space:]]" | awk '{print $3}')
if [[ $VMSTATE == "paused" ]]; then
pause_enabled=true
else
pause_enabled=false
fi
if [[ $pause_domain_bc == true && $pause_enabled == false ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
fi
MSG="$TARGET BLOCKCOMMIT previous snapshot "
echo "$MSG" >> $LOG
echo "$MSG"
CMD="virsh blockcommit $DOMAIN $TARGET --active --pivot >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="Could not merge changes for disk $TARGET of $DOMAIN with blockcommit. VM may be in an invalid state."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockcommit Exception for $DOMAIN" $EMR
if [[ $pause_domain_after_bc_fail == true && $pause_enabled == false ]]; then
MSG="Try to pause domain $DOMAIN and retry bc"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
MSG="$TARGET BLOCKCOMMIT previous snapshot "
echo "$MSG" >> $LOG
echo "$MSG"
CMD="virsh blockcommit $DOMAIN $TARGET --active --pivot >> $LOG 2>&1"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="Could not merge changes for disk $TARGET of $DOMAIN (paused) with blockcommit. VM may be in an invalid state."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockcommit Exception for $DOMAIN" $EMR
else
#if bc succes need to disable bj, so set to false
enable_bj_attemts=false
fi
fi
if [ $enable_bj_attemts == true ]; then
if [[ $pause_while_bj_attemtps == true ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=true
fi
else
if [[ $pause_enabled == true && $VMSTATE == "running" ]]; then
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
fi
for (( j=1; j<=$blockjob_retrycount; j++ )) do
CMD="virsh blockjob --domain $DOMAIN --pivot $TARGET"
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="virsh blockjob attempt no $j --domain $DOMAIN --pivot $TARGET failed"
echo "$ERR" >> $LOG
echo "$ERR"
if [[ $j -eq $blockjob_retrycount ]]; then
ERR="all blockjob attempt --domain $DOMAIN --pivot $TARGET failed."
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockjob Exception for $DOMAIN" $EMR
if [[ $pause_domain_after_bj_fail == true && $pause_enabled == false && $VMSTATE == "running" ]]; then
MSG="Try to pause domain $DOMAIN first"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh suspend $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't pause domain $DOMAIN"
echo "$ERR" >> $LOG
echo "$ERR"
else
echo "Command: $CMD" >> $LOG
echo "Command: $CMD"
eval "$CMD"
if [ $? -ne 0 ]; then
ERR="can't blockjob --pivot $TARGET anyway, I'm giving up"
echo "$ERR" >> $LOG
echo "$ERR"
echo "$ERR
Host: $HOST
Domain: $DOMAIN
Command: $CMD" | mail -s "$SHCMD blockjob Exception for $DOMAIN" $EMR
echo "$ERR" >> $LOG
echo "$ERR"
BREAK=true
ERRORED=$(($ERRORED+1))
fi
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
else
BREAK=true
ERRORED=$(($ERRORED+1))
break
fi
fi
sleep $blockjob_delay
else
MSG="virsh blockjob attempt no $j --domain $DOMAIN --pivot $TARGET success"
echo "$MSG" >> $LOG
echo "$MSG"
break
fi
done
[ $BREAK == true ] && break
fi
fi
if [[ $pause_enabled == true && $VMSTATE == "running" ]]; then
MSG="Resume domain $DOMAIN"
echo "$MSG" >> $LOG
echo "$MSG"
CMDst="virsh resume $DOMAIN"
echo "Command: $CMDst" >> $LOG
echo "Command: $CMDst"
eval "$CMDst"
if [ $? -ne 0 ]; then
ERR="can't resume domain $DOMAIN, continue anyway"
echo "$ERR" >> $LOG
echo "$ERR"
else
pause_enabled=false
fi
fi
#fi
i=$i+1
done
[ $BREAK == true ] && continue
# Now that the VM's disk image(s) have been successfully committed/pivoted
# back to the main disk image, remove the temporary snapshot image file(s)
for BACKUP in $BACKUPIMAGES; do
set -o noglob
if [[ $BACKUP == *${SNAPPREFIX}* ]]; then
set +o noglob
CMD="rm -f $BACKUP >> $LOG 2>&1"
MSG=" Deleting temporary image $BACKUP"
echo "$MSG" >> $LOG
echo "$MSG"
MSG="Command: $CMD"
echo "$MSG" >> $LOG
echo "$MSG"
eval "$CMD"
fi
set +o noglob
done
# capture the VM's definition in use at the time the backup was done
CMD="virsh dumpxml $DOMAIN > $BACKUPFOLDER/$DOMAIN.xml 2>> $LOG"
MSG="Command: $CMD"
echo "$MSG" >> $LOG
echo "$MSG"
eval "$CMD"
else
# copy the VM's definition
CMD="cp $PATH_LIBVIRT_QEMU/$DOMAIN.xml $BACKUPFOLDER/$DOMAIN.xml 2>> $LOG"
MSG="Command: $CMD"
echo "$MSG" >> $LOG
echo "$MSG"
eval "$CMD"
fi
MSG="---- Backup done $DOMAIN ---- $(date +'%d-%m-%Y %H:%M:%S') ----"
echo "$MSG" >> $LOG
echo "$MSG"
done
MSG="$SHCMD: Finished backups at $(date +'%d-%m-%Y %H:%M:%S')
===================="
echo "$MSG" >> $LOG
echo "$MSG"
exit $ERRORED
my 5 cents, live full and incremental backups of kvm guests
https://gist.github.com/juliyvchirkov/663eb6f5c18600a7414528beee6a7f3a
I guess I need to do some converting to qcow3 to use the tool (at least to full potential). Thank you @juliyvchirkov
Meanwhile, "virsh blockcommit $DOMAIN $TARGET --active --pivot" in the script will most likely cause havoc if one uses base images.
Shouldn't be too difficult to fix to shorten the chain to the original length:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain
Seed image operations also take 0s to complete, so I dropped the division by time line.
Yes, if you want to use the full feature set with incremental backups, you need to convert. Otherwise you can still
use backup mode copy to do at least a full backup. Here's the full documentation:
Hi folks, how can we use vm-backup scripts with Bacula ? I need dir, fd and job files. Thanks.
I use it with Bareos, which is a fork of Bacula. I just call it as a pre script:
RunScript {
RunsWhen = Before
RunsOnClient = yes
Fail Job On Error = yes
Command = "sudo /etc/bareos/scripts/virtual_machine_backups.bash"
}
I use the borg backup script that I created that is posted above.
Thanks @Ryushin, I looked at your script and it seems you back up your files with borg. But I want to learn how you arrange the FileSet definition. It seems you don't need to back up any files in the FileSet. Can we list any folders/directories in the FileSet, like below:
FileSet {
Name = "LocalhostFiles"
Include {
Options {
signature = MD5
compression=GZIP
}
File = /home
}
}
sorry I am new to Bacula/Bareos :(
Oh boy. Bareos/Bacula has a steep learning curve. It is Enterprise software with a learning curve to go with it. There is a lot of configuration needed. Explaining how to set up Bareos goes beyond the scope of this thread. I'll give you my tarball of /etc/bareos:
https://cloud.chrisdos.com/s/2egysaY5S9w2NxZ
Note, my bareos file layout is slightly customized from the new installs from Bareos. Bareos has changed their file structure slightly since when they first released and mine is customized a bit based on the original layout. Also, I back up to hard drives and use vchanger to simulate a virtual tape library.
Notes about this backup script. I exclude backing up the .qcow2 files in /var/lib/libvirt/images. I instead backup my borg repo. The vm-backup script is called by the Bareos Client config as a pre-job script so it is executed before Bareos starts backing up files.
Thanks @Ryushin, this is really helpful for me, especially for newbies to Bacula/Bareos :) I will try backup with Bareos asap :)
Don't thank me yet. You might find my setup a bit complicated. I have scripts in bareos/scripts that I call for backups. Along with sending backups and important files to other hosts. Take your time and go through it all. Use my stuff to create your setup. I also call sudo for a few scripts so you will need to create a bareos sudo user to call certain programs.
Hi everyone, did anyone adopt this to backup disks, which are also connected from pools?
Hi guys
Below is my current, time-tested custom console backup solution for regular nightly incremental backups of live kvm machines, without shutting the pool of virtual servers off, in a completely automated non-interactive mode
The flow is mostly built upon the great cli backup tool virtnbdbackup, developed by @abbbi in Python, with a number of extra related subroutines
Long story short
- Every night the routine produces 3 fresh backup copies, 2 local and 1 remote. Local backups are handled by a dedicated 2Tb nvme drive, mounted under the /backup point. Remote copies occupy a remote rented storage box.
- Each vm is supplied with 2 virtual drives: vda with OS / data in qcow2 format and raw vdb utilized as swap partition. The last one is excluded from backups for obvious reasons.
- On the 1st night of each month the fresh full backup is created. Then night by night it gets incremented from the live running machines during the month till the next one, when the cycle repeats.
- Primary local vm backups are stored under the /backup/storage/kvm folder; the full path for each month is prepared automatically by this script by the scheme /backup/storage/kvm/[guest-name]/MM.YYYY
- On the 1st night of a month the step before full backups are created is the only step when machines are shut down for a short time to maintain their virtual vda drives. Each drive gets defragmented and the wasted space is reclaimed to speed up the storage for the next month. For safety reasons drives are not defragmented in place; a new defragmented copy of each drive is created under the /tmp/ folder. Only if the image defragmentation tool and another image tool, utilized for post-check, both report zero status does the swap take place.
- Since the process is fully automated, to insure vms against an epic fail where, despite zero statuses of the tools, the defragmented drive appears broken and is rejected by Qemu, the working original of a drive in its state before the last defragmentation is not wiped immediately, but instead is stored under the /backup/storage/kvm/dev/block folder. A few days later, when the admin feels completely sure the new copies of drives after defragmentation are working fine, she can drop these obsolete extra backups by hand.
- The fresh defragmented copy of the drive from /tmp/ takes the place of the old one, which has been moved to /backup/storage/kvm/dev/block, the vm is started and the new full backup is created.
- Every night, as the next step after backups are created or incremented, the whole tree under /backup/storage/kvm gets packed into a squashfs entity and stored under the name /backup/storage/squashfs/kvm/kvm.sqsh to be the 2nd local copy in case the 1st one gets into trouble.
- Some time later, close to dawn, the lftp tool (which is not covered by this script since it runs its own separate plot aimed at the remote) will locate this fresh nightly build of kvm.sqsh and transfer it to the remote storage box with other fresh backups in squashfs containers, thus providing the 3rd remote backup copy in case some evil thing exterminates both local copies along with the live original
The described flow has been carried out night after night by the script below in a completely automated, non-interactive mode, with neither incidents nor complaints, for over 2 years now, so I suppose by this moment it has been tested live long enough to share with the community without worry.
Please feel free to use this stuff for your own purposes, to ask questions, to request extra info related to this routine (such as the automated non-interactive lftp plot mentioned above, which populates backups to the remote), to discuss alternatives, or just to say thanks or hi.
Glory to Ukraine! 🇺🇦
Juliy V. Chirkov, https://t.me/juliyvchirkov
#!/usr/bin/env bash
#
# Implements completely automated non-interactive backup routine for live kvm pool
#
# Designed, developed and tested in a long run under Ubuntu 20.04.4 and 22.04.3 at
# 2021-2023 by Juliy V. Chirkov <juliyvchirkov@gmail.com> https://juliyvchirkov.github.io/
# under the MIT licence https://juliyvchirkov.mit-license.org/2021-2023
#
[[ -z "${SHDEBUG}" ]] || set -vx
virtnbdbackupSocket="/var/tmp/virtnbdbackup.sock"
virtnbdbackupBackupType="stream"
virtnbdbackupBackupLevel="inc"
virtnbdbackupExcludeDrives="vdb"
virtnbdbackupWorkers="$(/usr/bin/nproc)"
vmBackupRoot="/backup/storage/kvm"
vmBackupImagesB4Defragment="${vmBackupRoot}/dev/block"
vmBackupSubstorageThisMonth="$(/usr/bin/date +%m.%Y)"
SQUASHFS="/backup/storage/squashfs/kvm"
defragment() {
    local kvmImg="$(/usr/bin/virsh domblklist "${1}" | /usr/bin/grep -Po "vda\s+\K\N+")"
    local tmpImg="/tmp/${kvmImg##*/}"
    local elapsed=0
    local restartLibvirtGuests
    local restartLibvirtd
    /usr/bin/virsh shutdown "${1}" --mode=agent &>/dev/null
    while /usr/bin/virsh list | /usr/bin/grep "${1}" -q; do
        /usr/bin/sleep 1
        elapsed=$(( elapsed + 1 ))
        [[ "${elapsed}" -eq 180 ]] && /usr/bin/virsh shutdown "${1}" &>/dev/null
    done
    if /usr/bin/virt-sparsify --compress "${kvmImg}" "${tmpImg}" &&
        /usr/bin/qemu-img check "${tmpImg}" &>/dev/null; then
        /usr/bin/virsh checkpoint-delete "${1}" --checkpointname virtnbdbackup.0 --children
        /usr/bin/systemctl -q is-active libvirt-guests && {
            restartLibvirtGuests=1
            /usr/bin/systemctl stop libvirt-guests
        }
        /usr/bin/systemctl -q is-active libvirtd && {
            restartLibvirtd=1
            /usr/bin/systemctl stop libvirtd
        }
        [[ -d "/var/lib/libvirt/qemu/checkpoint/${1}" ]] &&
            /usr/bin/rm -rf "/var/lib/libvirt/qemu/checkpoint/${1}"
        /usr/bin/chown root:kvm "${tmpImg}"
        /usr/bin/mv "${kvmImg}" "${vmBackupImagesB4Defragment}/${1}.${vmBackupSubstorageThisMonth}.vda"
        /usr/bin/mv "${tmpImg}" "${kvmImg}"
        [[ -z "${restartLibvirtd}" ]] || /usr/bin/systemctl start libvirtd
        [[ -z "${restartLibvirtGuests}" ]] || /usr/bin/systemctl start libvirt-guests
    fi
    /usr/bin/virsh start "${1}"
    /usr/bin/sleep 30s
}
[[ -d "${vmBackupRoot}" ]] || /usr/bin/mkdir -pm755 "${vmBackupRoot}"
[[ -d "${SQUASHFS}" ]] || /usr/bin/mkdir -m755 "${SQUASHFS}"
mapfile -t vmlist < <(/usr/bin/virsh list | /usr/bin/grep -oP "[-\d]+\s+\K[^\s]+" || :)
for vm in "${vmlist[@]}"; do
    [[ -d "${vmBackupRoot}/${vm}" ]] || /usr/bin/mkdir -m755 "${vmBackupRoot}/${vm}"
    [[ -d "${vmBackupRoot}/${vm}/${vmBackupSubstorageThisMonth}" ]] || {
        /usr/bin/mkdir -m755 "${vmBackupRoot}/${vm}/${vmBackupSubstorageThisMonth}"
        virtnbdbackupBackupLevel="full"
        defragment "${vm}"
    }
    /usr/bin/virtnbdbackup --domain "${vm}" \
        --socketfile "${virtnbdbackupSocket}" \
        --level "${virtnbdbackupBackupLevel}" \
        --type "${virtnbdbackupBackupType}" \
        --exclude "${virtnbdbackupExcludeDrives}" \
        --worker "${virtnbdbackupWorkers}" \
        --output "${vmBackupRoot}/${vm}/${vmBackupSubstorageThisMonth}" \
        --compress
done
[[ -f "${SQUASHFS}/kvm.sqsh" ]] && /usr/bin/rm "${SQUASHFS}/kvm.sqsh"
/usr/bin/mksquashfs "${vmBackupRoot}/" "${SQUASHFS}/kvm.sqsh" \
    -noI -noD -noF -noX -comp xz -Xbcj x86 -keep-as-directory
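A side note on the loop above: the backup level is a single variable that is switched to "full" the first time a guest without a folder for the current month is met, and as far as I can tell it is never switched back to "inc" for the guests that follow. That is harmless on the 1st night, when every guest needs a full backup anyway, but it may matter if a new guest appears mid-month. A minimal sketch of deciding the level per guest, assuming a hypothetical helper named backup_level_for that is not part of the original script:

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT part of the original script.
# Usage: backup_level_for <backup-root> <guest-name> [MM.YYYY]
# Prints "full" when the guest has no folder for the given month yet
# (and creates that folder), otherwise prints "inc".
backup_level_for() {
    local root="$1"
    local vm="$2"
    local month="${3:-$(date +%m.%Y)}"
    local dir="${root}/${vm}/${month}"

    if [[ -d "${dir}" ]]; then
        echo "inc"
    else
        mkdir -p -m755 "${dir}"
        echo "full"
    fi
}

# Possible use inside the loop (assumption, shown as a comment only):
#   level="$(backup_level_for "${vmBackupRoot}" "${vm}")"
#   [[ "${level}" == "full" ]] && defragment "${vm}"
```

With this shape each guest gets its own answer, so one late-created guest does not change the level used for the others.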
@juliyvchirkov thanks for sharing! Just a few notes from my side:
virtnbdbackupWorkers="$(/usr/bin/nproc)"
This doesn't make much sense: virtnbdbackup uses one worker for each disk attached to the virtual machine, to speed up the backup process (multiple disks are processed in different threads).
So it is always limited by the number of disks, not by the number of CPUs your host system has. If you pass the number of CPUs (which is most probably bigger than the number of disks the virtual machine has), it will always fall back to the number of disks, as the value is bigger ;). So if you don't want to limit the number of disks processed concurrently during backup, leave this value at its default, or set it to a lower number of workers if you want to throttle the backup process.
virtnbdbackupSocket="/var/tmp/virtnbdbackup.sock"
By default virtnbdbackup uses a per-process socket file, which allows multiple backups of different virtual machines to be started at the same time. If you ever want to enhance your script to back up multiple virtual machines at the same time (not sequentially as now), you don't want to set this option to a hardcoded socket.
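To illustrate the socket note above, here is a rough sketch of backing up several guests concurrently while leaving the socket file at its default. It is only a sketch under assumptions: the function name backup_all_parallel and the VIRTNBDBACKUP override variable are hypothetical (the override exists just so the command can be swapped out for a dry run), and only the --domain, --level and --output options already used in the script above are passed.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT part of the original script.
# Usage: backup_all_parallel <backup-root> <guest> [<guest> ...]
# Starts one incremental backup per guest in the background and waits for
# all of them. No --socketfile is passed, so each process keeps its default.
backup_all_parallel() {
    local root="$1"
    shift
    local month
    month="$(date +%m.%Y)"
    local vm pid rc=0
    local pids=()

    for vm in "$@"; do
        "${VIRTNBDBACKUP:-virtnbdbackup}" --domain "${vm}" \
            --level inc \
            --output "${root}/${vm}/${month}" &
        pids+=("$!")
    done

    # Collect exit statuses; report failure if any single backup failed.
    for pid in "${pids[@]}"; do
        wait "${pid}" || rc=1
    done
    return "${rc}"
}
```

A dry run such as `VIRTNBDBACKUP=echo backup_all_parallel /backup/storage/kvm vm1 vm2` only prints the command lines that would be started.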
Is there a reason you all do not use blockcopy? I can back up/clone complete VMs without shutting them down. A snapshot relies on the last backup.
So a mix of a weekly full backup and snapshots would be a smart thing, right? I have to read up on it, but when I make 6 snapshots and one full backup, I basically need just the full backup plus the latest snapshot to restore my machine.
The point of this script (at least the last time I checked it), is to pivot your live OS to a temporary snapshot file, while it is still booted and running. This frees up the ability to copy the OS virtual disk without risk of data corruption while it is running live. Once the backup of the VM disk is finished, the live snapshot is pivoted back onto the VM disk file, and resumes (or rather, maintains) live running of the OS.
Example of how I automate this script in the following loop and it backs up all the VMs on the system. Note: I did have to make sure permissions were correct on the disk image files so the script would actually back them up, but besides that, it works great! Love it! Thanks for this.
I also rsync the /etc directory to the backup drive so I have a copy of the existing configs. I suppose I should copy the /etc directory into a date-stamped directory with the images, I just haven't written that yet.
#!/bin/bash
date
date > /etc/date.log
BACKUPDIR=/media/usbbackup
IMGDIR=$BACKUPDIR/image-backups
# However many backups you want to keep...
NUMBACKUPS=5
rsync -arp --delete /etc $BACKUPDIR
# Selects all domains (on or off) to begin the loop (i.e. VMs with a number: running, or VMs with a dash [-]: stopped.)
for VM in $( virsh list --all | awk '( $1 ~ /^[0-9]|-+$/ ) { print $2 }' )
do
    /usr/local/bin/vm-backup.sh $IMGDIR $VM $NUMBACKUPS
    #echo "Next"
done
date
exit
Instead of using awk, you can use virsh list --all --name.
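For example, the awk-based selection could be replaced with something like the sketch below. The function name list_all_domains and the VIRSH override variable are hypothetical (the override is only there so the sketch can be tried without libvirt). The --name output is one domain name per line with an empty line at the end, which is filtered out here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every defined domain, running or not,
# one name per line.
list_all_domains() {
    # grep -v '^$' drops the empty line that follows the names.
    "${VIRSH:-virsh}" list --all --name | grep -v '^$'
}

# Possible use in the loop from the comment above (shown as a comment only):
#   while IFS= read -r VM; do
#       /usr/local/bin/vm-backup.sh "$IMGDIR" "$VM" "$NUMBACKUPS"
#   done < <(list_all_domains)
```

Reading the names line by line also avoids word splitting surprises compared to iterating over an unquoted command substitution.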
@Ryushin: I have also been using your script in production for a few months now and it works very well (CentOS 8 with libvirt).
I have one question: for performance reasons I have not used compression for the backups so far (to keep the backup as fast as possible), but since we keep one week of backups for every VM, they take a lot of space...
Can I enable compression now, after the fact? Or do I have to create a new borg repo?