
@drkarl
Last active October 17, 2023 10:43
Ask HN: Best Linux server backup system?

Linux Backup Solutions

I've been looking for the best Linux backup system, and also reading lots of HN comments.

Instead of listing the pros and cons of every backup system, I'll just list the deal-breakers that would disqualify each of them.

I'd also like you, the HN community, to add more deal-breakers for these or other backup systems if you know of any. And if you have data that disproves any of the deal-breakers listed here (benchmarks, or information about something that was true for older releases but has been fixed in newer ones), please share it so I can update this list accordingly.

  • It has a lot of management overhead, which is a problem if you don't have time for a full-time backup administrator.
  • It mainly consists of using tar for backups, which is pretty inflexible by modern standards.
  • The enterprise web interface is OK but it's had so many bugs it's not funny.
  • Backups are very slow.
  • Restores are slow and painful to manage.
  • I haven't found it to be great when trying to integrate with puppet / automation frameworks.
  • Too complex to configure
  • Stores the catalog separately from the backups, so the catalog needs to be backed up as well
  • Doesn't deduplicate
  • Relies on clock accuracy
  • Can't resume an interrupted backup
  • Retention policy
  • Doesn't do encryption
  • File level, not block level deduplication
  • Really slow for large backups (from a benchmark between obnam and attic)
  • To improve performance:
lru-size=1024
upload-queue-size=512

as per: http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-June/003086.html

  • Client side encryption turns off delta differencing
  • Can't purge old backups
  • Doesn't encrypt backups (well, there is encbup)
  • Slow restore performance on large backups? (Sorry Colin aka cperciva)
  • This was a really strong candidate until I read some comments on HN about slow performance when restoring large backups.
  • If this has changed in a recent version or someone has benchmarks to prove or disprove it, it would be really valuable.
  • Slow restore performance on large backups?
  • This was also a really strong candidate until I read some comments on HN about slow performance when restoring large backups.
  • If this has changed in a recent version or someone has benchmarks to prove or disprove it, it would be really valuable.
  • It doesn't do encrypted backups
  • No support for encryption
  • Just included here because I knew someone would mention it in the comments. It's Mac OS X only. This list is for Linux server backup systems.

Other contenders (of which I don't have references or information):

Also Tarsnap scores really high on encryption and deduplication but it has 3 important cons:

  • Not having control of the server where your backups are stored
  • Bandwidth costs make your total cost unpredictable
  • The so-called Colin-Percival-gets-hit-by-a-bus scenario

Attic gets really good comments on HN and good blog posts, and it doesn't have any particular deal-breaker (for now; if you know of one, please share it with us), so it's currently the most promising.

Roll your own

Some HN users have posted the simple scripts they use. The scripts usually combine rsync, hard links, and a simple rotation scheme.

mikhailian's script

FROM=/etc
TO=/var/backups
LINKTO=--link-dest=$TO/`/usr/bin/basename $FROM`.1
OPTS="-a --delete -delete-excluded"
NUMBER_OF_BACKUPS=8

find $TO -maxdepth 1 -type d -name "`basename $FROM`.[0-9]"| sort -rn| while read dir
do
        this=`expr match "$dir" '.*\([0-9]\)'`; 
        let next=($this+1)%$NUMBER_OF_BACKUPS;
        basedirname=${dir%.[0-9]}
        if [ $next -eq 0 ] ; then
                 rm -rf $dir
        else
                 mv $dir $basedirname.$next
        fi
done
rsync $OPTS $LINKTO $FROM/ $TO/`/usr/bin/basename $FROM.0`

zx2c4's script

zx2c4@thinkpad ~ $ cat Projects/remote-backup.sh 
    #!/bin/sh
    
    cd "$(readlink -f "$(dirname "$0")")"
    
    if [ $UID -ne 0 ]; then
            echo "You must be root."
            exit 1
    fi
    
    umount() {
            if ! /bin/umount "$1"; then
                    sleep 5
                    if ! /bin/umount "$1"; then
                            sleep 10
                            /bin/umount "$1"
                    fi
            fi
    }
    
    unwind() {
            echo "[-] ERROR: unwinding and quitting."
            sleep 3
            trace sync
            trace umount /mnt/mybackupserver-backup
            trace cryptsetup luksClose mybackupserver-backup || { sleep 5; trace cryptsetup luksClose mybackupserver-backup; }
            trace iscsiadm -m node -U all
            trace kill %1
            exit 1
    }
    
    trace() {
            echo "[+] $@"
            "$@"
    }
    
    RSYNC_OPTS="-i -rlptgoXDHxv --delete-excluded --delete --progress $RSYNC_OPTS"
    
    trap unwind INT TERM
    trace modprobe libiscsi
    trace modprobe scsi_transport_iscsi
    trace modprobe iscsi_tcp
    iscsid -f &
    sleep 1
    trace iscsiadm -m discovery -t st -p mybackupserver.somehost.somewere -P 1 -l
    sleep 5
    trace cryptsetup --key-file /etc/dmcrypt/backup-mybackupserver-key luksOpen /dev/disk/by-uuid/10a126a2-c991-49fc-89bf-8d621a73dd36 mybackupserver-backup || unwind
    trace fsck -a /dev/mapper/mybackupserver-backup || unwind
    trace mount -v /dev/mapper/mybackupserver-backup /mnt/mybackupserver-backup || unwind
    trace rsync $RSYNC_OPTS --exclude=/usr/portage/distfiles --exclude=/home/zx2c4/.cache --exclude=/var/tmp / /mnt/mybackupserver-backup/root || unwind
    trace rsync $RSYNC_OPTS /mnt/storage/Archives/ /mnt/mybackupserver-backup/archives || unwind
    trace sync
    trace umount /mnt/mybackupserver-backup
    trace cryptsetup luksClose mybackupserver-backup
    trace iscsiadm -m node -U all
    trace kill %1

pwenzel suggests

  rm -rf backup.3
  mv backup.2 backup.3
  mv backup.1 backup.2
  cp -al backup.0 backup.1
  rsync -a --delete source_directory/  backup.0/

and https://gist.github.com/ei-grad/7610406
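
For context on why this rotation stays cheap: cp -al creates hard links instead of copies, so unchanged files take no extra space, and rsync replaces changed files with new inodes, which leaves the older link's content intact. A quick way to see this (the file name is just a placeholder):

    # backup.1 shares inodes with backup.0 until rsync replaces a changed file
    cp -al backup.0 backup.1
    rsync -a --delete source_directory/ backup.0/
    # identical inode numbers = unchanged file; different = the file was rewritten
    ls -i backup.0/somefile backup.1/somefile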

Meta-backup solutions (which use several backup solutions)

@sknebel

sknebel commented Mar 16, 2015

HN discussion: https://news.ycombinator.com/item?id=9210505 (for people stumbling over the gist later/rediscovering it)

@Firefishy

Also: https://github.com/zbackup/zbackup (Dedup, optional encryption. Active Development)


ghost commented Mar 16, 2015

Seconding rsnapshot -- https://github.com/rsnapshot/rsnapshot

And for putting a copy of that in the cloud I use tarsnap.
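
For putting the rsnapshot tree into Tarsnap, a dated archive per run is enough; a minimal sketch (the snapshot path and key file location are assumptions):

    # create a dated Tarsnap archive of the newest local snapshot
    tarsnap --keyfile /root/tarsnap.key -c \
        -f "rsnapshot-$(date +%Y-%m-%d)" \
        /var/cache/rsnapshot/daily.0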

@biohazd

biohazd commented Mar 16, 2015

personally, I use rsnapshot: http://www.rsnapshot.org/
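
For anyone evaluating it: rsnapshot is driven by a small config file plus cron. A minimal sketch (paths, retention levels and counts are assumptions; config fields must be separated by tabs):

    # /etc/rsnapshot.conf fragment (TAB-separated; newer versions use "retain",
    # older ones "interval"):
    #   snapshot_root   /srv/rsnapshot/
    #   retain          daily    7
    #   retain          weekly   4
    #   backup          /etc/        localhost/
    # dry-run the config, then let cron call the retention levels
    rsnapshot -t daily
    rsnapshot daily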

@mikhailian

I recently attended a presentation of restic by its author. It is amazingly fast and has deduplication and encryption built in.

However, I prefer this little script above all else. It stores up to 9 versions, but you can push it to store 99 with a bit of tweaking ;-)

FROM=/etc
TO=/var/backups
LINKTO=--link-dest=$TO/`/usr/bin/basename $FROM`.1
OPTS="-a --delete -delete-excluded"
NUMBER_OF_BACKUPS=8

find $TO -maxdepth 1 -type d -name "`basename $FROM`.[0-9]"| sort -rn| while read dir
do
        this=`expr match "$dir" '.*\([0-9]\)'`; 
        let next=($this+1)%$NUMBER_OF_BACKUPS;
        basedirname=${dir%.[0-9]}
        if [ $next -eq 0 ] ; then
                 rm -rf $dir
        else
                 mv $dir $basedirname.$next
        fi
done
rsync $OPTS $LINKTO $FROM/ $TO/`/usr/bin/basename $FROM.0`

@toddsiegel

From https://github.com/restic/restic: "WARNING: At the moment, consider restic as alpha quality software, it is not yet finished. Do not use it for real data!"

@drkarl
Author

drkarl commented Mar 16, 2015

@mrcrilly I added markup formatting, and also updated with some new content.

@gyoza

gyoza commented Mar 16, 2015

Surprised this isn't on the list..

http://dar.linux.free.fr/

Pretty good, does baselines, diffs, incrementals. Pretty decent software. 10 years old.
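
To illustrate the baseline/diff workflow: dar chains differential archives off a full one. A minimal sketch (paths are hypothetical):

    # full, gzip-compressed archive of /home, written as home_full.1.dar
    dar -c /srv/backups/home_full -R /home -z
    # differential archive holding only what changed since the full one
    dar -c /srv/backups/home_diff1 -R /home -z -A /srv/backups/home_full
    # restore into a scratch directory
    dar -x /srv/backups/home_full -R /mnt/restore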

@antitux

antitux commented Mar 16, 2015

lvm snapshots possible?
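
(For reference, the usual pattern is to snapshot, mount the snapshot read-only, copy from it, then remove it; a minimal sketch with hypothetical volume-group and mount names:)

    # freeze a point-in-time view of the root LV
    lvcreate --snapshot --size 5G --name rootsnap /dev/vg0/root
    mount -o ro /dev/vg0/rootsnap /mnt/rootsnap
    rsync -a /mnt/rootsnap/ /srv/backups/root/
    umount /mnt/rootsnap
    # snapshots fill up as the origin changes, so remove them promptly
    lvremove -f /dev/vg0/rootsnap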

@shulegaa

As of early 2015, Mondo Rescue v3.2.x seems to have managed to cope with the uncontrolled morass of 'systemd' dependencies (and systemd's inscrutable binary config files and so on). I haven't tried it (yet). Before systemd, Mondo Rescue was a remarkably powerful and easy-to-use full-system-image backup (and disaster recovery from bootable device/CD/DVD) tool. It should be worth a try ;-)

http://www.mondorescue.org/

@benjamir

  • Bacula has file based deduplication
  • BackupPC up to v3 doesn't encrypt the pool of deduplicated files (FDE can still save you from offline access), but you can easily configure an archive host where you put encrypted tarballs (you can hook in with scripts at any[?] stage of the backup). E.g. use an on-site BackupPC setup that puts the pool on an encrypted partition/container/folder/etc. and use off-site storage as archive hosts.

Are you aware that defining your use case as "the best Linux backup" opens the floodgates for bikeshedding comments?

My advice: start with a (superficially) easy solution and try it out; read or at least skim and tag the mailing list of that software often.

@seidler2547

I use duplicity for backing up my server. Restoring has been unproblematic for me.

I like duplicity because

  • it has asymmetric encryption, meaning I don't need to leave the decryption keys on the server; it just encrypts using the public key (a sketch follows this list)
  • it has a good number of backends, including Google Drive (one of the cheapest storage options < 200GB) and Amazon S3
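
A minimal sketch of the asymmetric setup described in the first bullet (the key ID, paths and SFTP target are assumptions):

    # encrypt to a public key only; the private key never has to be on the server
    duplicity --encrypt-key ABCD1234 /etc sftp://backup@example.com//srv/backups/etc
    # restores are done on a machine that holds the private key
    duplicity restore sftp://backup@example.com//srv/backups/etc /tmp/etc-restore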

@eAndrius

Had the same need; ended up adapting encrb for my personal use case: https://github.com/eAndrius/encrb

@derekp7

derekp7 commented Mar 18, 2015

@drkarl, is lack of encryption the only deal-breaker for Snebu? Or is it the primary deal-breaker? (You may want to add "file level, not block level deduplication", as this impacts backing up VM images, although an add-on that specifically addresses VMs is in the works.) If encryption is the main issue, I've put together a plan to address it without compromising some of the other features (such as minimal client-side requirements) -- I should be able to code it up this weekend.

@gam-phon

rsnapshot is in our production servers.

@drkarl
Author

drkarl commented Mar 22, 2015

@Vincent14

I use BackInTime (in the Debian repo): it's a graphical tool that uses hard links (inode-based increments) for incremental backups.

@amarendra

Backupninja didn't support Attic the last time I checked. Also, I'm not sure whether Attic does block-based de-duplication or just file-level. The former would make it a killer -- that's one strong point Tarsnap has that is missing in many backup clients.

@redacom

redacom commented Sep 23, 2015

I think we will never find the "best" backup tool, as everyone has different needs. For example, I do not want to compress backups because they are very big and compression takes a lot of time and resources, while for others compression is a must.

My contributions are Fwbackups, an easy tool for backing up both locally and remotely (compressing if needed),

and Bera Backup, another open-source tool that backs up files/folders but also configuration (crontabs, users, system config...) to easily replicate a system.

Amanda (mentioned above) is probably the best tool, very powerful but also more complex than the others...

@sammcj

sammcj commented Jan 5, 2016

Has anyone found anything else recently to add to the list?

We're still pretty keen to replace Amanda with something else; we've had so many problems with it again recently. It'd be great if there were a simple web interface that provided functionality similar to ninjahelper's but also gave a breakdown of backup timelines and offered restore options.

@sadid

sadid commented Jan 10, 2016

I have some experience with backup tools and finally settled on these:
Zpaq, attic, Dar and Duplicity (Deja-Dup). (Obnam, Zbackup and bup were worth considering, but I dismissed them after a while.) The way bup handles deduplication is very interesting, but I didn't test it since it had no encryption at the time.

As far as I remember:
Best for simplicity and ease of use: Deja-Dup
Most feature-rich: attic
Fastest and most compression/deduplication-efficient: Zpaq

Currently I'm just using Zpaq, Deja-Dup (Duplicity) and one archive with attic. Attic is slow in comparison to Dar and Zpaq; here are my benchmarks (I finally found them):
attic on HugeRepo with 120+GB takes ~5.5Hr and 100 GB final deduplicated backup
attic on MediumRepo with near 26GB takes ~1Hr and 13GB final deduplicated backup
attic on TinyRepo with near 4GB takes 19min and 1.94GB final deduplicated backup
zpaq on MediumRepo with 26GB takes 1300Sec and 12.2GB
zpaq on TinyRepo with near 4GB takes 170Sec and 1.7GB
(I'm not sure about parameters of each command)

@romiras

romiras commented Jul 2, 2016

I use ZBackup regularly. It's been my favorite backup tool ever since I found it.

A typical scenario is one of the following:
zip -r0 - some_dir | zbackup backup /path/to/zbackup-repo/backups/filename.zip
or
tar -c some_dir | zbackup backup /path/to/zbackup-repo/backups/filename.tar
or
zcat somefile.tar.gz | zbackup backup /path/to/zbackup-repo/backups/filename.tar
or
cat some_raw_file | zbackup backup /path/to/zbackup-repo/backups/filename.tar
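
For completeness, the matching restore side; the repo path and filenames follow the examples above, and the repo has to be initialised once before the first backup:

    # one-time repository setup (use --password-file instead for encryption)
    zbackup init --non-encrypted /path/to/zbackup-repo
    # restore streams the stored data back to stdout
    zbackup restore /path/to/zbackup-repo/backups/filename.tar > filename.tar
    tar -xf filename.tar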

@lestercheung

My plan that works for small to mid size groups:

  • Use ZFS with stock Ubuntu (preferred) or BTRFS for filesystem snapshots.
  • Automate filesystem snapshot creation and removal (zfSnap); a sketch follows this list.
  • burp for off-machine backup!
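
A minimal sketch of the snapshot step with plain zfs commands rather than zfSnap (pool and dataset names are hypothetical):

    # create a dated, read-only snapshot of the dataset
    zfs snapshot tank/data@backup-$(date +%Y-%m-%d)
    # list snapshots and destroy old ones (the name below is just an example)
    zfs list -t snapshot
    zfs destroy tank/data@backup-2016-01-01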

@ThomasWaldmann

Replace attic (development stopped 1.5y ago) with borgbackup (== fork of attic + lots of fixes and enhancements).
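
For anyone switching over, the basic borgbackup loop looks roughly like this; a minimal sketch (repo path, prune policy and backed-up paths are assumptions):

    # one-time, encrypted repository
    borg init --encryption=repokey /srv/backups/borg-repo
    # dated archive; unchanged chunks are deduplicated against earlier archives
    borg create /srv/backups/borg-repo::etc-$(date +%Y-%m-%d) /etc
    # keep a bounded history
    borg prune --keep-daily 7 --keep-weekly 4 /srv/backups/borg-repo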

@xenithorb

Thanks @ThomasWaldmann, I made it to the end of the comments just to look for updates. Now that it's 2017, a lot of the solutions mentioned here are very out of date, and their lack of ongoing development shows. (Almost half are no longer being maintained, it seems.) Some, like attic, haven't had any new commits in years, unfortunately.

Any further updates are appreciated. (Comments about stable software not needing commits and such notwithstanding, the point is to be able to trust something that will continue to work.)

@DJsupermix

Well, it's all about the security issues we now face in the modern world. We tested Bacula Systems' solutions, such as https://www.baculasystems.com/enterprise-backup-solution-with-bacula-systems/easy-and-scalable-windows-backup, and were neither satisfied nor dissatisfied, as luckily there have been no failures so far. I'd be interested in feedback from anyone who has used their products; I want to be sure that everything will be OK when the day comes.

@markfox1

Thanks for the census @drkarl. For our Windows workstations, we settled on Duplicati, which uses a block-based deduplication algorithm to allow incremental backups to local, remote, or cloud object storage indefinitely. So the first backup is the big one, but all backups are incremental from there. It is open source and runs on the major unices. We are experimenting with it under Linux, and it does feel a bit weird running a C# program under Linux, but until someone writes an open source program with similar abilities to hashbackup, it seems to be the only game in town.

@tomwaldnz

tomwaldnz commented Jun 27, 2018

A problem I've found with Attic / Borg Backup (Borg is a fork that's more actively maintained) is that when you run it, the old backup file is either renamed or deleted and a new backup file is created. This means that if you're using metered storage or bandwidth (e.g. Amazon S3) you'll get charged more -- you effectively upload all your data every night. Someone else found the same thing, here, but they found this behavior changes for very large backups -- I don't know what the threshold is.

Duplicati has a lot of potential. It's just come out of years of alpha testing, and is now a beta. I found an issue a year ago that prevented restores of the data when a non-standard block size was being used - which I did because I was backing up large video files. This is a logged bug, which hasn't been fixed yet. The author thinks it's something that's easy to work around, but any bug that prevents restores is a big red flag for me.

@rsyncnet

rsyncnet commented Jul 2, 2019

I hope that it is interesting and valuable to point out that rsync.net supports, among other things:

  • rclone
  • restic
  • borg / attic

... which is to say that an rsync.net cloud storage account is stock/standard OpenSSH, with SFTP/SCP, so borg and restic "just work" using the SFTP transport. Further, we installed and maintain an rclone binary on our platform[1][2] which means we're not just an rclone SFTP target, but you can execute rclone and run it from an rsync.net account (to go fetch from gdrive or S3 or whatever).

And, of course, any other thing that runs over SSH/SFTP/SCP (from rsync to filezilla) will work.

The entire platform is ZFS so you have protection from corruption as well as optional snapshots.[3]

[1] rclone/rclone#3254

[2] https://twitter.com/njcw/status/1144534055817502721

[3] https://www.rsync.net/platform.html
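
(To illustrate the "just work" point: pointing restic at an SFTP account is one -r flag per command; a minimal sketch with a hypothetical account hostname and repo path:)

    # initialise a repository over SFTP, then back up and expire as usual
    restic -r sftp:backup@example.rsync.net:restic-repo init
    restic -r sftp:backup@example.rsync.net:restic-repo backup /etc
    restic -r sftp:backup@example.rsync.net:restic-repo forget --keep-daily 7 --prune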

@per2jensen

Just stumbled upon this thread.

I will second http://dar.linux.free.fr/ - it is mature, maintained and works extremely well in many use cases.

/Per
