@palozano
Last active June 13, 2023 15:50
Rsync usage

Notes on using rsync

First, install it using apt, yum, pacman, etc.

Local usage

Imagine we want to back up files from Directory1 to Directory2, and both are on the same hard drive (this works the same if the directories are on two different drives). There are several ways to do this, depending on what kind of backup you want (i.e., which options you give rsync). The most general form is this:

$ rsync -av --delete /Directory1/ /Directory2/

This command syncs the contents of Directory1 into Directory2, leaving no differences between the two. If rsync finds a file in Directory2 that does not exist in Directory1, it deletes it. If a file has been changed, created, or deleted in Directory1, the same change is made in Directory2. (Note the trailing slash on /Directory1/: it tells rsync to copy the directory's contents rather than the directory itself.)

The options passed to rsync are the following (more can be found at the end):

  1. -a = archive. Implies several options: -r (recurse into directories), -l (copy symlinks as symlinks), -p (preserve permissions), -t (preserve modification times), -g (preserve group), -o (preserve owner), and -D (preserve device and special files).
  2. -v = verbose. Shows what rsync is backing up.
  3. --delete = deletes any files in Directory2 that aren't in Directory1.

If you don't want a perfect mirror of the source folder, but instead want to keep what is already in the destination and add whatever rsync finds in the source, drop the --delete option.

External usage

There are several ways to use rsync for external backups. The easiest and most secure is tunneling rsync through SSH. Most servers, and many clients, already have SSH, and it can be used for your rsync backups. The process is described for machines on a local network, but it is exactly the same if one host is out on the internet somewhere (note that port 22, or whatever port SSH is configured on, would need to be forwarded on any network equipment on the server's side of things).

You need SSH installed on both the client and the server.

(Note: you only need rsync and ssh. Set up the directories on the server where you would like the files backed up, and make sure that SSH is locked down: give the user you plan to use a strong password, and consider moving SSH off its default port, 22.)

The command is the same, but a few additions are needed. For user "john" connecting to "192.168.235.137", using the same options as above (-av --delete), we run the following:

$ rsync -av --delete -e ssh /Directory1/ john@192.168.235.137:/Directory2/

If you have changed the port (maybe to '12345'), take that into account when calling ssh:

$ rsync -av --delete -e 'ssh -p 12345' /Directory1/ john@192.168.235.137:/Directory2/
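Instead of repeating -e 'ssh -p 12345' every time, you can put the connection details into ~/.ssh/config. A sketch, reusing the example user, address, and port from above (the alias name backup_server is my own invention):

```shell
# Append a host entry to ~/.ssh/config (values are the examples above).
cat >> ~/.ssh/config <<'EOF'
Host backup_server
    HostName 192.168.235.137
    User john
    Port 12345
EOF

# Now a plain "-e ssh" picks up the user and port automatically:
# rsync -av --delete -e ssh /Directory1/ backup_server:/Directory2/
```

This is also what makes the short alias_office-style targets used later in these notes work.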

Automating

Cron can be used to run commands automatically in the background. With it, we can run nightly backups, or run them on whatever schedule you like.

First, edit the cron table:

$ crontab -e

Its syntax is a bit messy (man crontab does not help much, so check www.crontab.guru or a similar page). For example:

  1. To run the rsync command every night at 10 PM:
0 22 * * * rsync -av --delete /Directory1/ /Directory2/

The first "0" specifies the minute of the hour, and "22" specifies 10 PM (hours use a 24-hour clock). Since we want this command to run daily, we leave the remaining fields (day of month, month, day of week) as asterisks and then paste the rsync command.

Or,

15 3 * * 1-5 rsync -av /Users/john/Documents /Volumes/External_HDD/office_backup/

to run a backup at 3:15 AM on weeknights (the 1-5 restricts it to Monday through Friday) onto an external hard drive connected to the computer.

Or,

15 3 * * 1-5 rsync -av -e ssh /Users/john/Work alias_office:/Volumes/External_HDD/office_backup/

to run a weeknight backup over SSH to an external hard drive connected to a machine at your office (assuming you have created an alias_office entry for it in your ~/.ssh/config file).

To check which cron jobs are scheduled, run crontab -l. To delete all of them, run crontab -r.

Extra

I usually work with a desktop and a laptop, the former with an external hard drive connected to it, where I do all my backups. The desktop has a cron job running in the background making nightly copies of certain folders. I also want to back up the laptop from home, so I created an alias in my .bashrc file to back up with rsync over SSH.

To create an alias in your .bash_profile or .bashrc file, the syntax is as follows:

alias backup_docs="rsync -av -e ssh ~/Documents/Folder1 alias_office:/Volumes/External_HDD/laptop_backup/"

You can even schedule your laptop to wake up at night, do the backup, and then turn off again.

Creating a monthly (compressed) backup

I want to keep a monthly backup in case a disaster occurs and I have to restore. I want this to not occupy much space, so I will create a compressed file. To do this, I will use the tar command.

The tar command needs a few options:

  • c – Creates a new .tar archive file.
  • v – Verbosely show the .tar file progress.
  • f – File name type of the archive file.

I want the highest compression, so I will use bzip2, which produces smaller archives than gzip but takes longer to compress and decompress. To create a bzip2-compressed tar file, add the -j option.
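A minimal self-contained sketch of those flags together (the paths under /tmp are throwaway examples, not the real backup locations):

```shell
# Build a tiny directory tree to archive.
mkdir -p /tmp/tar_demo/daily
echo "some data" > /tmp/tar_demo/daily/file.txt

# c = create, v = verbose, j = bzip2 compression, f = archive file name.
# -f must be the last bundled flag, immediately before the file name.
tar -cvjf /tmp/tar_demo/backup.tar.bz2 -C /tmp/tar_demo daily

# List the archive contents to confirm (t = list).
tar -tjf /tmp/tar_demo/backup.tar.bz2
```

The -C flag changes directory before archiving, so the stored paths start at daily/ instead of the full /tmp/... prefix.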

My idea is to make a backup every month without overwriting what I had before. This is especially important if you use the --delete flag with rsync. You can name the file monthly_$(date +%Y%m%d).tar.bz2, where the shell command substitution inserts the year, month, and day into the file name.
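The $(date +%Y%m%d) part is shell command substitution: the date command runs first and its output is spliced into the surrounding string. A quick check:

```shell
# date runs first; its stdout replaces the $(...) in the string,
# producing a name like monthly_20230613.tar.bz2.
name="monthly_$(date +%Y%m%d).tar.bz2"
echo "$name"
```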

You can create a cron job that runs on the first day of every month at 4:30 AM. Note that % is special inside crontab entries (it is treated as a newline), so it must be escaped as \%:

30 4 1 * * tar -cvjf /Volumes/External_HDD/office_backup/monthly/monthly_$(date +\%Y\%m\%d).tar.bz2 /Volumes/External_HDD/office_backup/daily

(Note: the tar command's argument order is the opposite of rsync's: you put the "destination" first (the archive file to be created) and then the folder to be compressed.)

Restoring a backup

Restoring is the same process in reverse: decompress the backup file and copy it to the destination. You don't need a cron job for this, since a restore is a one-off operation.

$ rsync -aAXv --delete --exclude="lost+found" where/you/keep/your/backup where/your/backup/should/be

The options and the excluded files here are specific to my hard-drive configuration; adjust them to yours.
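If you are restoring from one of the monthly .tar.bz2 archives, extract it first with tar's x flag before rsyncing the tree into place. A self-contained sketch (the archive is created on the spot so the example runs anywhere; the /tmp paths are my own placeholders):

```shell
# Create a sample archive first so the example is self-contained.
mkdir -p /tmp/restore_demo/src
echo "backup contents" > /tmp/restore_demo/src/doc.txt
tar -cjf /tmp/restore_demo/monthly.tar.bz2 -C /tmp/restore_demo src

# x = extract, j = bzip2, f = file; -C chooses where to unpack.
mkdir -p /tmp/restore_demo/out
tar -xvjf /tmp/restore_demo/monthly.tar.bz2 -C /tmp/restore_demo/out
```

From there, the extracted tree can be rsynced to its final destination as shown above.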


Rsync options:

    -v, --verbose               increase verbosity
    -q, --quiet                 suppress non-error messages
        --no-motd               suppress daemon-mode MOTD (see caveat)
    -c, --checksum              skip based on checksum, not mod-time & size
    -a, --archive               archive mode; equals -rlptgoD (no -H,-A,-X)
        --no-OPTION             turn off an implied OPTION (e.g. --no-D)
    -r, --recursive             recurse into directories
    -R, --relative              use relative path names
        --no-implied-dirs       don’t send implied dirs with --relative
    -b, --backup                make backups (see --suffix & --backup-dir)
        --backup-dir=DIR        make backups into hierarchy based in DIR
        --suffix=SUFFIX         backup suffix (default ~ w/o --backup-dir)
    -u, --update                skip files that are newer on the receiver
        --inplace               update destination files in-place
        --append                append data onto shorter files
        --append-verify         --append w/old data in file checksum
    -d, --dirs                  transfer directories without recursing
    -l, --links                 copy symlinks as symlinks
    -L, --copy-links            transform symlink into referent file/dir
        --copy-unsafe-links     only "unsafe" symlinks are transformed
        --safe-links            ignore symlinks that point outside the tree
    -k, --copy-dirlinks         transform symlink to dir into referent dir
    -K, --keep-dirlinks         treat symlinked dir on receiver as dir
    -H, --hard-links            preserve hard links
    -p, --perms                 preserve permissions
    -E, --executability         preserve executability
        --chmod=CHMOD           affect file and/or directory permissions
    -A, --acls                  preserve ACLs (implies -p)
    -X, --xattrs                preserve extended attributes
    -o, --owner                 preserve owner (super-user only)
    -g, --group                 preserve group
        --devices               preserve device files (super-user only)
        --specials              preserve special files
    -D                          same as --devices --specials
    -t, --times                 preserve modification times
    -O, --omit-dir-times        omit directories from --times
        --super                 receiver attempts super-user activities
        --fake-super            store/recover privileged attrs using xattrs
    -S, --sparse                handle sparse files efficiently
    -n, --dry-run               perform a trial run with no changes made
    -W, --whole-file            copy files whole (w/o delta-xfer algorithm)
    -x, --one-file-system       don’t cross filesystem boundaries
    -B, --block-size=SIZE       force a fixed checksum block-size
    -e, --rsh=COMMAND           specify the remote shell to use
        --rsync-path=PROGRAM    specify the rsync to run on remote machine
        --existing              skip creating new files on receiver
        --ignore-existing       skip updating files that exist on receiver
        --remove-source-files   sender removes synchronized files (non-dir)
        --del                   an alias for --delete-during
        --delete                delete extraneous files from dest dirs
        --delete-before         receiver deletes before transfer (default)
        --delete-during         receiver deletes during xfer, not before
        --delete-delay          find deletions during, delete after
        --delete-after          receiver deletes after transfer, not before
        --delete-excluded       also delete excluded files from dest dirs
        --ignore-errors         delete even if there are I/O errors
        --force                 force deletion of dirs even if not empty
        --max-delete=NUM        don’t delete more than NUM files
        --max-size=SIZE         don’t transfer any file larger than SIZE
        --min-size=SIZE         don’t transfer any file smaller than SIZE
        --partial               keep partially transferred files
        --partial-dir=DIR       put a partially transferred file into DIR
        --delay-updates         put all updated files into place at end
    -m, --prune-empty-dirs      prune empty directory chains from file-list
        --numeric-ids           don’t map uid/gid values by user/group name
        --timeout=SECONDS       set I/O timeout in seconds
        --contimeout=SECONDS    set daemon connection timeout in seconds
    -I, --ignore-times          don’t skip files that match size and time
        --size-only             skip files that match in size
        --modify-window=NUM     compare mod-times with reduced accuracy
    -T, --temp-dir=DIR          create temporary files in directory DIR
    -y, --fuzzy                 find similar file for basis if no dest file
        --compare-dest=DIR      also compare received files relative to DIR
        --copy-dest=DIR         ... and include copies of unchanged files
        --link-dest=DIR         hardlink to files in DIR when unchanged
    -z, --compress              compress file data during the transfer
        --compress-level=NUM    explicitly set compression level
        --skip-compress=LIST    skip compressing files with suffix in LIST
    -C, --cvs-exclude           auto-ignore files in the same way CVS does
    -f, --filter=RULE           add a file-filtering RULE
    -F                          same as --filter=’dir-merge /.rsync-filter’
                                repeated: --filter=’- .rsync-filter’
        --exclude=PATTERN       exclude files matching PATTERN
        --exclude-from=FILE     read exclude patterns from FILE
        --include=PATTERN       don’t exclude files matching PATTERN
        --include-from=FILE     read include patterns from FILE
        --files-from=FILE       read list of source-file names from FILE
    -0, --from0                 all *from/filter files are delimited by 0s
    -s, --protect-args          no space-splitting; wildcard chars only
        --address=ADDRESS       bind address for outgoing socket to daemon
        --port=PORT             specify double-colon alternate port number
        --sockopts=OPTIONS      specify custom TCP options
        --blocking-io           use blocking I/O for the remote shell
        --stats                 give some file-transfer stats
    -8, --8-bit-output          leave high-bit chars unescaped in output
    -h, --human-readable        output numbers in a human-readable format
        --progress              show progress during transfer
    -P                          same as --partial --progress
    -i, --itemize-changes       output a change-summary for all updates
        --out-format=FORMAT     output updates using the specified FORMAT
        --log-file=FILE         log what we’re doing to the specified FILE
        --log-file-format=FMT   log updates using the specified FMT
        --password-file=FILE    read daemon-access password from FILE
        --list-only             list the files instead of copying them
        --bwlimit=KBPS          limit I/O bandwidth; KBytes per second
        --write-batch=FILE      write a batched update to FILE
        --only-write-batch=FILE like --write-batch but w/o updating dest
        --read-batch=FILE       read a batched update from FILE
        --protocol=NUM          force an older protocol version to be used
        --iconv=CONVERT_SPEC    request charset conversion of filenames
        --checksum-seed=NUM     set block/file checksum seed (advanced)
    -4, --ipv4                  prefer IPv4
    -6, --ipv6                  prefer IPv6
        --version               print version number
    (-h) --help                  show this help (see below for -h comment)
Rsync can also be run as a daemon, in which case the following options are accepted:

        --daemon                run as an rsync daemon
        --address=ADDRESS       bind to the specified address
        --bwlimit=KBPS          limit I/O bandwidth; KBytes per second
        --config=FILE           specify alternate rsyncd.conf file
        --no-detach             do not detach from the parent
        --port=PORT             listen on alternate port number
        --log-file=FILE         override the "log file" setting
        --log-file-format=FMT   override the "log format" setting
        --sockopts=OPTIONS      specify custom TCP options
    -v, --verbose               increase verbosity
    -4, --ipv4                  prefer IPv4
    -6, --ipv6                  prefer IPv6
    -h, --help                  show this help (if used after --daemon)