@KartikTalwar
Last active September 24, 2024 20:03
Rsync over SSH - (40MB/s over 1GB NICs)

The fastest remote directory rsync over ssh archival I can muster (40MB/s over 1gb NICs)

This creates an archive that does the following:

rsync (Everyone seems to like -z, but it is much slower for me)

  • a: archive mode - recursive, preserves owner, preserves permissions, preserves modification times, preserves group, copies symlinks as symlinks, preserves device files.
  • H: preserves hard-links
  • A: preserves ACLs
  • X: preserves extended attributes
  • x: don't cross file-system boundaries
  • v: increase verbosity
  • --numeric-ids: don't map uid/gid values by user/group name
  • --delete: delete extraneous files from dest dirs (differential clean-up during sync)
  • --progress: show progress during transfer

ssh

  • T: turn off pseudo-tty to decrease cpu load on destination.
  • c arcfour: use the weakest but fastest SSH encryption. Must specify "Ciphers arcfour" in sshd_config on destination.
  • o Compression=no: Turn off SSH compression.
  • x: turn off X forwarding if it is on by default.

Original

rsync -aHAXxv --numeric-ids --delete --progress -e "ssh -T -c arcfour -o Compression=no -x" user@<source>:<source_dir> <dest_dir>

Flip

rsync -aHAXxv --numeric-ids --delete --progress -e "ssh -T -c arcfour -o Compression=no -x" [source_dir] [dest_host:/dest_dir]
@areeb111

areeb111 commented Aug 6, 2018

Thank you, I got 110MB/s same as your command but only with rsync -av and ssh attributes

@seandex

seandex commented Aug 22, 2018

is arcfour even possible in 2018?

cat /proc/cpuinfo | grep aes
if you have it then use AES ciphers.
moreover, if both ends have it, you can achieve up to 900Mbps.
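
The check above can be wrapped in a small helper. This is only a sketch: has_aes_flag and suggest_cipher are made-up names, /proc/cpuinfo exists only on Linux, and the fallback to chacha20-poly1305@openssh.com when no AES flag is found is an assumption, not something measured here.

```shell
# Hypothetical helper: succeed if the given cpuinfo file lists the "aes" CPU flag.
# Defaults to /proc/cpuinfo, which only exists on Linux.
has_aes_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches "aes" as a whole word, so longer flag names do not count.
    grep -qw aes "$cpuinfo" 2>/dev/null
}

# Hypothetical helper: suggest an AES cipher when the flag is present,
# otherwise fall back to ChaCha20 (an assumed default, benchmark to confirm).
suggest_cipher() {
    if has_aes_flag "$1"; then
        echo "aes128-gcm@openssh.com"
    else
        echo "chacha20-poly1305@openssh.com"
    fi
}
```

Run suggest_cipher on both ends; the server still has to accept whatever cipher the client asks for.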

@harish2704

Thanks for sharing this information.

I tried this and this is what I understood.

  • Network bandwidth utilization doesn't represent amount of data transferred.
  • When I was using this script, I got 90Mb/s network usage and 11MB/s data transfer.
  • When I removed -o compression=no from ssh , data transfer increased to 20MB/s
  • When I enabled z option in rsync datatransfer rate increased to 29MB/s ( and network utilization was 60Mb/s )

Lesson learned:

  • This script may be helpful when we have incompressible data and a Gigabit NIC.
  • But when the network becomes the bottleneck and/or your data is compressible, this method may not work well.
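
One rough way to decide whether -z is worth it is to compress a sample of the data first. The sketch below is an illustration only: sample_ratio and compress_flag are hypothetical helpers, the 1 MiB sample size and the 90% threshold are arbitrary assumptions, and the result only matters when the network rather than the CPU is the bottleneck.

```shell
# Hypothetical helper: print the gzip -1 size of the first 1 MiB of a file
# as a percentage of the uncompressed sample size.
sample_ratio() {
    file="$1"
    # $(( ... )) strips the padding some wc implementations add.
    orig=$(( $(head -c 1048576 "$file" | wc -c) ))
    comp=$(( $(head -c 1048576 "$file" | gzip -1 | wc -c) ))
    if [ "$orig" -eq 0 ]; then
        echo 100
        return 0
    fi
    echo $(( comp * 100 / orig ))
}

# Hypothetical helper: print "-z" when the sample shrinks below 90% of its
# size (an arbitrary threshold), otherwise print nothing.
compress_flag() {
    if [ "$(sample_ratio "$1")" -lt 90 ]; then
        echo "-z"
    fi
}
```

For example, rsync -av $(compress_flag some/large/file) src/ host:dst/ would add -z only when the sample looks compressible.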

@scottinan

Thanks! From an SSD mdadm array on ext4 to a ZFS raid z2 with 2TB spinning disks I'm maxing out at 1GB lol, gonna have to set up a bonded interface! 👍

@sjuxax

sjuxax commented Sep 22, 2018

is arcfour even possible in 2018?

cat /proc/cpuinfo | grep aes
if you have it then use AES ciphers.
moreover, if both-end have it, you can achieve it up to 900Mbps.

arcfour is not compiled in to my OpenSSH daemon. chacha20-poly1305@openssh.com is the modern replacement. I get ~100MB/s with that vs. capping out around 40MB/s with default ciphers (AES).

@88plug

88plug commented Oct 1, 2018

rsync -av --progress -e "ssh -T -c aes128-ctr -o Compression=no -x" source/ user@x.x.x.x:/destination

The key is leaving off the compression (-z) and using aes128-ctr on Ubuntu 18 (latest cipher support)
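
Since cipher support differs between OpenSSH builds, it can help to check what the local client offers before hard-coding one. A minimal sketch, assuming the list comes from ssh -Q cipher (one cipher per line); pick_cipher is a hypothetical helper, not part of OpenSSH.

```shell
# Hypothetical helper: print the first preferred cipher (given as arguments)
# that also appears in the supported list read from stdin, one per line.
pick_cipher() {
    supported=$(cat)
    for want in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string.
        if printf '%s\n' "$supported" | grep -qxF -- "$want"; then
            echo "$want"
            return 0
        fi
    done
    return 1
}
```

Usage would look like cipher=$(ssh -Q cipher | pick_cipher aes128-gcm@openssh.com aes128-ctr), then passing -c "$cipher" inside the -e "ssh ..." string. This only covers the client side; the server must allow the same cipher.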

@MBetters

MBetters commented Dec 4, 2018

I'm using this script. It gets rid of the extended attributes ("-X" flag) if you're running this on Windows. It also uses the aes128-gcm cipher in the likely case that your openssh installation doesn't include arcfour.

#!/bin/bash
# Fast rsync command

# Set the RSYNC_ARGS.
UNAME="$(uname -s)"
case "${UNAME}" in 
    Linux* | Darwin*)
        RSYNC_ARGS="-aHAXxv --numeric-ids --delete --progress -e"
    ;;
    # Windows filesystems do not support extended attributes (the "-X" option)
    CYGWIN* | MINGW*)
        RSYNC_ARGS="-aHAxv --numeric-ids --delete --progress -e"
    ;;
    *)
        echo "ERROR: Running on unknown system! Exiting!"
        # "return" only works in a function or sourced file; exit stops the script
        exit 1
    ;;
esac

# Set the SSH_ARGS
SSH_ARGS="-T -o -c aes128-gcm@openssh.com Compression=no -x"

# Get the rest of the args from the caller
USER=$1
SOURCE=$2
SOURCE_DIR=$3
DEST_DIR=$4

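# RSYNC_ARGS stays unquoted on purpose so it splits into separate options;
# its trailing -e takes the quoted ssh command as its argument.
rsync $RSYNC_ARGS "ssh $SSH_ARGS" "$USER@$SOURCE:$SOURCE_DIR" "$DEST_DIR"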

@rfjakob

rfjakob commented May 11, 2019

I benchmarked this a little bit with 1GB of random data, modern software (Fedora 30) on a CPU that does NOT have AES acceleration. Test data:

dd if=/dev/urandom of=/tmp/1g bs=1M count=1024

rsync default settings -> 22.71MB/s

rsync -P /tmp/1g 127.0.0.1:/tmp/1g.2

disable tty allocation -> 20.39MB/s

rsync -P /tmp/1g -e "ssh -T" 127.0.0.1:/tmp/1g.2

disable compression -> 193.42MB/s

rsync -P /tmp/1g -e "ssh -o Compression=no" 127.0.0.1:/tmp/1g.2

scp -> 109.9MB/s

scp -o Compression=no /tmp/1g 127.0.0.1:/tmp/1g.2

Conclusion

Disable compression, but don't bother with arcfour; even without AES acceleration you'll be faster than gigabit ethernet. Disabling tty allocation had no effect in my testing. scp is significantly slower than rsync.
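
The rates above come from rsync -P and scp themselves. For commands that don't print a rate, a rough figure can be computed from file size and wall-clock time. This is only a sketch: throughput_mb_s and time_transfer are hypothetical helpers, 1 MB is taken as 1048576 bytes, and date +%s only has one-second resolution, so it is meaningless for short transfers.

```shell
# Hypothetical helper: print MB/s (1 MB = 1048576 bytes here) for a transfer
# of BYTES ($1) that took SECONDS ($2).
throughput_mb_s() {
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", b / 1048576 / s }'
}

# Hypothetical helper: time a command that transfers FILE ($1) and print the
# resulting rate. The remaining arguments are the command to run.
time_transfer() {
    bytes=$(( $(wc -c < "$1") ))
    shift
    start=$(date +%s)
    "$@" || return 1
    elapsed=$(( $(date +%s) - start ))
    # Avoid dividing by zero when the run takes under a second.
    [ "$elapsed" -gt 0 ] || elapsed=1
    throughput_mb_s "$bytes" "$elapsed"
}
```

For example: time_transfer /tmp/1g scp -o Compression=no /tmp/1g 127.0.0.1:/tmp/1g.2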

Running ssh -v SERVER, you'll see which cipher is used:

debug1: kex: server->client cipher: aes256-gcm@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: aes256-gcm@openssh.com MAC: <implicit> compression: none

So we are using aes256-gcm, which fortunately is the same cipher that gocryptfs uses, so we can look at this benchmark table: https://github.com/rfjakob/gocryptfs/wiki/CPU-Benchmarks . You should basically get >100MB/s on any x86 CPU less than 10 years old.
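
To pull just the cipher name out of that debug output, something like the following should work. It is a sketch that assumes the "kex: server->client cipher:" line format shown above; negotiated_cipher is a hypothetical helper name.

```shell
# Hypothetical helper: read `ssh -v` debug output on stdin and print the
# negotiated server->client cipher (first match only).
negotiated_cipher() {
    sed -n 's/.*kex: server->client cipher: \([^ ]*\).*/\1/p' | head -n 1
}
```

ssh writes its debug lines to stderr, so usage would be along the lines of ssh -v SERVER true 2>&1 | negotiated_cipher.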

@deajan

deajan commented Aug 14, 2019

3fr/3g2/3gp/3gpp/7z/aac/ace/amr/apk/appx/appxbundle/arc/arj/arw/asf/avi/bz2/cab/cr2/crypt[5678]/dat/dcr/deb/dmg/drc/ear/erf/flac/flv/gif/gpg/gz/iiq/iso/jar/jp2/jpeg/jpg/k25/kdc/lz/lzma/lzo/m4[apv]/mef/mkv/mos/mov/mp[34]/mpeg/mp[gv]/msi/nef/oga/ogg/ogv/opus/orf/pef/png/qt/rar/rpm/rw2/rzip/s7z/sfx/sr2/srf/svgz/t[gb]z/tlz/txz/vob/wim/wma/wmv/xz/zip

Added some more extensions to that list:

3fr/3g2/3gp/3gpp/7z/aac/ace/amr/apk/appx/appxbundle/arc/arj/arw/asf/avi/bz/bz2/cab/cr2/crypt[5678]/dat/dcr/deb/dmg/drc/ear/erf/flac/flv/gif/gpg/gz/iiq/jar/jp2/jpeg/jpg/h26[45]/k25/kdc/kgb/lha/lz/lzma/lzo/lzx/m4[apv]/mef/mkv/mos/mov/mp[34]/mpeg/mp[gv]/msi/nef/oga/ogg/ogv/opus/orf/pak/pef/png/qt/rar/r[0-9][0-9]/rz/rpm/rw2/rzip/sfark/sfx/s7z/sr2/srf/svgz/t[gb]z/tlz/txz/vob/wim/wma/wmv/xz/zip

@vrossum

vrossum commented Aug 26, 2019

I benchmarked this a little bit with 1GB of random data

On random data compression can't help. However, if you have text files and a slow connection, it is a logical choice.

@maxnoe

maxnoe commented Aug 26, 2019

zst is also worth adding
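
These slash-separated lists are in the format rsync's --skip-compress option takes, which only has an effect together with -z. A small sketch for assembling such a list from individual extensions; skip_compress_arg is a hypothetical helper.

```shell
# Hypothetical helper: join the extensions given as arguments into the
# slash-separated list format and print the full --skip-compress option.
skip_compress_arg() {
    list=""
    for ext in "$@"; do
        # Add a "/" separator only when the list is not empty yet.
        list="${list:+$list/}$ext"
    done
    printf '%s\n' "--skip-compress=$list"
}
```

For example: rsync -az "$(skip_compress_arg gz zip 'mp[34]' zst)" src/ host:dst/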

@slmingol

slmingol commented Sep 5, 2019

SSH_ARGS="-T -o -c aes128-gcm@openssh.com Compression=no -x"

you have a typo here, that should be like this:

SSH_ARGS="-T -c aes128-gcm@openssh.com -o Compression=no -x"

@slmingol

slmingol commented Sep 5, 2019

I didn't see much movement between the variations here when testing across 2 EC2 instances on AWS in 2 different AZs. I ended up using this, which includes most of what I could glean as optimizations, but the speed fluctuated enough between runs with and without them that your mileage will definitely vary as well.

$ rsync -aHAxv --numeric-ids --delete -P -e "ssh -T -c aes128-gcm@openssh.com -o Compression=no -x" 500MB.file 10.16.87.187:~


ghost commented Sep 26, 2019

Regarding the suggested -T, there is no TTY allocation by default when SSH runs a command on the host vs running a shell. It's redundant. On some systems (Debian 9 and later, maybe some earlier), you can do -c none which is ideal if you care about speed and not privacy.

@danielmotaleite

danielmotaleite commented Oct 7, 2019

i have done a small up-to-date ssh test using several ciphers, between 2 AWS r5.12xlarge instances, and got this:

chacha20-poly1305@openssh.com        190.89MB/s  (default if no option is used)
aes128-ctr                           259.01MB/s
aes256-gcm@openssh.com               339.05MB/s
aes128-gcm@openssh.com               298.20MB/s
none                                 189.72MB/s

i didn't test arcfour, but in previous tests it was faster... but as it requires changing the sshd server to support that cipher, i'm trying to avoid it.
interestingly, aes256-gcm is faster than aes128-gcm, probably because of optimization and hardware support. the cipher none, while it does not return an error, seems to fall back to the default, so anyone saying that -c none will disable encryption probably does not know that it's really using the default cipher! :)

No other ciphers were tested, as current ssh only enables those ciphers by default.
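
Once you have a table like the one above from your own machines, picking the winner can be scripted. A minimal sketch, assuming input lines of the form "cipher rate" with the rate written like 339.05MB/s; fastest_cipher is a hypothetical helper.

```shell
# Hypothetical helper: read "cipher rate" lines on stdin and print the
# cipher with the highest rate.
fastest_cipher() {
    awk '{
        rate = $2 + 0    # awk keeps the numeric prefix of e.g. "339.05MB/s"
        if (rate > best) { best = rate; name = $1 }
    } END { if (name != "") print name }'
}
```

Fed the four real ciphers from the table above, this prints aes256-gcm@openssh.com.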

@sasha2002

To all.

I'd like to know how to place '--exclude /backup/somedir' in the string because it doesn't work. Thanks in advance.

To exclude a directory ("<source_dir>/bigDir"), pass its name relative to the source directory rather than an absolute path, as in this example:
rsync -aHAXxv --numeric-ids --delete --progress --exclude 'bigDir' -e "ssh -T -c aes256-gcm@openssh.com -o Compression=no -x" user@:<source_dir> <dest_dir>

@davidbitton

for a Mac to Linux transfer it's useful to use other options. Arcfour is not available for most new machines anymore and UTF-8 on OS X is different than UTF-8 on Linux (important if you have Umlauts like Germans, Samba/NFS will fail otherwise). My command if both (Mac & Linux) machines support AES on their processors and you want to transfer from Mac to Linux:

rsync -rltv --progress --human-readable --delete --iconv=utf-8-mac,utf-8 -e 'ssh -T -c aes128-gcm@openssh.com -o Compression=no -x' <local_mac_source> <remote_linux_dest>

reverse the iconv option if you want to transfer from Linux to Mac.

how would you do mac to mac?

@L1so

L1so commented Jun 3, 2020

Not recommended, I almost trashed my entire movies collection by doing this, good thing I canceled it.

@brianlamb

This was such a great post to find!
I was set to leave my transfer going at 5-10MB/s but couldn't go to sleep with 1.2TB going for 30hours!
(This was also from Synology NAS to MacOS)

As others mentioned early on in this post using specific SSH options can affect the transfer rate dramatically: -e "ssh -T -c aes128-ctr -o Compression=no -x". Primarily the Compression factor. I couldn't see notable differences and didn't test more than comparing to "-c aes256-gcm@openssh.com" but got variably up to 50-90MB/sec.

DO use --dry-run and --itemize-changes which is a great record of what is actually going to happen.
Always be careful of SOURCE and DESTINATION.
Pause and think, before setting things in motion!
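
The dry-run advice can be turned into a habit with a small wrapper. This is just a sketch: preview_then_sync is a hypothetical function, and the RSYNC variable is an assumed override hook (handy for testing) that defaults to the real rsync binary.

```shell
# Hypothetical wrapper: preview an rsync run, then ask before doing it for real.
# RSYNC is an assumed override hook; it defaults to the rsync binary.
preview_then_sync() {
    rsync_bin="${RSYNC:-rsync}"
    # First pass changes nothing; it only lists what would happen.
    "$rsync_bin" --dry-run --itemize-changes "$@" || return 1
    printf 'Proceed with the real transfer? [y/N] '
    read -r answer || answer=""
    case "$answer" in
        y|Y) "$rsync_bin" "$@" ;;
        *)   echo "Aborted."; return 1 ;;
    esac
}
```

Call it with the same arguments you would give rsync, for example preview_then_sync -aHAXxv --numeric-ids --delete src/ host:dst/, and read the itemized list before answering y.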

If you want a little help managing a collection of commands you run and an environment conducive to setting up rsync command lines you could try (on a Mac) RsyncOSX as a GUI front end (although I still prefer to run the actual command in a standalone terminal.)

@pricesgoingup

@L1so > Not recommended, I almost trashed my entire movies collection by doing this, good thing I canceled it.

Don't paste everything you see on the internet without looking the flags up first /shrugs

@nerrons

nerrons commented Mar 29, 2021

To decide which cipher is the best, I recommend using this script to benchmark for yourself: https://gist.github.com/joeharr4/c7599c52f9fad9e53f62e9c8ae690e6b

@anacondaq

dunno, dunno
image

git, tons of settings tried. 100gb+ repos. Millions of files, etc of content. Just for bench purposes (not working dirs). But results pretty sad on internal network on Vultr.

@j4ys0n

j4ys0n commented May 19, 2022

strange - i'm only getting 27-30MB/s with rsync -aHAxv --numeric-ids --progress -e "ssh -T -c aes128-gcm@openssh.com -o Compression=no -x"
transferring a 1.6TB file between two Epyc servers with plenty of resources and 10GB networking. i get 500MB/s transferring video files from my desktop to my storage server. not sure what's up here.

update: it increased to 52 MB/s, which is... definitely not as fast as I would like, but it's fine.

@sharkymcdongles

Not recommended, I almost trashed my entire movies collection by doing this, good thing I canceled it.

🤡

@ip-rw

ip-rw commented Sep 11, 2022

I'm not sure if people are still interested in this, but if you don't care about encryption then tar + netcat is by far the quickest way to transfer directories:

destination:
nc -l -p 7777 | tar -xpf -

source:
tar -cf - sourceDir/ | nc [dest ip] 7777

throw in 'pv' to see xfer speed.
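
To try the tar half of this on one machine before involving netcat, the same stream can go through a plain pipe. A sketch only: tar_pipe_copy is a hypothetical helper, and the local pipe stands in for the two nc commands above.

```shell
# Hypothetical helper: copy a directory tree by streaming a tar archive
# through a pipe. Across hosts, the pipe would be the nc pair shown above.
tar_pipe_copy() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    # -C changes directory first so the archive holds relative paths;
    # -p on extraction preserves permissions, as in the original command.
    tar -C "$src_dir" -cf - . | tar -C "$dest_dir" -xpf -
}
```

If pv is installed, it goes in the middle of the pipe, e.g. tar -cf - sourceDir/ | pv | nc [dest ip] 7777. Note that the listen syntax differs between netcat variants.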

@pricesgoingup

pricesgoingup commented Sep 11, 2022 via email

@jaimehrubiks

Great discussion.

I found this to be the best option: "ssh -T -c aes256-gcm@openssh.com -o Compression=no -x". aes256 was probably faster than arcfour due to hardware optimizations or something. You might also experiment with and without rsync -z depending on the number and size of the files to transfer. No compression was faster for a single big, already compressed tar.gz file.

@pricesgoingup

pricesgoingup commented Sep 27, 2022 via email

@schmorp

schmorp commented Mar 3, 2024

To not let this stand as is, some facts: compression is off by default in ssh (and always has been in openssh), tty allocation is off when used in rsync and x forwarding does not affect bulk bandwidth in any way. Any difference in speed measured is not due to these options, but more likely because of a bad test setup, such as first making tests with a cold disk cache and then with a hot cache. The only change that can affect speed is the cipher (and not turning compression explicitly on in rsync or ssh).
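
One way to reduce the cache effect described here is to repeat each measurement and compare only the later runs. A rough sketch, assuming whole-second timing from date +%s is good enough for large transfers; repeat_timed is a hypothetical helper.

```shell
# Hypothetical helper: run a command RUNS ($1) times and print the
# wall-clock seconds of each run, so a cold first run stands out.
repeat_timed() {
    runs="$1"
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1 || echo "run $i failed" >&2
        end=$(date +%s)
        echo "run $i: $(( end - start ))s"
        i=$(( i + 1 ))
    done
}
```

For example: repeat_timed 3 rsync /tmp/1g 127.0.0.1:/tmp/1g.2 (delete the destination file between runs if you want rsync to send the data again).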
