Optimizing S3 file transfer speed

There is a nifty tool, https://github.com/larrabee/s3sync, which can transfer files to and from S3 very quickly, but it reads entire files into memory. If you're transferring lots of large files, the process will quickly be killed by the kernel's OOM killer.

The next best thing I have found is to tune the AWS CLI so that `aws s3 sync` runs as fast as possible.

```ini
# ~/.aws/config
[default]
s3 =
  # 500 is a usable number if you're only running one process,
  # but 100 is more reasonable if you're running multiple
  max_concurrent_requests = 100

  max_queue_size = 10000
  multipart_threshold = 64MB
  multipart_chunksize = 32MB
```
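If you'd rather not edit the file by hand, the same values can be written with `aws configure set` (shown here for the default profile; adjust the profile name if you use a named one):

```sh
# Write the equivalent settings into ~/.aws/config
aws configure set default.s3.max_concurrent_requests 100
aws configure set default.s3.max_queue_size 10000
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 32MB
```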

```sh
# Run multiple syncs in parallel, one per top-level prefix
aws s3 sync s3://some-bucket/some-directory /data/some-directory &
aws s3 sync s3://some-bucket/some-other-dir /data/some-other-dir &
wait
```
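The same idea generalizes to many prefixes. Here's a minimal sketch, assuming the bucket is laid out as one sub-directory per dataset (the bucket and destination names are placeholders):

```sh
#!/bin/sh
# List the top-level prefixes in the bucket and run one `aws s3 sync`
# per prefix, capped at 4 at a time so max_concurrent_requests isn't
# multiplied past what the instance can handle.
bucket=some-bucket
dest=/data
aws s3 ls "s3://$bucket/" | awk '/PRE/ {print $2}' | \
  xargs -P 4 -I{} aws s3 sync "s3://$bucket/{}" "$dest/{}"
```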

On a t2.large with this configuration I get about 70 MiB/s. It incurs serious load (a load average around 30), so pick an instance type that gives you sustained CPU for better performance, or you'll see a ton of CPU steal once the instance runs out of CPU credits. Memory usage stays comparatively low.
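To see whether you're being throttled, watch the `%steal` column while a sync runs (`mpstat` comes from the sysstat package; I'm assuming it's installed):

```sh
# Print CPU stats every 5 seconds; a rising %steal means the
# hypervisor is throttling you (e.g. burst credits exhausted).
mpstat 5
```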
