@fleveque
Last active November 21, 2022 17:22
Upload a folder to S3 recursively with Ruby, multiple threads and the aws-sdk v2 gem, based on http://avi.io/blog/2013/12/03/upload-folder-to-s3-recursively/
#!/usr/bin/env ruby

require 'rubygems'
require 'aws-sdk'

class S3FolderUpload
  attr_reader :folder_path, :total_files, :s3_bucket, :include_folder
  attr_accessor :files

  # Initialize the upload class
  #
  # folder_path    - path to the folder that you want to upload
  # bucket         - the bucket you want to upload to
  # aws_key        - your AWS access key ID (defaults to the environment setting AWS_KEY_ID)
  # aws_secret     - your AWS secret access key (defaults to the environment setting AWS_SECRET)
  # include_folder - include the root folder in the S3 key? (default: true)
  #
  # Examples
  #   => uploader = S3FolderUpload.new("some_route/test_folder", 'your_bucket_name')
  #
  def initialize(folder_path, bucket, aws_key = ENV['AWS_KEY_ID'], aws_secret = ENV['AWS_SECRET'], include_folder = true)
    Aws.config.update(
      region: ENV['AWS_REGION'],
      # Use the passed-in credentials instead of reading ENV again,
      # so explicit aws_key/aws_secret arguments actually take effect
      credentials: Aws::Credentials.new(aws_key, aws_secret)
    )

    @folder_path    = folder_path
    @files          = Dir.glob("#{folder_path}/**/*")
    @total_files    = files.length
    @connection     = Aws::S3::Resource.new
    @s3_bucket      = @connection.bucket(bucket)
    @include_folder = include_folder
  end
  # Public: Upload files from the folder to S3
  #
  # thread_count - how many threads to use (default: 5)
  # simulate     - don't perform the upload, just simulate it (default: false)
  # verbose      - print progress info (default: false)
  #
  # Examples
  #   => uploader.upload!(20)
  #   true
  #   => uploader.upload!
  #   true
  #
  # Returns true when the process has finished
  def upload!(thread_count = 5, simulate = false, verbose = false)
    file_number = 0
    mutex       = Mutex.new
    threads     = []

    puts "Total files: #{total_files}... uploading (folder #{folder_path} #{include_folder ? '' : 'not '}included)"

    thread_count.times do |i|
      threads[i] = Thread.new do
        until files.empty?
          # Pop inside the mutex so two threads never grab the same file
          file = mutex.synchronize do
            file_number += 1
            Thread.current['file_number'] = file_number
            files.pop
          end
          next unless file

          # Strip the root folder from the S3 key unless it should be included
          path = include_folder ? file : file.sub(/^#{Regexp.escape(folder_path)}\//, '')

          puts "[#{Thread.current['file_number']}/#{total_files}] uploading..." if verbose

          # Dir.glob also yields directories; skip them instead of opening them
          next if File.directory?(file) || simulate

          File.open(file) do |data|
            # Newer aws-sdk versions expect the canned ACL as the string
            # "public-read", and Object#put takes a single options hash
            s3_bucket.object(path).put(acl: 'public-read', body: data)
          end
        end
      end
    end

    threads.each(&:join)
    true
  end
end
# Sample usage:
#   uploader = S3FolderUpload.new('test', 'miles-media-library', AWS_KEY, AWS_SECRET)
#   uploader.upload!
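The heart of upload! is a simple worker pool: N threads drain a shared array under a mutex until it is empty. A minimal, self-contained sketch of that pattern (no AWS calls; `process_all` and its names are illustrative, not part of the gist):

```ruby
# Worker-pool sketch: threads drain a shared queue under a mutex.
# `processed` stands in for the per-file S3 upload.
require 'thread'

def process_all(items, thread_count = 5)
  queue     = items.dup
  mutex     = Mutex.new
  processed = []

  threads = Array.new(thread_count) do
    Thread.new do
      loop do
        # Pop inside the mutex so no two threads take the same item
        item = mutex.synchronize { queue.pop }
        break unless item
        mutex.synchronize { processed << item } # stand-in for obj.put
      end
    end
  end

  threads.each(&:join)
  processed
end

process_all(('a'..'e').to_a).sort # => ["a", "b", "c", "d", "e"]
```

Each item is taken exactly once regardless of thread count, which is why the gist pops under the mutex rather than letting threads race on the array.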
@tijmenb
tijmenb commented Nov 1, 2018

In newer versions of aws-sdk, you need to use public-read for the permissions:

- obj.put(data, { acl: :public_read, body: data })
+ obj.put(data, { acl: "public-read", body: data })

https://gist.github.com/fleveque/816dba802527eada56ab#file-s3_folder_upload-rb-L78

@SampsonCrowley
In newer versions of aws-sdk, you need to use public-read for the permissions:

- obj.put(data, { acl: :public_read, body: data })
+ obj.put(data, { acl: "public-read", body: data })

https://gist.github.com/fleveque/816dba802527eada56ab#file-s3_folder_upload-rb-L78

You also don't pass both the file and options with the file again

- obj.put(data, { acl: :public_read, body: data })
+ obj.put({ acl: "public-read", body: data })
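To make the corrected call shape concrete, here is a tiny stand-in for the bucket object (`FakeObject` is purely illustrative, not part of aws-sdk); it only records what `put` receives, showing that the method takes a single options hash with the file handle under `body:`:

```ruby
# Illustrative stub only -- FakeObject is not part of aws-sdk.
# It records the options hash so we can inspect the corrected call.
class FakeObject
  attr_reader :last_put

  def put(options)
    @last_put = options
  end
end

obj  = FakeObject.new
data = 'file contents'
obj.put(acl: 'public-read', body: data) # one options hash, no leading argument
obj.last_put # => {:acl=>"public-read", :body=>"file contents"}
```

Passing `data` as a separate first argument, as the original line did, sends the file where the options hash is expected, which is why the upload silently misbehaves.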

@Kernelogy
hi all!

I'm getting the output below:

[1/5] uploading...
[4/5] uploading...
[3/5] uploading...
[5/5] uploading...
[2/5] uploading...

But when I check the S3 bucket, there are no files.
