@eltiare
Created November 22, 2012 18:34
Multithreaded file copying to S3 via Fog and JRuby
#!/usr/bin/env jruby
# Usage:
# Save this file to /usr/local/bin/copy-to-s3.rb
# You'll want to chmod +x this file to use it on the command line (Linux/OS X: chmod +x /usr/local/bin/copy-to-s3.rb)
# If you're having trouble using JRuby, check out RVM (https://rvm.io/)
# Install the fog gem: gem install fog (after running rvm use jruby, if required)
# This only works when the environment variable JRUBY_OPTS is set to --1.8, e.g.: export JRUBY_OPTS=--1.8 (Linux/OS X)
# Edit the connection options near the top of this file to match your S3 settings.
#
# copy-to-s3.rb local_path [s3_prefix]
#
# IMPORTANT: This script does not overwrite files that already exist in the bucket.
#
# Examples:
# copy-to-s3.rb . videos
# This copies anything in the current directory and below to the path /videos
# copy-to-s3.rb pictures
# This copies anything in the directory pictures and below to the path /pictures
# copy-to-s3.rb 'pictures/houses'
# This copies anything in the directory pictures/houses and below to the path /pictures/houses
require 'rubygems'
require 'fog'
require 'benchmark'
require 'thread'
connection = Fog::Storage.new({
  :provider => 'AWS',
  :aws_access_key_id => 'YOUR_ACCESS_KEY',
  :aws_secret_access_key => 'YOUR_SECRET_ACCESS_KEY',
  :region => 'us-west-2' # defaults to us-east-1
})
Directory = connection.directories.get 'YOUR_BUCKET_NAME'
FILES_TO_COPY = Queue.new
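# Walks path recursively and queues an [s3_key, local_file] pair for every regular file.
def recursive_copy(path, prefix = nil)
  final_path = prefix ? File.join(prefix, path) : path
  # Strip a trailing "." and any "./" so "." and "./sub" map to clean keys.
  # Use gsub rather than gsub!: without a prefix, final_path and path are the
  # same String object, and mutating it would change the directory globbed below.
  final_path = final_path.gsub(/(\.$)|(\.\/)/, '')
  Dir.glob(File.join(path, '*')).each { |f|
    filename = File.basename(f)
    next if filename.match(/^\./) # skip hidden files
    if File.directory?(f)
      recursive_copy(f, prefix)
    else
      # Avoid a leading "/" in the key when final_path is empty (path "." with no prefix).
      final_f = final_path.empty? ? filename : File.join(final_path, filename)
      FILES_TO_COPY << [final_f, f]
    end
  }
end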
path, prefix = *ARGV
raise ArgumentError.new("You must supply a path. Usage: copy-to-s3.rb . or copy-to-s3.rb /tmp/path/name") unless path
raise ArgumentError.new("Invalid arguments supplied. Only supported options are copy-to-s3.rb path prefix(optional)") if ARGV.size > 2
recursive_copy(path, prefix)
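threads = []
5.times do
  threads << Thread.new {
    # Non-blocking pop: raises ThreadError once the queue is empty, which ends the loop.
    # A plain blocking pop would leave the workers waiting forever after the last file.
    while file = (FILES_TO_COPY.pop(true) rescue nil)
      key, local_path = *file
      puts "Copying #{local_path} to #{key}"
      if Directory.files.head(key)
        puts "File #{key} already exists"
      else
        time = Benchmark.realtime do
          # The block form closes the local file once the upload finishes.
          File.open(local_path) do |body|
            Directory.files.create(
              :key => key,
              :body => body,
              :public => true
            )
          end
        end
        puts "Finished uploading #{key} in #{time} seconds"
      end
    end
  }
end
threads.each { |thread| thread.join }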
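
The worker threads above all draw from one shared Queue, so they need a way to stop once it is empty. One way to do that is a non-blocking pop. Below is a minimal sketch of that pattern on its own, with no S3 or Fog involved. The names drain_with_workers and fake_upload are hypothetical stand-ins chosen for illustration and are not part of the script or of Fog.

```ruby
require 'thread'

# Hypothetical stand-in for an upload: returns the key in upper case.
def fake_upload(key)
  key.upcase
end

# Assumed helper (not from the original script): processes every item
# with a fixed number of worker threads and returns the results.
def drain_with_workers(items, worker_count = 5)
  queue = Queue.new
  items.each { |item| queue << item }

  results = Queue.new # Queue is thread-safe, so workers can push concurrently
  workers = (1..worker_count).map do
    Thread.new do
      # pop(true) raises ThreadError on an empty queue; rescue turns that into nil,
      # which ends the loop so the thread can finish and join can return.
      while item = (queue.pop(true) rescue nil)
        results << fake_upload(item)
      end
    end
  end
  workers.each { |t| t.join }

  collected = []
  collected << results.pop until results.empty?
  collected
end

uploaded = drain_with_workers(%w[a.jpg b.jpg c.jpg])
puts uploaded.sort.inspect
```

This works here because the queue is filled completely before any worker starts. If files were still being added while workers ran, pushing one nil sentinel per worker after the last item would likely be the safer choice.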