@naveena-s
Created May 29, 2017 03:05
List files and folders of a bucket of S3 using prefix and delimiter in Ruby
Before we get started, it is important to know a few things.
Amazon Simple Storage Service, also known as Amazon S3, is highly scalable, secure object storage in the cloud. It is used to store and retrieve any amount of data at any time, from anywhere on the web. Amazon S3 is mainly used for backup and fast retrieval, and it keeps costs down because users pay only for the storage and bandwidth they use.
Every piece of data stored in S3 is treated as an object, and objects are stored in containers called buckets. When you upload a file to S3, an object is stored in the bucket behind the scenes. You can set permissions on a bucket to control who can access, create, or delete it.
The first step is to create the S3 resource object with the proper credentials:
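# aws_objects.rb
require 'aws-sdk-s3' # on version 2 of the SDK, use require 'aws-sdk' instead

# Region and credentials are read from environment variables.
s3 = Aws::S3::Resource.new({
  region: ENV['AWS_REGION'],
  access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
})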
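If you would rather fail early when one of those environment variables is missing, here is a minimal sketch. The helper name `s3_options` is hypothetical and not part of the SDK. It collects the same three settings with `fetch`, which raises `KeyError` when a key is absent.

```ruby
# Hypothetical helper (not part of the AWS SDK): collects the settings used
# for the S3 resource and raises KeyError if any variable is missing.
def s3_options(env = ENV)
  {
    region: env.fetch('AWS_REGION'),
    access_key_id: env.fetch('AWS_ACCESS_KEY_ID'),
    secret_access_key: env.fetch('AWS_SECRET_ACCESS_KEY')
  }
end

# Usage, assuming the SDK gem has already been required:
#   s3 = Aws::S3::Resource.new(s3_options)
```

Passing the environment in as an argument also makes the helper easy to try out with a plain hash.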
Assume that the hierarchy is as below:
mycollection # bucket name
  photos
    2017
      image1.jpg # photos/2017/image1.jpg
      image2.jpg # photos/2017/image2.jpg
    2016
      myphoto.jpg # photos/2016/myphoto.jpg
      image1.jpg # photos/2016/image1.jpg
    2010
      image1.jpg # photos/2010/image1.jpg
  photo
    2010
      image1.jpg # photo/2010/image1.jpg
    2016
      image1.jpg # photo/2016/image1.jpg
  audio
    random.mp3 # audio/random.mp3
    2010
      one.mp3 # audio/2010/one.mp3
    2016
      two.mp3 # audio/2016/two.mp3
    jan
      2016
        two.mp3 # audio/jan/2016/two.mp3
        one.mp3 # audio/jan/2016/one.mp3
    feb
      2016
        three.mp3 # audio/feb/2016/three.mp3
  random1.jpg
  random2.mp3
  random3.jpg
  2016_random.jpg
  2016_random2.jpg
  2016_random1.mp3
The prefix and delimiter arguments of the objects method are used to filter the files and folders that are returned. Set prefix to the string that the keys you want must begin with. Set delimiter (usually “/”) when you want to leave out everything nested in subfolders: any key that contains the delimiter after the prefix is grouped by S3 into a common prefix and is not returned by objects. You will see what I mean if you follow the examples.
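To make the rule concrete, here is a minimal local sketch of the filtering, assuming plain string handling is a fair model of what S3 does. It makes no network calls and does not use the SDK. The method name `simulate_listing` and the sample keys are illustrations only.

```ruby
# Local model of an S3 listing: no network calls, just string handling.
# Returns the keys reported as objects and the grouped "folder" prefixes.
def simulate_listing(keys, prefix: '', delimiter: nil)
  contents = []
  common_prefixes = []
  use_delimiter = !(delimiter.nil? || delimiter.empty?)

  keys.sort.each do |key|
    next unless key.start_with?(prefix)

    rest = key[prefix.length..-1]
    cut = use_delimiter ? rest.index(delimiter) : nil
    if cut
      # Delimiter found after the prefix: roll the key up into a common prefix.
      common_prefixes << prefix + rest[0, cut + delimiter.length]
    else
      contents << key
    end
  end

  { contents: contents, common_prefixes: common_prefixes.uniq }
end

keys = %w[
  audio/ audio/random.mp3 audio/2010/ audio/2010/one.mp3
  audio/jan/ audio/jan/2016/ audio/jan/2016/one.mp3 random1.jpg
]

result = simulate_listing(keys, prefix: 'audio/', delimiter: '/')
puts result[:contents]        # keys with no "/" after the prefix
puts result[:common_prefixes] # one grouped prefix per "subfolder"
```

In this model, the `contents` list corresponds to what the objects method yields, while the `common_prefixes` list holds the subfolder names that objects leaves out.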
Example 1:
Suppose you want to list only the files at the top level of the bucket. Then it should be:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'', delimiter: '/').collect(&:key)
The output is every file at the first level of the bucket. Because the prefix is empty, a key may begin with anything. The delimiter is set to “/”, which means only keys that contain no “/” are fetched, and any key that contains a “/” is left out. Hence, the output will be
2016_random.jpg
2016_random1.mp3
2016_random2.jpg
random1.jpg
random2.mp3
random3.jpg
Example 2:
Suppose you want to list all the files and folders present in the bucket. Then it should be:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'', delimiter: '').collect(&:key)
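The output is every file and folder present in the bucket. Both prefix and delimiter are empty, which means a key may begin with anything and there is no restriction on its path. Folder names such as audio/ show up only because the folders exist in the bucket as empty placeholder objects, which is what the S3 console creates when you add a folder. So,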
2016_random.jpg
2016_random1.mp3
2016_random2.jpg
audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3
photo/
photo/2010/
photo/2010/image1.jpg
photo/2016/
photo/2016/image1.jpg
photos/
photos/2010/
photos/2010/image1.jpg
photos/2016/
photos/2016/image1.jpg
photos/2016/myphoto.jpg
photos/2017/
photos/2017/image1.jpg
photos/2017/image2.jpg
random1.jpg
random2.mp3
random3.jpg
Example 3:
Suppose you want to list all the contents of the folder photos/2017/ in the bucket. Then it should be:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'photos/2017/', delimiter: '').collect(&:key)
Then the output will be,
photos/2017/
photos/2017/image1.jpg
photos/2017/image2.jpg
Only the folder itself and its two files are listed, because the prefix is set to “photos/2017/”, which means: show only the keys that begin with photos/2017/ and ignore the rest.
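The prefix is compared as plain text against the start of each key, so the trailing slash matters in a bucket that has both a photo and a photos folder. The sketch below illustrates this locally with `start_with?` on a few of the sample keys. It is an illustration of the matching rule, not a call to S3.

```ruby
keys = %w[
  photo/2010/image1.jpg
  photo/2016/image1.jpg
  photos/2016/myphoto.jpg
  photos/2017/image1.jpg
]

# Without the trailing slash, "photo" also matches every key under "photos/".
loose = keys.select { |key| key.start_with?('photo') }

# With the trailing slash, only keys under the "photo/" folder match.
strict = keys.select { |key| key.start_with?('photo/') }

puts loose.length  # 4
puts strict.length # 2
```

Ending the prefix with “/” is therefore the safer habit when you mean one specific folder.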
Example 4:
Suppose you want to list all the files and folders inside the folder audio/ in the bucket. Then:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/', delimiter: '').collect(&:key)
Then,
audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3
Example 5:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/', delimiter: '/').collect(&:key)
First, out of all the keys in the bucket, it selects those that begin with audio/. The result of this first step is
audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3
Next, out of the keys above, it keeps those that have no “/” after the prefix. Only the folder's own key and one file qualify, so the result is
audio/
audio/random.mp3
Example 6:
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/jan/', delimiter: '/').collect(&:key)
First, the keys that begin with “audio/jan/” are collected, which gives
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
Then it looks for keys that have no “/” after the prefix. No file qualifies, so it returns only the folder's own key
audio/jan/
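Notice that the subfolder audio/jan/2016/ is missing from that result. If you also need the subfolder names, one option is to call `list_objects_v2` on the lower-level client and read `common_prefixes` from the response. The sketch below assumes the client is an `Aws::S3::Client` from the aws-sdk-s3 gem. The helper name `list_level` is hypothetical.

```ruby
# Hypothetical helper: lists one "level" of a bucket and returns both the
# files and the subfolder names. `client` is assumed to be an Aws::S3::Client
# or any object with a compatible list_objects_v2 method.
def list_level(client, bucket:, prefix: '')
  files = []
  folders = []
  params = { bucket: bucket, prefix: prefix, delimiter: '/' }

  loop do
    resp = client.list_objects_v2(params)
    files.concat(resp.contents.map(&:key))
    folders.concat(resp.common_prefixes.map(&:prefix))
    break unless resp.is_truncated

    # More results remain: ask for the next page.
    params = params.merge(continuation_token: resp.next_continuation_token)
  end

  { files: files, folders: folders }
end

# Usage, assuming credentials and region are configured:
#   require 'aws-sdk-s3'
#   client = Aws::S3::Client.new(region: ENV['AWS_REGION'])
#   p list_level(client, bucket: 'mycollection', prefix: 'audio/jan/')
```

With the sample bucket, this should report audio/jan/ under files and audio/jan/2016/ under folders.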
I hope this blog helped you understand how to use delimiter and prefix in Ruby. Keep practicing!