

@robert-b-clarke
Created January 27, 2014 10:51
A simple Python script for copying static web resources to an S3 bucket, gzipping JS and CSS along the way. Let me know if it's useful (and not already implemented by something else); I may make it into a proper repo.
"""
===========
Description
===========
Simple script to copy and gzip static web files to an AWS S3 bucket. S3 is great for cheap hosting of static web content, but by default it does not gzip CSS and JavaScript, which results in much larger data transfer and longer load times for many applications
When using this script CSS and JavaScript files are gzipped in transition, and appropriate headers set as per the technique described here: http://www.jamiebegin.com/serving-compressed-gzipped-static-files-from-amazon-s3-or-cloudfront/
* Files overwrite old versions
* Orphaned files are not deleted
* S3 will not negotiate content encoding with clients and will always serve the gzipped version, so user agents must be able to understand the Content-Encoding: gzip header (all modern web browsers can)
=============
Prerequisites
=============
* Python >= 2.7
* boto

Install boto with pip:
    pip install boto
or with apt-get:
    apt-get install python-boto
=====
Usage
=====
From the command line:
python deploy_to_s3.py --directory source-dir --bucket bucket-name
The standard boto environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are used for authentication - see the boto documentation for details.
For help:
python deploy_to_s3.py --help
"""
__author__ = 'rob@redanorak.co.uk'

import os
import argparse
import gzip
import tempfile

from boto.s3.connection import S3Connection
from boto.s3.key import Key


def add_file(source_file, s3_key):
    """Write a local file to an S3 key, gzipping JS and CSS on the way."""
    if source_file.endswith(".js") or source_file.endswith(".css"):
        print("gzipping %s to %s" % (source_file, s3_key.key))
        gzip_to_key(source_file, s3_key)
    else:
        print("uploading %s to %s" % (source_file, s3_key.key))
        s3_key.set_contents_from_filename(source_file)


def gzip_to_key(source_file, key):
    """Gzip source_file into a temporary file, then upload it to the given key."""
    tmp_file = tempfile.NamedTemporaryFile(mode="wb", suffix=".gz", delete=False)
    with open(source_file, 'rb') as f_in:
        with gzip.open(tmp_file.name, 'wb') as gz_out:
            gz_out.writelines(f_in)
    # Content-Type and Content-Encoding must be set before the upload so that
    # browsers will decompress the file transparently
    key.set_metadata('Content-Type', 'application/x-javascript' if source_file.endswith(".js") else 'text/css')
    key.set_metadata('Content-Encoding', 'gzip')
    key.set_contents_from_filename(tmp_file.name)
    os.unlink(tmp_file.name)  # clean up the temp file


def dir_to_bucket(src_directory, bucket):
    """Recursively copy files from a source directory to a boto bucket."""
    for root, sub_folders, files in os.walk(src_directory):
        for file in files:
            abs_path = os.path.join(root, file)
            # the path relative to the source directory becomes the S3 key
            rel_path = os.path.relpath(abs_path, src_directory)
            k = Key(bucket)
            k.key = rel_path
            add_file(abs_path, k)


def main():
    # get arguments
    arg_parser = argparse.ArgumentParser(description='Deploy static web resources to an S3 bucket, gzipping JavaScript and CSS files in the process')
    arg_parser.add_argument('-d', '--directory', help='The source directory containing your static website files', required=True)
    arg_parser.add_argument('-b', '--bucket', help='The name of the bucket to copy files to; the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are used for your credentials', required=True)
    args = arg_parser.parse_args()
    # connect to S3 and copy the directory across
    conn = S3Connection()
    target_bucket = conn.get_bucket(args.bucket, validate=False)
    dir_to_bucket(args.directory, target_bucket)


if __name__ == '__main__':
    main()
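
A quick way to sanity-check a deploy is to HEAD one of the uploaded keys with boto and inspect the headers the script set. A minimal sketch, assuming credentials in the usual environment variables; the bucket name and the js/app.js key are hypothetical placeholders:

from boto.s3.connection import S3Connection

conn = S3Connection()  # reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
bucket = conn.get_bucket('bucket-name', validate=False)
key = bucket.get_key('js/app.js')  # issues a HEAD request for the key
print(key.content_type)      # expect 'application/x-javascript' for .js files
print(key.content_encoding)  # expect 'gzip' for gzipped .js and .css files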
@mafux777

mafux777 commented Feb 8, 2016

Hey Rob, I had a problem when my bucket had a dot, like so: fancyname.io
When I used a different bucket without the dot, it worked. Error messages:

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
self.send(msg)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
self.connect()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1274, in connect
server_hostname=server_hostname)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 352, in wrap_socket
_context=self)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 579, in init
self.do_handshake()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 816, in do_handshake
match_hostname(self.getpeercert(), self.server_hostname)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 271, in match_hostname
% (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname 'fenestro.io.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'

Is this easily fixed? If so, how?

@robert-b-clarke
Author

Hi @mafux777

I think this is the same issue as this one: http://stackoverflow.com/questions/27652318/cant-connect-to-s3-buckets-with-periods-in-their-name-when-using-boto-on-herok

I'm not sure if it's fixed in newer versions of boto, but the workaround described in that Stack Overflow answer should work - something like the sketch below.
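
For reference, my reading of that workaround (a sketch, not tested against your bucket): force boto to use path-style addressing with OrdinaryCallingFormat, so requests go to s3.amazonaws.com/<bucket>/<key> and the hostname matches Amazon's *.s3.amazonaws.com wildcard certificate.

from boto.s3.connection import S3Connection, OrdinaryCallingFormat

# path-style addressing keeps the bucket name out of the hostname,
# avoiding the certificate mismatch for bucket names containing dots
conn = S3Connection(calling_format=OrdinaryCallingFormat())
bucket = conn.get_bucket('fancyname.io', validate=False)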

Thanks
