Skip to content

Instantly share code, notes, and snippets.

@uhho
uhho / pandas_s3_streaming.py
Last active December 2, 2022 18:57
Streaming pandas DataFrame to/from S3 with on-the-fly processing and GZIP compression
def s3_to_pandas(client, bucket, key, header=None):
# get key using boto3 client
obj = client.get_object(Bucket=bucket, Key=key)
gz = gzip.GzipFile(fileobj=obj['Body'])
# load stream directly to DF
return pd.read_csv(gz, header=header, dtype=str)
def s3_to_pandas_with_processing(client, bucket, key, header=None):
@niranjv
niranjv / install_python_36_amazon_linux.sh
Last active January 30, 2023 21:49
Install Python 3.6 in Amazon Linux
# A virtualenv running Python3.6 on Amazon Linux/EC2 (approximately) simulates the Python 3.6 Docker container used by Lambda
# and can be used for developing/testing Python 3.6 Lambda functions
# This script installs Python 3.6 on an EC2 instance running Amazon Linux and creates a virtualenv running this version of Python
# This is required because Amazon Linux does not come with Python 3.6 pre-installed
# and several packages available in Amazon Linux are not available in the Lambda Python 3.6 runtime
# The script has been tested successfully on a t2.micro EC2 instance (Root device type: ebs; Virtualization type: hvm)
# running Amazon Linux AMI 2017.03.0 (HVM), SSD Volume Type - ami-c58c1dd3
# and was developed with the help of AWS Support