@jitsejan
Created October 5, 2018 13:35
Write a Pandas dataframe to Parquet format on AWS S3.
# Note: `s3fs` must be installed for Pandas to write to S3, along with a
# Parquet engine such as `pyarrow` or `fastparquet`.
# AWS credentials are read from the usual location, ~/.aws/credentials.
# DESTINATION is not defined in this snippet; it is assumed to be a
# module-level string holding the target bucket name.
def _write_dataframe_to_parquet_on_s3(dataframe, filename):
    """Write a dataframe to a Parquet file on S3."""
    print(f"Writing {len(dataframe)} records to {filename}")
    output_file = f"s3://{DESTINATION}/{filename}/data.parquet"
    dataframe.to_parquet(output_file)
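
The function above relies on a `DESTINATION` name that the gist does not define. The following is a minimal sketch of one way to make it self-contained, not the author's own setup: the bucket name `my-example-bucket`, the helper `build_s3_path`, the `destination` parameter, and the `demo` function are all hypothetical additions for illustration.

```python
# Hypothetical bucket name, not part of the original gist; replace with your own.
DESTINATION = "my-example-bucket"


def build_s3_path(filename, destination=DESTINATION):
    """Return the S3 URL used for `filename`, following the gist's layout."""
    return f"s3://{destination}/{filename}/data.parquet"


def write_dataframe_to_parquet_on_s3(dataframe, filename, destination=DESTINATION):
    """Write `dataframe` as Parquet to S3 and return the path written."""
    output_file = build_s3_path(filename, destination)
    print(f"Writing {len(dataframe)} records to {output_file}")
    # Requires s3fs and a Parquet engine (pyarrow or fastparquet).
    dataframe.to_parquet(output_file)
    return output_file


def demo():
    """Write a small frame and read it back (requires AWS access to the bucket)."""
    import pandas as pd  # imported here so the helpers above load without pandas

    df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
    path = write_dataframe_to_parquet_on_s3(df, "example")
    return pd.read_parquet(path)
```

Keeping the path construction in its own function makes it possible to check the target location without touching S3. Calling `demo()` only works once valid credentials and a writable bucket are in place.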