Request Rate and Performance Considerations
AWS S3 Developer Guide (API Version 2006-03-01)

How do I ingest a large number of small files from S3? My job looks like it's stalling.
Databricks Cloud support forum thread

What is the best way to ingest and analyze a large S3 dataset?
Databricks Cloud support forum thread

How can we get S3DistCp running on DBC?
Databricks Cloud support forum thread

How do I improve throughput of S3 writes in a Spark Streaming scenario?
Databricks Cloud support forum thread

Stall on loading many Parquet files on S3
Databricks Cloud support forum thread

Strategies for reading large numbers of files
Apache Spark Users Mailing List

Dealing with Hadoop's small files problem
Snowplow Blog Post

s3-streamlogger
npm package

Maximizing Amazon S3 Performance
Slide deck from AWS re:Invent 2013 (STG304)

The Bleeding Edge: Spark, Parquet and S3
AppsFlyer tech blog post by Arnon Rotem-Gal-Oz