Serverless data pipelines: ETL workflow with Step Functions and Athena
This blog is part 3 of a multi-part series on analysing Flanders’ traffic using cloud components!
For part 1 see: https://medium.com/cloudway/real-time-data-processing-with-kinesis-data-analytics-ad52ad338c6d
For part 2 see: https://medium.com/cloubis/serverless-data-transform-with-kinesis-e468abd33409
What is our goal?
This blog explores how to use AWS Glue together with Amazon Athena to repartition raw streaming data events.
We previously landed these events in an Amazon S3 bucket, partitioned according to the time at which they were processed on Kinesis.
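To make the difference concrete, here is a minimal Python sketch of why processing-time partitions may need to be redone. It assumes the target is to partition by the time the event was measured, which this introduction does not spell out. The field names (`event_time`, `processing_time`), the Hive-style `year=/month=/day=/hour=` prefix layout and the sample timestamps are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone


def partition_prefix(ts: datetime) -> str:
    """Build an hourly, Hive-style S3 key prefix from a timestamp (in UTC).

    The year=/month=/day=/hour= layout is an assumption for this sketch.
    """
    ts = ts.astimezone(timezone.utc)
    return f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}"


# Hypothetical event: measured just before midnight, processed just after.
event = {
    "event_time": datetime(2020, 5, 17, 23, 58, tzinfo=timezone.utc),
    "processing_time": datetime(2020, 5, 18, 0, 3, tzinfo=timezone.utc),
}

# Where the event sits today (partitioned by processing time) ...
current_prefix = partition_prefix(event["processing_time"])
# ... versus where it would sit if partitioned by its own timestamp.
target_prefix = partition_prefix(event["event_time"])

print(current_prefix)
print(target_prefix)
```

In this example the two prefixes fall on different days, so a query that filters on the day of measurement would miss the event unless the data is repartitioned.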