@kevherro
Last active July 11, 2019 00:03
Digital Unity data architecture proposal

Objective

We need to reliably handle new, dynamic data that is generated continuously and present it in real time. A scooter's location can change in a matter of seconds; we need to detect that change with minimal latency in order to provide accurate asset tracking.

Architecture diagram

[Architecture diagram: digital-unity-diagram]

High-level architecture overview

Time gives our data meaning. Therefore, the raw data needs to be processed sequentially and incrementally over sliding time windows.

Given the architecture diagram above:

  1. Source data is written to a Kinesis Data Firehose delivery stream.
  2. Kinesis Data Firehose invokes a Lambda function to transform the incoming source data.
  3. Kinesis Data Firehose streams the transformed data to an S3 bucket at a predefined buffer size (megabytes) and interval (seconds).
  4. AWS Database Migration Service (DMS) reads the data from the source S3 bucket.
  5. DMS loads the data into a target database (Amazon DynamoDB).
  6. DynamoDB pushes data to consumers (our application).

Note that step 1 is subject to change depending on the structure of vendors' data.
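For concreteness, a source record from a scooter vendor might look like the following (the field names are assumptions for illustration, not a vendor specification):

```json
{
  "vehicle_id": "scooter-1234",
  "lat": 30.2672,
  "lon": -97.7431,
  "battery_pct": 87,
  "timestamp": "2019-07-10T23:59:42Z"
}
```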

Services used

Amazon Kinesis Data Firehose

Function: captures streaming data and automatically loads it into an S3 bucket

Why: enables the capture and loading of source data in near real time

AWS Lambda

Function: transforms incoming source data into specified formats

Why: our application expects a uniform data format
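A minimal sketch of such a transformation Lambda, following the Firehose record-transformation event shape (each record arrives base64-encoded and must be returned with its `recordId`, a `result` status, and re-encoded `data`). The vendor field names and the output format are assumptions, not a confirmed schema:

```python
import base64
import json

def handler(event, context):
    """Hypothetical Firehose transformation Lambda that normalizes vendor
    records into the uniform format our application expects.
    Input field names (vehicle_id, lat, lon, timestamp) are assumed."""
    output = []
    for record in event["records"]:
        # Firehose delivers each record's payload base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]))
        transformed = {
            "vehicle_id": payload["vehicle_id"],
            "location": {"lat": payload["lat"], "lon": payload["lon"]},
            "observed_at": payload["timestamp"],
        }
        # Each transformed record is re-encoded and tagged with the
        # original recordId and a result status so Firehose can match it up.
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(
                (json.dumps(transformed) + "\n").encode()
            ).decode(),
        })
    return {"records": output}
```

Records that fail to parse could instead be returned with `"result": "ProcessingFailed"` so Firehose routes them to an error prefix rather than dropping them silently.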

Amazon S3

Function: serves as a buffer for incoming streaming data from Kinesis

Why: Kinesis concatenates multiple incoming records and adds a UTC time prefix in the format YYYY/MM/DD/HH before writing objects to S3. Because the forward slash (/) creates a level in the S3 hierarchy, DMS can correctly locate and transport the right group of data as specified by our buffering configuration
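The prefix behavior described above can be sketched as follows (this mirrors Firehose's default YYYY/MM/DD/HH UTC prefix; the helper function is ours, not an AWS API):

```python
from datetime import datetime, timezone

def firehose_s3_prefix(arrival: datetime) -> str:
    """Build the default UTC time prefix that Firehose prepends
    to S3 object keys, e.g. 2019/07/10/23."""
    return arrival.astimezone(timezone.utc).strftime("%Y/%m/%d/%H")

# Records arriving within the same hour land under one S3 "folder",
# so DMS can pick up a whole hourly batch by prefix.
arrival = datetime(2019, 7, 10, 23, 59, 42, tzinfo=timezone.utc)
print(firehose_s3_prefix(arrival))  # → 2019/07/10/23
```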

AWS Database Migration Service (DMS)

Function: reads data from the source S3 bucket and loads it into the target database

Why: transportation of transformed source data

Amazon DynamoDB

Function: serves as the target database to store data

Why: provides the scale and performance demanded by our application
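One plausible table design for asset tracking (an assumption for illustration, not a settled schema) keys each item by vehicle and observation time, so the latest known location of a scooter is a single query:

```python
# Hypothetical DynamoDB item layout for the scooter-tracking table.
# Partition key: vehicle_id; sort key: observed_at. ISO-8601 timestamps
# sort lexicographically in chronological order, so a Query with
# ScanIndexForward=False would return the most recent location first.
item = {
    "vehicle_id": "scooter-1234",           # partition key (assumed name)
    "observed_at": "2019-07-10T23:59:42Z",  # sort key (assumed name)
    "lat": 30.2672,
    "lon": -97.7431,
}

# ISO-8601 strings preserve chronological order under string comparison:
assert "2019-07-10T23:59:42Z" > "2019-07-10T22:00:00Z"
```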

