@aperkaz
Last active August 5, 2020 12:21
DynamoDB Introduction and key concepts

Introduction to DynamoDB

A NoSQL database by AWS. Key-value store.

deep dive video

Key features

  • Fully managed. Autoscaling, partitioning...

  • Key-value store. A NoSQL DB, but the data is still relational (all data stored in a DB is relational!); the relationships are just modelled differently.

  • Querying the DB is done by sending query objects over API calls. Only indexed data (the primary key or a secondary index) can be queried; anything else requires a full table Scan.

  • Traditional SQL DBs optimize for disk usage (normalized data, so joins are very expensive on CPU). NoSQL is optimized for CPU usage and distributed access to data.

  • DynamoDB, being part of the AWS ecosystem, supports data-derived operations via event streams. AWS Lambda functions or a custom listener can be connected to the stream to execute aggregations, computations... Similar to triggers in SQL DBs.

    • The goal is to run calculations as the data comes in: compute once and store the result, so that later reads don't need expensive CPU computations.
  • Important: NoSQL is better suited for applications that mostly read/write individual records (OLTP, Online Transaction Processing). It is not so good for Business Analytics type apps (OLAP), as NoSQL does not like complex queries. Differences.
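The "calculate once as data comes in" idea can be sketched as a small Lambda-style handler. This is only an illustration: the `customer_id` and `amount` attribute names and the in-memory `order_totals` dict are assumptions made up for the example, and it assumes the stream is configured to include the new item image. A real handler would write the totals back to a table instead.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for an aggregates table; a real handler
# would persist the running totals back to DynamoDB.
order_totals = defaultdict(int)


def handler(event, context=None):
    """Update a running total per customer from a batch of stream records."""
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # this sketch only aggregates newly created items
        image = record["dynamodb"]["NewImage"]
        customer = image["customer_id"]["S"]  # assumed attribute name
        amount = int(image["amount"]["N"])    # numbers arrive as strings
        order_totals[customer] += amount
    return dict(order_totals)


# A hand-written event in the shape of a DynamoDB Streams batch.
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"customer_id": {"S": "c1"},
                                   "amount": {"N": "30"}}}},
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"customer_id": {"S": "c1"},
                                   "amount": {"N": "12"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"customer_id": {"S": "c1"}}}},
    ]
}

totals = handler(sample_event)
print(totals)
```

Reading a customer's total afterwards is a single key lookup rather than a sum over all of their orders.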

Key concepts

  • Tables: Outermost structures. Usually 1 table is all you need. As tables get spread across partitions, 1 table with good (varied) partition keys ensures balanced load at scale.
  • Partitions: 'buckets' within the table. The partition key decides which partition an item lands in (and, together with the optional sort key, uniquely identifies it), so it should be varied (optimally with an even data distribution). If it is not varied (e.g. male/female), the DB won't scale, as all the requests will hammer specific partitions.
  • Local Secondary Indexes: A way to re-sort the data within the partitions. Should be modeled according to the data access patterns of your app. Same partition key as the table with an alternative sort key, indexed locally to the partition.
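To make the point about varied partition keys concrete, here is a rough simulation. The `partition_for` function is a made-up stand-in (an MD5 hash modulo 4 buckets); DynamoDB's real hashing and partition count are internal and not what is shown here.

```python
import hashlib
from collections import Counter


def partition_for(partition_key: str, num_partitions: int = 4) -> int:
    """Map a key to a bucket; a stand-in for DynamoDB's internal hashing."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % num_partitions


def load_per_partition(keys, num_partitions: int = 4) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# 1000 requests keyed by a low-cardinality attribute vs. a unique user id.
by_gender = load_per_partition(["male", "female"] * 500)
by_user_id = load_per_partition([f"user-{i}" for i in range(1000)])

print("gender :", dict(by_gender))
print("user id:", dict(by_user_id))
```

With only two distinct key values, at most two buckets ever receive traffic, no matter how many partitions exist; the unique ids spread over all of them.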

dynamo-db lsi example
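As a rough in-memory analogy for what a local secondary index does (not the DynamoDB API), the sketch below keeps the same partitions but orders the items inside each one by a different attribute. The `customer`, `order_id` and `created` attributes are hypothetical.

```python
from collections import defaultdict

# Hypothetical items: partition key = customer, sort key = order_id.
items = [
    {"customer": "c1", "order_id": "o-001", "created": "2020-03-01"},
    {"customer": "c1", "order_id": "o-002", "created": "2020-01-15"},
    {"customer": "c1", "order_id": "o-003", "created": "2020-02-10"},
    {"customer": "c2", "order_id": "o-004", "created": "2020-01-20"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition key and keep each group ordered by sort_attr."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[sort_attr])
    return index


table = build_index(items, "customer", "order_id")       # base table order
lsi_by_date = build_index(items, "customer", "created")  # same partitions, new sort

table_order = [i["order_id"] for i in table["c1"]]
date_order = [i["order_id"] for i in lsi_by_date["c1"]]
print(table_order)
print(date_order)
```

The partition key stays `customer` in both views; only the ordering within the partition changes, which is what makes "orders of customer c1 by date" a cheap range read.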

  • Global Secondary Indexes: A new global aggregation of the data, indexed across partitions. Re-groups the data under a different partition key to support secondary access patterns. Supports reverse lookups (movies from actor N -> rev lookup: actors in movie M) and N to N data relations.

Generated asynchronously (as it has to be indexed across partitions), so reads from a GSI are eventually consistent.

dynamo-db gsi example
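The reverse lookup can be pictured with a plain-Python analogy (again, not the DynamoDB API): the base table is keyed by actor, and the "GSI" is the same items re-grouped with the keys swapped. The actor and movie names are placeholders.

```python
from collections import defaultdict

# Hypothetical many-to-many data: one item per (actor, movie) pair.
roles = [
    {"actor": "Actor A", "movie": "Movie X"},
    {"actor": "Actor A", "movie": "Movie Y"},
    {"actor": "Actor B", "movie": "Movie X"},
    {"actor": "Actor C", "movie": "Movie Y"},
]


def build_index(items, partition_attr, sort_attr):
    """Group the sort-key values under each partition-key value, in order."""
    index = defaultdict(list)
    for item in sorted(items, key=lambda i: i[sort_attr]):
        index[item[partition_attr]].append(item[sort_attr])
    return dict(index)


movies_by_actor = build_index(roles, "actor", "movie")  # base table layout
actors_by_movie = build_index(roles, "movie", "actor")  # GSI: keys swapped

print(movies_by_actor["Actor A"])
print(actors_by_movie["Movie X"])
```

In the real service the second structure is maintained for you, but asynchronously, so it can lag slightly behind writes to the base table.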

Data modelling

  • NoSQL DB modelling differs from traditional SQL DB modelling (3rd normal form schemas). The relational connections between elements, which a SQL DB expresses in its schema, have to be modelled as hierarchical composite-key constructs within 1 table.

  • Check the link above for a deep dive on modelling patterns for DynamoDB. E.g. composite partition / LSI keys built from different attributes.

  • Important: in NoSQL DBs, modelling the data structure is key. The model has to be aligned with the data access patterns, or the DB will be used inefficiently.

    • This means that NoSQL DBs are not very flexible, but very scalable.
    • For 1 service (app), usually 1 table is enough. The video provides an example where a 1-table setup satisfies 20+ data access patterns.

example on how to model version history in 1 table
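One common way to keep version history in a single table, which may or may not match the layout in the image, is to reserve the sort key `v0` for the current copy and append `v1`, `v2`, ... as immutable history. The sketch below models that in memory; the key names and the `latest` counter attribute are assumptions for the example.

```python
table = {}  # (partition key, sort key) -> attributes


def save_version(doc_id, **attrs):
    """Append a new immutable version and refresh the 'v0' current copy."""
    head = table.get((doc_id, "v0"), {"latest": 0})
    version = head["latest"] + 1
    table[(doc_id, f"v{version}")] = dict(attrs)          # history entry
    table[(doc_id, "v0")] = {**attrs, "latest": version}  # current copy
    return version


def current(doc_id):
    """Fetch the latest version with a single key lookup."""
    return table[(doc_id, "v0")]


def history(doc_id):
    """Return all stored versions, oldest first."""
    rows = [(int(sk[1:]), attrs) for (pk, sk), attrs in table.items()
            if pk == doc_id and sk != "v0"]
    return [attrs for _, attrs in sorted(rows, key=lambda row: row[0])]


save_version("doc-1", title="Draft")
save_version("doc-1", title="Final")

print(current("doc-1"))
print(history("doc-1"))
```

In a real table the two writes in `save_version` would need to happen atomically, and string sort keys such as `v10` sort before `v2`, so a zero-padded version number would likely be preferable.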

aperkaz commented Aug 5, 2020

Images
dynamo-db-lsi
dynamo-db-gsi
maintaining-version-history
