@jaygiang
Created October 5, 2018 01:38
AWS Certified Developer Notes (June 2018 Release)

Dynamo DB


General Info

  • DynamoDB is schemaless
  • Use DynamoDB Streams when an item updated in the primary table must also be inserted into a secondary table
  • A good partition key has many distinct values, for example an automatically generated GUID
  • For large tables, use queries instead of scans
  • To run a query:
    • Specify a key condition expression in the query
    • Specify a partition key name and value in the equality condition
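
The two query requirements above can be sketched as a small helper that builds the arguments for a Query call. The parameter names mirror the DynamoDB Query API; the table name `Orders`, the attribute `CustomerId`, and the helper itself are hypothetical, and the sketch only builds the request, it does not call AWS.

```python
def build_query_params(table_name, pk_name, pk_value):
    """Build keyword arguments for a DynamoDB Query on one partition key."""
    return {
        "TableName": table_name,
        # Key condition expression: the partition key is tested with equality.
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        # Low-level attribute value format: {"S": ...} marks a string.
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }


# Hypothetical table and key names, for illustration only.
params = build_query_params("Orders", "CustomerId", "c-123")
```

With boto3 installed and credentials configured, a dict of this shape could be passed as `client.query(**params)`.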

Accelerator DAX

  • provides in-memory caching for DynamoDB tables
  • improves response times for Eventually Consistent reads only
  • Point API calls at the DAX cluster instead of the table

Time to Live (TTL)

  • allows you to define when items in a table expire so that they can be automatically deleted from the database
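
TTL reads the expiry time from a Number attribute holding seconds since the Unix epoch. A minimal sketch of computing that value follows; the attribute name `ExpiresAt` and the 7-day lifetime are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def expiry_epoch(created_at, days_to_live):
    """Return the expiry time as whole seconds since the Unix epoch."""
    return int((created_at + timedelta(days=days_to_live)).timestamp())


created = datetime(2018, 10, 5, tzinfo=timezone.utc)
# "ExpiresAt" is a hypothetical attribute name; TTL must be enabled on it.
item = {"SessionId": "abc", "ExpiresAt": expiry_epoch(created, 7)}
```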

DynamoDB Encryption

  • Offers full SSE-KMS encryption at rest
  • Enable during creation of DynamoDB Table

Indexes

  • indexes - enable fast queries on specific data columns, faster than scanning the whole table
  • give you a different view of your data based on alternate partition/sort keys
  • 2 types of indexes:
    • Local Secondary Index
      • must be created when you create the table
      • same partition key as your table
      • different sort key
    • Global Secondary Index
      • Can create any time - at table creation or after
      • different partition key
      • different sort key

Parallel Scans

  • faster than sequential scan
  • when to use:
    • when table size is 20gb or larger
    • table's provisioned throughput is not fully used
    • sequential scan operations are too slow
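
A parallel scan splits the table into segments and gives each worker its own `Segment` number out of `TotalSegments`. The sketch below only builds the per-worker Scan arguments; the table name and the choice of 4 segments are assumptions.

```python
def segment_scan_params(table_name, total_segments):
    """Return one set of Scan arguments per worker."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    return [
        # Segment numbers are zero-based and must be below TotalSegments.
        {"TableName": table_name, "Segment": i, "TotalSegments": total_segments}
        for i in range(total_segments)
    ]


workers = segment_scan_params("Orders", 4)  # hypothetical table name
```

Each dict would be handed to a separate thread or process, which pages through its own segment independently.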

Global Tables

  • deploys multi-region, multi-master database w/o having to build or maintain own replication solution

DynamoDB Streams

  • captures a time-ordered sequence of item-level modifications in any table and stores this information in a log for 24 hours.

Encryption at Rest

  • Create a new table with encryption enabled
  • Copy data from existing table to new table

S3


General Info

  • Read-after-write Consistency - If you write a new key to S3, you can retrieve that object immediately afterwards. Any newly created object is visible immediately, without delay
  • Objects stored in your bucket before you set the versioning state have a version ID of null.
    • when you enable versioning, existing objects in your bucket do not change
    • what changes is how S3 handles the objects in future requests
  • 409 Conflict - returned, for example, when the requested bucket name already exists (BucketAlreadyExists) or when deleting a bucket that is not empty (BucketNotEmpty); a bucket that does not exist returns 404 (NoSuchBucket)
  • S3 buckets in ALL regions provide read-after-write consistency for PUTs of new objects and eventual consistency for overwrite PUTs and DELETEs
  • S3 is perfect for uploading videos
  • Can enable server side encryption for all objects stored in S3 through bucket policy
  • Server access logs, once enabled, accumulate over time and take up more space in your S3 bucket
    • Use a lifecycle configuration if you want to delete them over time
    • To require MFA on an S3 bucket, add a bucket policy statement that denies requests when aws:MultiFactorAuthPresent is false

S3 API

  • Multipart upload - API enables you to upload large objects in parts or make copy of an existing object
    • 3-step-process -
      1. initiate upload
      2. upload object parts
      3. after uploaded all parts, you complete the multipart upload
  • Multi-Object Delete - Delete large number of objects from S3
  • If KMS encryption is enabled, each request makes extra KMS API calls; these can be throttled, which causes performance issues
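
For the "upload object parts" step of the multipart flow above, the object has to be cut into numbered parts. This sketch plans the byte ranges only and does not talk to S3; it assumes the documented S3 limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts, and `plan_parts` is a hypothetical helper name.

```python
MIN_PART = 5 * 1024 * 1024  # 5 MiB minimum for every part except the last
MAX_PARTS = 10_000          # S3 allows at most 10,000 parts per upload


def plan_parts(total_size, part_size):
    """Return (part_number, start_byte, end_byte_exclusive) for each part."""
    if part_size < MIN_PART:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    start = 0
    while start < total_size:
        end = min(start + part_size, total_size)
        parts.append((len(parts) + 1, start, end))  # part numbers start at 1
        start = end
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    return parts


# A 12 MiB object in 5 MiB parts yields two full parts and a 2 MiB tail.
parts = plan_parts(12 * 1024 * 1024, 5 * 1024 * 1024)
```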

s3 Performance Optimization

2 main approaches to Performance optimization:

  1. GET-Intensive workloads - Use Cloudfront
  2. Mixed-Workloads - Avoid sequential key names for your s3 objects.
    • Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition
      • exampleawsbucket/12310sd-13-3-11/photo1.jpg
  • CloudFront - uses edge locations (where content is cached; content can also be written through them)
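
The hashed-prefix idea for mixed workloads can be sketched as follows. Using the first 4 hex characters of a SHA-256 digest is an assumption made for illustration; the notes only say "a random prefix like a hex hash".

```python
import hashlib


def prefixed_key(key, prefix_len=4):
    """Prepend a short hex hash of the key so names do not sort sequentially."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# Hypothetical date-based key that would otherwise sort sequentially.
new_key = prefixed_key("2018-10-05/photo1.jpg")
```

Because the prefix is derived from the key itself, the same object name always maps to the same prefixed key, so readers can recompute it without a lookup table.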

Permissions

  • For static website hosting, do not forget to grant EVERYONE read permission on your index.html
  • By default all S3 resources are private; only the AWS account that created the resources can access them
    • To allow read access, add a bucket policy that allows the s3:GetObject permission with a condition on the aws:Referer key, so that the GET request must originate from specific webpages
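
A bucket policy of the kind described above might look like the document built below. The bucket name, the referer URL, and the statement ID are placeholders; the sketch only produces the JSON and does not attach it to a bucket.

```python
import json


def referer_policy(bucket, allowed_referers):
    """Build a bucket policy allowing s3:GetObject only for given referers."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowGetFromSpecificPages",  # hypothetical statement ID
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # The request's Referer header must match one of these patterns.
            "Condition": {"StringLike": {"aws:Referer": allowed_referers}},
        }],
    }


policy = referer_policy("examplebucket", ["http://www.example.com/*"])
policy_json = json.dumps(policy, indent=2)
```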

Encryption

  • Buckets can contain both encrypted and non-encrypted objects
  • Use x-amz-server-side-encryption in request header to cause an object to be SSE

Server-Side Encryption w/ Amazon S3-Managed Keys (SSE-S3) - Each object is encrypted with a unique key employing strong multi-factor encryption.

  • As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates.
  • S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard(AES-256), to encrypt your data

Use Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) - Uses an envelope key (a key that protects your data's encryption key) that provides added protection against unauthorized access of your objects in S3

  • similar to SSE-S3, but w/ additional benefits
  • provides an audit trail of when your keys were last used and by whom
  • Option of managing encryption keys on your own
  • Encryption Process
    1. use customer master key to generate a data key for the encryption process
    2. use plaintext data encryption key to encrypt data locally, then erase from memory
    3. Store encrypted data key alongside the locally encrypted data

Use Server-Side Encryption with Customer-Provided Keys (SSE-C) - You manage the encryption keys; S3 manages encryption as it writes to disk and decryption when you access your objects

Storage Numbers

  • Total volume of data and number of objects you can store are unlimited
  • Individual S3 object - range in size from 0bytes - 5TB
  • Largest object in single put - 5GB
  • Objects more than 100MB - use multipart upload
  • Max Number of s3 buckets by default - 100 buckets per account

IAM


General Information

  • Best practice for IAM is to create roles that have specific access to an AWS service and then give users permission to the service via the role
  • AWS STS AssumeRole - API call that returns temporary credentials for a role; you pass the ARN of the role to assume
  • To authenticate and authorize users to access images in an S3 bucket:
    • Authenticate user at the application level, use STS tokens to grant access
    • use a key-based naming scheme built from the UserIDs for all user objects in a single S3 bucket
  • For Elastic Container Service, to ensure that container instances cannot access other containers, use Task IAM Roles
  • aws sts decode-authorization-message:
    • decodes additional information about the authorization status of a request from an encoded error message.
    • creates human readable error message
    • creates human readable error message

Testing custom policies

  1. Get context keys first
  2. Use the aws iam simulate-custom-policy command

Amazon Cognito


User Pools Cognito:

  • sign in & sign up services
  • social sign in with FB, Google, etc
  • user directory management
  • comes with MFA

Identity Pools(Federated Identities)

  • Provides temp AWS credentials for users who are guests(unauthenticated) and for users who have been authenticated & receive a token.
  • Identity Pool is a store of user identity data specific to your account
  • Supports authentication with SAML

Cognito Streams

  • give developers control and insight into their data stored in Cognito

ELB - Elastic Load Balancing


General Information

  • Uses Route 53 and DNS for request routing

Elasticache

  • external in-memory cache key-value store, required for improving session management
  • Makes it easy to deploy and run Memcached or Redis
  • Session state in a central location, so all web servers can share a single copy.
  • Allows ELB to send requests to any web server, better distribution
  • Auto-scaling can terminate web servers without losing session state info
  • Redis
    • Redis sorted sets help build applications that feature leaderboards
    • Use Redis if you want high availability

Sticky Sessions

  • non-even distribution of sessions across ELB
  • ELB sends every request from a specific user to the same web server
  • greatly limits elasticity
    1. ELB cannot distribute traffic evenly, sends a disproportionate amount of traffic to one server
    2. Auto-scaling cannot terminate web servers w/o losing some user's session state

Access Logs

  • captures detailed information about requests sent to your load balancer

In VPC

  • Can make an internal load balancer or internet-facing load balancer
  • Internet-facing load balancer - create in public subnet

EC2 - Elastic Compute Cloud


General Information

  • if you get an UNPROTECTED PRIVATE KEY FILE error, run chmod 400 yourPem.pem
  • to bootstrap an instance with a script, place the script in the UserData for the instance

EIP - Elastic IP Address

  • static IPv4 address designed for dynamic cloud computing
  • Can mask failure of an instance or software by rapidly remapping the address to another instance in your account

AMI - Amazon Machine Images

  • Can share AMI with specific AWS accounts w/o making the AMI public.
  • Can only be shared within a region
  • AMIs are regional resource. Therefore, sharing an AMI makes it available in that region
    • To make available in a different region, copy the AMI to the region and then share it.

EC2 Volumes

EBS - Elastic Block Store

  • By Default, root volume is deleted when the instance terminates
  • Data on other EBS volumes persists after instance terminates by default
  • If EC2 is launched with default parameters, DeleteOnTermination will be true for the root volume

Instance Stores

  • Data persists only during the life of the instance/ data lost when instance terminated

EBS Encryption

  • Uses SSE-KMS encryption with Customer Master Keys(CMKs)
  • CMK created automatically, but you can create your own

EC2 API

  • RegisterImage - Final step when creating an AMI
  • DescribeImages - Describe one or more images(AMIs) that are available to you

VPC - Virtual Private Cloud


General Info

  • EC2 will need EIP or public IP assigned to it in order to connect to the Internet and send data in or out
  • VPC does not charge, no hourly rate

Route Table

  • contains set of rules called routes that are used to determine where network traffic is directed
  • Each subnet MUST be associated with a route table.
  • A subnet can only be associated with one route table
  • A route table could have multiple subnets

VPN - Virtual Private Network

  • Connect your VPC to remote networks by using VPN connection

SQS - Simple Queue Service


General Information

SQS - fast, reliable, scalable, fully managed message queuing service.

  • makes simple & cost-effective to decouple the components of a cloud application
  • You can use SQS to transmit any volume of data, without losing messages or requiring other services to be always available

fanout - common design pattern where message published to an SNS topic is distributed to a number of SQS queues in parallel

  • can build applications that take advantage of parallel, asynchronous processing
  • SQS messages can be delivered to applications that require immediate notification of an event and messages are also persistent in an Amazon SQS queue for other apps to process later in time
  • message attribute Name, type, value, and message body should not be empty or null
  • if pricing has 2 tiers(customer & guests), use SQS to process application by high priority queue first for the customer

Caching Strategies

Lazy Loading - loads data into the cache only when requested

  • Advantages
    • Only requested data is cached
    • Node failures are not fatal
  • Disadvantages
    • cache miss penalty
    • stale data

Write-Through - adds data or updates data in the cache whenever data is written to the database

  • Advantages
    • Data is never stale
    • Write penalty instead of read penalty: every write takes 2 trips (write to cache, write to db)
  • Disadvantage
    • Missing Data
    • Waste of resources since some are never read
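
The two strategies can be compared with a small in-memory sketch. A plain dict stands in for the cache and the `Store` class stands in for the database; both are assumptions for illustration, not ElastiCache or any AWS API.

```python
class Store:
    """Stand-in for a database: a dict that counts reads."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


def lazy_get(cache, db, key):
    """Lazy loading: fill the cache only on a miss."""
    if key in cache:
        return cache[key]      # cache hit, no database trip
    value = db.get(key)        # cache miss penalty: extra round trip
    cache[key] = value
    return value


def write_through_put(cache, db, key, value):
    """Write-through: every write goes to the database and the cache."""
    db.put(key, value)
    cache[key] = value


db, cache = Store(), {}
db.put("user:1", "alice")
first = lazy_get(cache, db, "user:1")    # miss, reads the database
second = lazy_get(cache, db, "user:1")   # hit, served from the cache
db.put("user:1", "bob")                  # direct write bypasses the cache
stale = lazy_get(cache, db, "user:1")    # still returns the old cached value
write_through_put(cache, db, "user:1", "carol")
fresh = lazy_get(cache, db, "user:1")    # cache was updated by the write
```

The `stale` read shows the lazy-loading disadvantage, and the `fresh` read shows why write-through data is never stale.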

Storage Numbers

Max SQS Message size - 256KB

  • Use SetQueueAttributes to set MaximumMessageSize attribute
  • To send messages larger than 256KB, use Amazon SQS Extended Client Library for Java

MAX SQS queues created - no limit

SQS free tier - 1 million requests per month

MAX SQS maximum visibility timeout - 12 hours

SQS PCI DSS certified - yes it is

anonymous access - Yes it is allowed

Queue Types

Standard Queues(default) -best-effort ordering; message delivered at least once

  • loose FIFO capability
  • receiving message in exact order is not guaranteed

FIFO Queues(First in first out) - ordering strictly preserved, message delivered once, no duplicates

Retrieving Messages

Short Polling - returns immediately, even if the message queue being polled is empty

Long Polling - doesn't return response until a message arrives in the message queue, or the long poll times out

  • makes it inexpensive to retrieve messages from your SQS

  • use to reduce costs, because it reduces empty receives

  • To enable: set value of ReceiveMessageWaitTimeSeconds to greater than 0 and less than or equal to 20 seconds

  • Cost Effective - use Long Polling and SQS API in Batches
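
Long polling and batching can be requested per call through the `WaitTimeSeconds` and `MaxNumberOfMessages` parameters of ReceiveMessage. The sketch below only builds and validates those arguments; the queue URL is a placeholder and the helper name is hypothetical.

```python
def receive_params(queue_url, wait_seconds=20, batch_size=10):
    """Build ReceiveMessage arguments that use long polling and batching."""
    if not 1 <= wait_seconds <= 20:
        raise ValueError("long polling needs a wait time of 1 to 20 seconds")
    if not 1 <= batch_size <= 10:
        raise ValueError("a single receive returns 1 to 10 messages")
    return {
        "QueueUrl": queue_url,
        "WaitTimeSeconds": wait_seconds,     # greater than 0 enables long polling
        "MaxNumberOfMessages": batch_size,   # fetch up to 10 messages per call
    }


# Placeholder URL, not a real queue.
params = receive_params("https://sqs.example.invalid/123456789012/demo-queue")
```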

SNS - Simple Notification Service


General Information

  • What you're expected to see in SNS message body:
    • Type
    • TopicArn
    • Subject
    • Signature
    • MessageId
    • Message
    • Timestamp
    • SignatureVersion
    • SigningCertURL
    • UnsubscribeURL

Process of SNS to mobile

  1. submit notification credentials to SNS
  2. receive Registration ID for each mobile device
  3. pass device token to SNS
  4. SNS creates a mobile subscription endpoint for each device

Amazon CloudWatch


General Information

  • Real time application and system monitoring
  • track metrics, collect & monitor log files, set alarms
  • High-resolution metric - you can set alarm and specify a high-resolution alarm with a period of 10 seconds or 30 seconds
  • if error data is being received intermittently, then collect and aggregate the results at regular intervals, then send the data to CloudWatch
  • Install the CloudWatch agent on an instance, then configure it to send the web server's logs to a central location in CloudWatch

CI / CD


CodeCommit

Cross-Account Role

  • You can configure access to AWS CodeCommit repositories for IAM users and groups in another AWS account.
    • Create cross account role, give the role the privileges.
    • Provide the role ARN to the developers

CodeBuild

  • fully managed build service in cloud
    • compiles your source code, runs unit tests, and produces artifacts that are ready to deploy
  • Use AWS CLI to specify different parameters that need to be run for the build
    • Run command buildspec-location property to set new buildspec.yml file

CodePipeline

  • continuous delivery service that enables you to model, visualize and automate steps required to release your serverless application

  • if code will be picked up from S3 bucket and would like to encrypt at rest:

    • Ensure server-side encryption is enabled on S3 bucket
    • Configure AWS KMS with customer managed keys and use it for S3 bucket encryption
  • Use one account for pipeline and another for AWS CodeDeploy for security reasons

    • to do so, must create customer master key in KMS and add cross-account access
  • You can build custom action for your pipeline

  • CodePipeline Wizard - creates S3 artifact bucket and default AWS-managed SSE-KMS encryption keys

  • If failure detected in build stage then the entire process will stop

  • Jenkins

    • if you use Jenkins as build provider, then configure an EC2 instance with Jenkins installed, then allow the IAM Role for EC2 to access CodePipeline

CodeDeploy

  • provides deployments according to established best-practice methods
  • AppSpec file can be in JSON or YAML, and can be changed in console
    • tells what lambda version to deploy
    • tells which function to be used as validation tests
  • Specify the --with-decryption option when reading encrypted parameters (e.g. with aws ssm get-parameters); this lets the deployment decrypt a password so that it can be used in the application
  • Use IAM Roles to ensure the CodeDeploy service can access KMS service
  • 3 ways traffic can shift during deployment
    • canary - shift traffic in two increments
    • linear - shift traffic in equal increments
    • All at Once - All traffic shifted from original lambda function at once
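
The three traffic-shifting styles can be sketched as lists of cumulative percentages routed to the new Lambda version. The function names and the example percentages are illustrative assumptions, not CodeDeploy configuration names.

```python
def canary_schedule(first_percent):
    """Canary: shift a first slice, then all remaining traffic."""
    if not 0 < first_percent < 100:
        raise ValueError("first_percent must be between 0 and 100")
    return [first_percent, 100]


def linear_schedule(step_percent):
    """Linear: shift equal increments; the last step is clipped at 100%."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be in the range 1 to 100")
    return list(range(step_percent, 100, step_percent)) + [100]


def all_at_once_schedule():
    """All at once: everything moves in a single step."""
    return [100]


canary = canary_schedule(10)   # 10% first, then the remaining 90%
linear = linear_schedule(25)   # four equal steps of 25%
```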

CodeStar

  • Can develop, build, and deploy applications on AWS
  • Integrates AWS services for your project toolchain
  • Helps manage complete lifecycle of a project

Lambda


General Information

  • can increase limit on concurrency on Lambda executions
    • i.e. a recursive Lambda function
    • concurrency - when 2 tasks overlap execution
    • Suggested to avoid using recursive code all together
  • can create different environment variables in Lambda function to point to different services
    • i.e. dev, test, production
  • to access data in VPC, must configure:
    1. Subnet ID
    2. Security Group ID
  • Can change the timeout for a Lambda function
  • To validate if your code is working as expected:
    • insert logging statements into your code
    • Lambda automatically integrates with Amazon Cloudwatch Logs
      • Need to enable in IAM role
    • NOT Cloudwatch metrics, since metrics will only give the rate at which the function is executing, will not actually help you debug
  • If deployment package of lambda has many external libraries:
    • Selectively include only the libraries that are required
  • Default settings for a Lambda function are a 3 second timeout and 128 MB of memory

Dead Letter Queue

  • Any Lambda function invoked asynchronously is retried twice before the event is discarded
  • If retries fail, use Dead Letter Queue to direct unprocessed events to SQS or SNS

X-ray

  • See traces of your Lambda function, with detailed tracing through to your downstream services
  • Use if you would like to find out how to increase performance
  • if hosted on an EC2 instance and unable to see X-Ray traces, make sure the X-Ray daemon is installed and ensure the IAM role attached to the instance has permission to upload data to X-Ray
  • To enable X-Ray, assign AWSXrayWriteOnlyAccess to the Lambda function's execution role so it has access to the X-Ray service

CloudTrail

  • Captures API calls and sends to S3 bucket
  • records what request was made, source IP, who made the request, when the request was made, etc.

Lambda@Edge

  • allows you to run code across aws locations globally without provisioning or managing servers to be triggered by Amazon Cloudfront requests
  • extension of Lambda, compute service that lets you execute functions that customize the content that CloudFront delivers.

Step Functions

  • allows you to visualize and test serverless apps in a series of steps
  • automatically triggers and tracks each step and stops when errors.
  • logs the state of each step so you can diagnose what is wrong

ALIAS with --routing-config

  • alias points to a single function version
    • when an alias is updated to point to a different function version, all requests instantly go to the updated version
    • this exposes the alias to potential instabilities
  • --routing-config helps with this by allowing you to point to two different versions of the lambda function and dictate what percentage of incoming traffic is sent to each version
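
The routing configuration for an alias is a map from the additional version to the fraction of traffic it should receive; the rest stays on the alias's primary version. A minimal sketch of building that structure follows, assuming version "2" and a 10% weight as example values.

```python
def alias_routing_config(new_version, new_weight):
    """Send new_weight (0.0 to 1.0) of traffic to new_version."""
    if not 0.0 <= new_weight <= 1.0:
        raise ValueError("new_weight must be between 0.0 and 1.0")
    # Version numbers are passed as strings in the weights map.
    return {"AdditionalVersionWeights": {str(new_version): new_weight}}


# Example values only: route 10% of invocations to version 2.
routing = alias_routing_config(2, 0.1)
```

A dict of this shape corresponds to the `RoutingConfig` argument of the Lambda UpdateAlias call.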

RDS


General Information

  • RDS supports Transparent Data Encryption(TDE) to encrypt stored data on your DB instances running Microsoft SQL servers

API Gateway


  • API Stage - If customers need to switch to a new API version within a certain amount of time, use an API stage to create 'v2'
  • stage variables - name-value pairs that you can define as config attributes associated with a deployment stage of an API
    • act like environment variables
  • API Frontend Interaction
    • Modify Method Request and Method Response
  • API Backend Interaction
    • Modify Integration Request and Integration Response
  • If need to interact with backend(DynamoDB), then must create integration request to forward incoming method request
  • For client to call your API, you must create a deployment and associate a stage to it
  • define Request and Response Data Mapping if one content type is JSON and other is XML
  • To control access to API gateway use AWS Cognito User Pool or Lambda Authorizers
  • Canary Release Deployment - api traffic separated to production release and canary release
    • updated api features only visible in canary
    • good for test coverage or performance
  • setting up RESTful API
    • an api gateway with a lambda function to process customer information
    • Expose GET method in API Gateway
  • To customize error response set up gateway response to API

Amazon Elastic Beanstalk


General Information

  • Configuration files can be in YAML or JSON and saved in .ebextensions directory
    • created and managed locally
  • if currently on t1.micro and want to change to m4.large, then use Auto Scaling Group CLI command
    • When you create a web server environment, Elastic Beanstalk creates one or more EC2 VMs to run web apps on the platform you choose
  • if planning to deploy on a worker environment, use cron.yaml for periodic tasks
  • Runs on EC2 instances that have no persistent local storage
  • Custom AMI can improve provisioning times when instances are launched in your environment if you need to install a lot of software that isn't included in standard AMIs

Application Lifecycle policy

  • Every time you upload a new version of your application, Elastic Beanstalk creates a new application version; if you don't delete old ones, you will reach the application version limit
  • a lifecycle policy helps by deleting old versions, either by age or when the total number of versions exceeds a limit

Custom Platforms

  • if you can't see any relevant platform in the Beanstalk service (e.g. Docker), then use custom platforms to create one from scratch

Deployment Options

All at once – Deploy the new version to all instances simultaneously. All instances in your environment are out of service for a short time while the deployment occurs.

Rolling – Deploy the new version in batches. Each batch is taken out of service during the deployment phase, reducing your environment's capacity by the number of instances in a batch.

Rolling with additional batch – Deploy the new version in batches, but first launch a new batch of instances to ensure full capacity during the deployment process.

Immutable – a temporary Auto Scaling group is launched outside of your environment with a separate set of instances.

  • old and new instances serve traffic until new instance pass health checks
  • then new instances are moved to your current Auto Scaling Environment, then temp Auto Scaling Group and instances are terminated

Blue/Green Deployments - deploy new version to a separate environment, then swap CNAMEs to redirect traffic to the new version instantly

Elastic Container Service


General Information

  • ECS - highly scalable container orchestration service that supports docker containers

General Security


  • Systems Manager Parameter Store - provides secure, hierarchical storage for configuration data management and secrets management
    • can store data such as passwords, db strings, and license codes as parameter values

Kinesis


General Information

  • Kinesis - ingest, analyze, and persist REAL TIME streaming data
  • If a stream has multiple shards, ordering is guaranteed only within a single shard, not across shards
  • Server side encryption is a feature in Amazon Kinesis

Kinesis Analytics

  • query data in your stream
  • build streaming applications using SQL
  • can preprocess data with Lambda

Kinesis Firehose

  • delivers real time streaming data to S3, Redshift, Elasticsearch, and Splunk
  • if need to transform data before sent to S3, use Lambda to transform

Encryption at Rest

  • Enable server-side data encryption for Kinesis Firehose.
    • ONLY possible if you use Kinesis stream as your data source
    • Data now only stored in Kinesis stream

CloudFormation


  • CloudFormation makes system engineers' lives easier, whereas Elastic Beanstalk (sets up automatically) makes lives easier for developers
  • define all resources needed for deployment
  • if you want to deploy a Lambda function to multiple AWS accounts, use CloudFormation because it is an infrastructure task, not a development one
  • if a CloudFormation template has a huge list of resources, break it into smaller manageable templates, then use AWS::CloudFormation::Stack to reference the other templates
  • if you need to configure software such as NGINX on EC2 instances, use the cfn-init helper script

Route 53


Route 53 Weighted

  • allows you to associate multiple resources with one domain name or subdomain so that you can choose how much traffic is routed to each resource
  • good for load balancing and testing new versions of software
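
With weighted routing, each record receives a share of traffic equal to its weight divided by the sum of all weights. A short sketch of that arithmetic follows; the record names `v1` and `v2` and their weights are made up for illustration.

```python
def traffic_shares(weights):
    """Map each record name to its fraction of traffic: weight / total."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical records: the current release and a new version under test.
shares = traffic_shares({"v1": 3, "v2": 1})
```

Here `v1` receives three quarters of the traffic and `v2` one quarter, which is the usual shape for testing a new version on a small slice of users.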

MISC.


  • to compensate for network latency use
    • retries in application code
    • Exponential backoff algorithm
      • progressively longer waits between retries for consecutive error responses
      • Can help stagger the rate of API calls
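
Retries with exponential backoff can be sketched as below. The base delay, cap, attempt count, and the use of random jitter are all assumptions for illustration; the AWS SDKs ship their own retry logic, which this does not reproduce.

```python
import random
import time


def backoff_delays(base, cap, attempts):
    """Delay before each retry: base * 2**n, capped at cap seconds."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


def call_with_retries(func, attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call func, retrying on any exception with jittered exponential backoff."""
    delays = backoff_delays(base, cap, attempts - 1)
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            # Jitter: wait a random time up to the current backoff delay,
            # which staggers clients that failed at the same moment.
            sleep(random.uniform(0, delays[attempt]))
```

The `sleep` parameter exists so callers and tests can substitute a no-op instead of actually waiting.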

2 Ways to create Restful API

  1. Lambda (used to host code) and API Gateway (used to expose the API and route to the right Lambda function)
  2. EC2 (create API on an EC2 instance) and Elastic Load Balancer (to do routing)

OpsWorks

  • OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments.

AWS Systems Manager Parameter Store

  • secure storage for configuration data management and secrets management
    • can store passwords, database strings, and license codes

Redshift

  • Data warehouse