jaygiang/notes.md

## notes.md

      
DynamoDB


General Info


DynamoDB is schemaless
Use DynamoDB Streams when an item updated in the primary table must also be inserted into a secondary table
For a good partition key, pick a high-cardinality attribute such as an automatically generated GUID
For large tables, use queries instead of scans
In order to work with search queries:

Specify a key condition expression in the query
Specify a partition key name and value in the equality condition
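
A minimal sketch of the two query requirements above, assuming a string partition key. The function only builds the request parameters (the names `KeyConditionExpression`, `ExpressionAttributeNames`, and `ExpressionAttributeValues` come from the DynamoDB Query API); the table name `Orders` and key `CustomerId` are made up for illustration.

```python
def build_query_params(table_name, partition_key_name, partition_key_value):
    """Build keyword arguments for a DynamoDB Query on a string partition key."""
    return {
        "TableName": table_name,
        # Key condition expression with an equality test on the partition key.
        "KeyConditionExpression": "#pk = :pk",
        # The '#pk' placeholder avoids clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": partition_key_name},
        # Low-level attribute value format: {"S": ...} marks a string.
        "ExpressionAttributeValues": {":pk": {"S": partition_key_value}},
    }


# Hypothetical table and key names, for illustration only.
params = build_query_params("Orders", "CustomerId", "c-123")
```

The resulting dictionary is shaped for a low-level SDK call; passing it to a real client is left out so the sketch runs without AWS credentials.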


DynamoDB Accelerator (DAX)


provides in-memory caching for DynamoDB tables
improves response times for Eventually Consistent reads only
Point API calls at the DAX cluster instead of the table

Time to Live (TTL)


allows you to define when items in a table expire so that they can be automatically deleted from the database
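
A small sketch of how an expiry value could be computed, assuming the TTL attribute holds a Number in Unix epoch seconds. The attribute names `SessionId` and `ExpiresAt` are hypothetical.

```python
import time


def ttl_timestamp(days_to_live, now=None):
    """Return the expiry time as Unix epoch seconds."""
    current = int(time.time()) if now is None else int(now)
    return current + days_to_live * 24 * 60 * 60


# Hypothetical item: 'ExpiresAt' would be the attribute configured as the TTL attribute.
item = {
    "SessionId": {"S": "abc"},
    "ExpiresAt": {"N": str(ttl_timestamp(7, now=1_700_000_000))},
}
```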

DynamoDB Encryption


Offers full SSE-KMS encryption at rest
Enable during creation of DynamoDB Table

Indexes


indexes - enable fast queries on specific data columns, faster than querying the whole table
give you a different view of your data based on alternate partition/sort keys
2 types of indexes:

Local Secondary Index

must be created when you create the table
same partition key as your table
different sort key


Global Secondary Index

Can create any time - at table creation or after
different partition key
different sort key


Parallel Scans


faster than sequential scan
when to use:

when table size is 20 GB or larger
table's provisioned throughput is not fully used
sequential scan operations are too slow
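
A rough sketch of how a parallel scan divides the work, assuming one worker per segment. `Segment` and `TotalSegments` are the Scan API parameter names; the table name is hypothetical and no request is sent.

```python
def scan_segments(table_name, total_segments):
    """Return one set of Scan parameters per worker."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    return [
        # Segments are numbered from 0 to total_segments - 1.
        {"TableName": table_name, "Segment": segment, "TotalSegments": total_segments}
        for segment in range(total_segments)
    ]


# Hypothetical table name; each dict would be handed to a separate worker.
worker_params = scan_segments("Orders", 4)
```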


Global Tables


deploys a multi-region, multi-master database without having to build or maintain your own replication solution

DynamoDB Streams


captures a time-ordered sequence of item-level modifications in any table and stores this information in a log for 24 hours.

Encryption at Rest


Create a new table with encryption enabled
Copy data from the existing table to the new table

S3


General Info


Read-after-write Consistency - If you write a new key to S3, you will be able to retrieve the object immediately afterwards. Also, any newly created object or file will be visible immediately, without any delay
Objects stored in your bucket before you set the versioning state have a version ID of null.

when you enable versioning, existing objects in your bucket do not change
what changes is how S3 handles the objects in future requests


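409 Conflict - Bucket name already exists (BucketAlreadyExists); a bucket that does not exist returns 404 Not Found (NoSuchBucket)
S3 buckets in ALL regions historically provided read-after-write consistency for PUTs of new objects and eventual consistency for overwrite PUTs and DELETEs; since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs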
S3 is perfect for uploading videos
Can enable server side encryption for all objects stored in S3 through bucket policy
Server access logs, once enabled, accumulate over time and take up more space in your S3 bucket

Use a lifecycle configuration if you want to delete them over time
To require MFA on an S3 bucket, add a bucket policy statement that denies requests when aws:MultiFactorAuthPresent is false


S3 API


Multipart upload - API enables you to upload large objects in parts or make copy of an existing object

3-step-process -

initiate upload
upload object parts
after uploaded all parts, you complete the multipart upload


Multi-Object Delete - Delete large number of objects from S3
If KMS encryption is enabled, each request makes extra KMS API calls, which can be throttled and cause performance issues
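
The three-step multipart flow above depends on splitting the object into numbered parts. This is a sketch of that planning step only, assuming the usual limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts; the 8 MiB default part size is an arbitrary choice.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # every part except the last must be at least this big
MAX_PARTS = 10_000               # part numbers run from 1 to 10,000


def plan_parts(object_size, part_size=8 * 1024 * 1024):
    """Return (part_number, offset, length) tuples covering the whole object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)  # last part may be shorter
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    return parts


MIB = 1024 * 1024
plan = plan_parts(20 * MIB)  # three parts: 8 MiB, 8 MiB, 4 MiB
```

Each tuple corresponds to one "upload part" request; the initiate and complete calls bracket the loop.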

s3 Performance Optimization

2 main approaches to Performance optimization:

GET-Intensive workloads - Use Cloudfront
Mixed-Workloads - Avoid sequential key names for your s3 objects.

Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition

exampleawsbucket/12310sd-13-3-11/photo1.jpg
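
One way to generate such a prefix, sketched under the assumption that a short hex digest of the key name is random enough to spread objects around. The prefix length of 4 and the use of SHA-256 are illustrative choices, not something the notes prescribe.

```python
import hashlib


def prefixed_key(key, prefix_len=4):
    """Prepend a short hex hash of the key so names are not sequential."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# The same input always maps to the same prefix, so the object can be found again.
object_key = prefixed_key("photo1.jpg")
```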


CloudFront - uses edge locations (where content is cached; content can also be written to them)

Permissions


Do not forget to give EVERYONE read permission on your index.html for Static Web Hosting
By default all S3 resources are private; only the AWS account that created the resources can access them

To allow read access, add a bucket policy that allows the s3:GetObject permission with a condition using the aws:Referer key, so that the GET request must originate from specific webpages
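
A sketch of what such a bucket policy could look like, built as a Python dictionary and serialized to JSON. The bucket name reuses the example from these notes; the referer URL and the `Sid` are hypothetical.

```python
import json


def referer_policy(bucket, allowed_referers):
    """Build a bucket policy that allows GetObject only for listed referers."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetFromKnownPages",  # hypothetical statement id
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Requests match only when the Referer header fits a listed pattern.
                "Condition": {"StringLike": {"aws:Referer": list(allowed_referers)}},
            }
        ],
    }


policy_json = json.dumps(
    referer_policy("exampleawsbucket", ["https://www.example.com/*"]), indent=2
)
```

The Referer header is easy to spoof, so this is better treated as a convenience filter than as real access control.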


Encryption


Buckets can contain both encrypted and non-encrypted objects
Use x-amz-server-side-encryption in request header to cause an object to be SSE

Server-Side Encryption w/ Amazon S3-Managed Keys (SSE-S3) - Each object is encrypted with a unique key employing strong multi-factor encryption.

As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates.
S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard(AES-256), to encrypt your data

Use Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) - Uses an envelope key (a key that protects your data's encryption key) that provides added protection against unauthorized access to your objects in S3

similar to SSE-S3, but with additional benefits
provides an audit trail of when your keys were used and by whom
Option of managing encryption keys on your own
Encryption Process

use customer master key to generate a data key for the encryption process
use plaintext data encryption key to encrypt data locally, then erase from memory
Store encrypted data key alongside the locally encrypted data


Use Server-Side Encryption with Customer-Provided Keys (SSE-C) - You manage the encryption keys, and S3 manages the encryption as it writes to disk and the decryption when you access your objects

Storage Numbers


Total volume of data and number of objects you can store are unlimited
Individual S3 object - ranges in size from 0 bytes to 5 TB
Largest object in single put - 5GB
Objects more than 100MB - use multipart upload
Max Number of s3 buckets by default - 100 buckets per account

IAM


General Information


Best practice for IAM is to create roles that have specific access to an AWS service, then give users permission to the service via the role
AWS STS AssumeRole - API call that returns temporary credentials for a role; you pass the ARN of the role to assume
To authenticate and authorize users to access images in an S3 bucket:

Authenticate the user at the application level, use STS tokens to grant access
Use a key-based naming scheme built from the UserIDs for all user objects in a single S3 bucket


For Elastic Container Service, to ensure that container instances cannot access other containers, use Task IAM Roles
aws sts decode-authorization-message:

decodes additional information about the authorization status of a request from an encoded error message.
creates human readable error message


Testing custom policies


Get context keys first
Use the aws iam simulate-custom-policy command

Amazon Cognito


User Pools Cognito:


sign in & sign up services
social sign in with FB, Google, etc
user directory management
comes with MFA

Identity Pools(Federated Identities)


Provides temp AWS credentials for users who are guests(unauthenticated) and for users who have been authenticated & receive a token.
Identity Pool is a store of user identity data specific to your account
Supports authentication with SAML

Cognito Streams


give developers control and insight into their data store in Cognito

ELB - Elastic Load Balancing


General Information

Uses Route 53 and DNS for request routing

ElastiCache


external in-memory cache key-value store, required for improving session management
Makes it easy to deploy and run Memcached or Redis 
Session state in a central location, so all web servers can share a single copy.
Allows ELB to send requests to any web server, better distribution
Auto-scaling can terminate web servers without losing session state info
Redis

Redis sorted sets helps sort applications featuring leaderboard
Use Redis if you want high availability


Sticky Sessions


non-even distribution of sessions across ELB
ELB sends every request from a specific user to the same web server
greatly limits elasticity

ELB cannot distribute traffic evenly, sends a disproportionate amount of traffic to one server
Auto-scaling cannot terminate web servers w/o losing some user's session state


Access Logs


captures detailed information about requests sent to your load balancer

In VPC


Can make an internal load balancer or internet-facing load balancer
Internet-facing load balancer - create in a public subnet

EC2 - Elastic Compute Cloud


General Information


If you get an UNPROTECTED PRIVATE KEY FILE error, run chmod 400 yourPem.pem
To bootstrap an instance with a script, place the script in the UserData for the instance

EIP - Elastic IP Address


static IPv4 address designed for dynamic cloud computing
Can mask failure of an instance or software by rapidly remapping the address to another instance in your account

AMI - Amazon Machine Images


Can share AMI with specific AWS accounts w/o making the AMI public.
An AMI can only be shared within a region
AMIs are a regional resource. Therefore, sharing an AMI makes it available in that region

To make available in a different region, copy the AMI to the region and then share it.


EC2 Volumes

EBS - Elastic Block Store

By default, the root volume is deleted when the instance terminates
Data on other EBS volumes persists after the instance terminates by default
If an EC2 instance is launched with default parameters, DeleteOnTermination is true for the root volume

Instance Stores

Data persists only during the life of the instance/ data lost when instance terminated

EBS Encryption

Uses SSE-KMS encryption with Customer Master Keys(CMKs)
CMK created automatically, but you can create your own

EC2 API


RegisterImage - Final step when creating an AMI
DescribeImages - Describe one or more images(AMIs) that are available to you

VPC - Virtual Private Cloud


General Info


EC2 will need EIP or public IP assigned to it in order to connect to the Internet and send data in or out
There is no charge for the VPC itself, no hourly rate

Route Table


contains set of rules called routes that are used to determine where network traffic is directed
Each subnet MUST be associated with a route table.
A subnet can only be associated with one route table
A route table could have multiple subnets

VPN - Virtual Private Network


Connect your VPC to remote networks by using VPN connection

SQS - Simple Queue Service


General Information

SQS - fast, reliable, scalable, fully managed message queuing service.

makes simple & cost-effective to decouple the components of a cloud application
You can use SQS to transmit any volume of data, without losing messages or requiring other services to be always available

fanout - common design pattern where message published to an SNS topic is distributed to a number of SQS queues in parallel

can build applications that take advantage of parallel, asynchronous processing
SQS messages can be delivered to applications that require immediate notification of an event and messages are also persistent in an Amazon SQS queue for other apps to process later in time
message attribute Name, type, value, and message body should not be empty or null
if pricing has 2 tiers(customer & guests), use SQS to process application by high priority queue first for the customer

Caching Strategies

Lazy Loading - loads data into the cache only when requested

Advantages

Only requested data is cached
Node failures are not fatal


Disadvantages

Cache miss penalty
Stale data


Write-Through - adds data or updates data in the cache whenever data is written to the database

Advantages

Data is never stale


Disadvantages

Write penalty: every write takes 2 trips (write to cache, write to DB)
Missing data until it is written or updated in the database (for example, on a new node)
Wasted resources, since some cached data is never read
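
The two strategies can be sketched with plain dictionaries. This is an illustration only: the `Database` class is a stand-in for a real data store, and the class names are made up.

```python
class Database:
    """Stand-in for a real database; counts reads to make cache hits visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class LazyLoadingCache:
    """Fill the cache only when a key is requested and missing."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit, may be stale
        value = self.db.read(key)    # cache miss penalty: extra trip to the DB
        if value is not None:
            self.cache[key] = value
        return value


class WriteThroughCache:
    """Update the cache on every write, so cached data is never stale."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def put(self, key, value):
        self.db.write(key, value)    # trip 1: database
        self.cache[key] = value      # trip 2: cache

    def get(self, key):
        return self.cache.get(key)   # keys never written through are missing
```

With lazy loading, a value changed directly in the database stays stale in the cache until it is evicted; with write-through, anything written before the cache existed is simply absent.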


Storage Numbers

Max SQS Message size - 256KB

Use SetQueueAttributes to set MaximumMessageSize attribute
To send messages larger than 256KB, use Amazon SQS Extended Client Library for Java

MAX SQS queues created - no limit
SQS free tier - 1 million requests per month
MAX SQS maximum visibility timeout - 12 hours
SQS PCI DSS certified - yes it is
anonymous access - Yes it is allowed
Queue Types

Standard Queues (default) - best-effort ordering; message delivered at least once

loose FIFO capability
receiving message in exact order is not guaranteed

FIFO Queues (first in, first out) - ordering strictly preserved, message delivered exactly once, no duplicates

Retrieving Messages

Short Polling - returns immediately, even if the message queue being polled is empty
Long Polling - doesn't return response until a message arrives in the message queue, or the long poll times out


makes it inexpensive to retrieve messages from your SQS


use to reduce costs, because it reduces empty receives


To enable: set value of ReceiveMessageWaitTimeSeconds to greater than 0 and less than or equal to 20 seconds


Cost Effective - use Long Polling and SQS API in Batches
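
A sketch of receive parameters that combine long polling with batching. `WaitTimeSeconds` and `MaxNumberOfMessages` are the ReceiveMessage parameter names; the limits checked here (1 to 20 seconds, 1 to 10 messages) are assumptions taken from the notes above and the API's batch limit, and the queue URL is a placeholder.

```python
def receive_params(queue_url, wait_seconds=20, batch_size=10):
    """Build ReceiveMessage parameters for long polling in batches."""
    if not 1 <= wait_seconds <= 20:
        raise ValueError("long polling needs a wait time between 1 and 20 seconds")
    if not 1 <= batch_size <= 10:
        raise ValueError("a single receive returns between 1 and 10 messages")
    return {
        "QueueUrl": queue_url,
        "WaitTimeSeconds": wait_seconds,    # greater than 0 turns on long polling
        "MaxNumberOfMessages": batch_size,  # fewer calls for the same messages
    }


# Placeholder URL, for illustration only.
params = receive_params("https://sqs.example.invalid/123456789012/my-queue")
```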


SNS - Simple Notification Service


General Information


What you're expected to see in an SNS message body:

Type
TopicArn
Subject
Signature
MessageId
Message
Timestamp
SignatureVersion
SigningCertURL
UnsubscribeURL
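
A quick sketch that checks a JSON message body against the field list above. The sample payload is invented and deliberately incomplete; treating every listed field as required is an assumption for illustration (in practice `Subject` only appears when one was set).

```python
import json

EXPECTED_FIELDS = {
    "Type", "TopicArn", "Subject", "Signature", "MessageId", "Message",
    "Timestamp", "SignatureVersion", "SigningCertURL", "UnsubscribeURL",
}


def missing_fields(raw_body):
    """Return the expected field names that are absent from the message body."""
    payload = json.loads(raw_body)
    return sorted(EXPECTED_FIELDS - set(payload))


# Hypothetical, partial notification body.
sample = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",
    "MessageId": "hypothetical-id",
    "Message": "hello",
    "Timestamp": "2020-01-01T00:00:00.000Z",
})
```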


Process of SNS to mobile


submit notification credentials to SNS
receive Registration ID for each mobile device
pass device token to SNS
SNS creates a mobile subscription endpoint for each device

Amazon CloudWatch


General Information


Real time application and system monitoring
track metrics, collect & monitor log files, set alarms
High-resolution metric - you can set a high-resolution alarm with a period of 10 seconds or 30 seconds
If error data is being received intermittently, collect and aggregate the results at regular intervals, then send the data to CloudWatch
Install the CloudWatch agent on an instance, then configure it to send the web server logs to a central location in CloudWatch
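
A sketch of the aggregate-then-send idea for intermittent error data. It collapses the samples from one interval into the four values of a CloudWatch statistic set (`SampleCount`, `Sum`, `Minimum`, `Maximum`); the sample numbers are made up and nothing is actually sent.

```python
def to_statistic_set(samples):
    """Aggregate one interval of samples into a statistic set."""
    if not samples:
        raise ValueError("nothing to aggregate for this interval")
    return {
        "SampleCount": float(len(samples)),
        "Sum": float(sum(samples)),
        "Minimum": float(min(samples)),
        "Maximum": float(max(samples)),
    }


# Hypothetical error counts collected during one interval.
stats = to_statistic_set([2, 0, 5])
```

Sending one aggregated datapoint per interval keeps the number of API calls low compared with publishing every sample.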

CI / CD


CodeCommit

Cross-Account Role

You can configure access to AWS CodeCommit repositories for IAM users and groups in another AWS account.

Create a cross-account role, give the role the privileges.
Provide the role ARN to the developers


CodeBuild


fully managed build service in cloud

compiles your source code, runs unit tests, and produces artifacts that are ready to deploy


Use AWS CLI to specify different parameters that need to be run for the build

Run the start-build command with the buildspecOverride property (--buildspec-override in the CLI) to point to a new buildspec.yml file


CodePipeline


continuous delivery service that enables you to model, visualize and automate steps required to release your serverless application


if code will be picked up from S3 bucket and would like to encrypt at rest:

Ensure server-side encryption is enabled on S3 bucket
Configure AWS KMS with customer managed keys and use it for S3 bucket encryption


Use one account for pipeline and another for AWS CodeDeploy for security reasons

to do so, must create customer master key in KMS and add cross-account access


You can build custom action for your pipeline


CodePipeline Wizard - creates S3 artifact bucket and default AWS-managed SSE-KMS encryption keys


If failure detected in build stage then the entire process will stop


Jenkins

If you use Jenkins as the build provider, configure an EC2 instance with Jenkins installed, then allow the IAM Role for EC2 to access CodePipeline


CodeDeploy


provides deployments according to established best-practice methods
AppSpec file can be in JSON or YML, and can be changed in console

tells what lambda version to deploy
tells which function to be used as validation tests


Specify the --with-decryption option when retrieving an encrypted parameter (for example, with aws ssm get-parameters); this allows the CodeDeploy service to decrypt the password so that it can be used in the application
Use IAM Roles to ensure the CodeDeploy service can access the KMS service
3 ways traffic can shift during deployment

canary - shift traffic in two increments
linear - shift traffic in equal increments
All at Once - All traffic shifted from original lambda function at once
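
The three shifting patterns can be compared by listing the cumulative percentage of traffic on the new version after each step. This is only an arithmetic sketch; the percentages are example inputs, and the timing between steps is left out.

```python
def canary(first_percent):
    """Two increments: a small first slice, then everything else."""
    if not 0 < first_percent < 100:
        raise ValueError("first_percent must be between 0 and 100")
    return [first_percent, 100]


def linear(step_percent):
    """Equal increments until all traffic has moved."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be between 0 and 100")
    steps = []
    shifted = 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        steps.append(shifted)
    return steps


def all_at_once():
    """A single step that moves all traffic."""
    return [100]
```

For example, `canary(10)` gives `[10, 100]` and `linear(25)` gives `[25, 50, 75, 100]`.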


CodeStar


Can develop, build, and deploy applications on AWS
Integrates AWS services for your project toolchain
Helps manage the complete lifecycle of a project

Lambda


General Information


can increase limit on concurrency on Lambda executions

i.e. a recursive Lambda function
concurrency - when 2 tasks overlap execution
Suggested to avoid using recursive code all together


can create different environment variables in Lambda function to point to different services

i.e. dev, test, production


to access data in VPC, must configure:

Subnet ID
Security Group ID


Can change the timeout for a Lambda function
To validate if your code is working as expected:

insert logging statements into your code
Lambda automatically integrates with Amazon Cloudwatch Logs

Need to enable in IAM role


NOT Cloudwatch metrics, since metrics will only give the rate at which the function is executing, will not actually help you debug


If deployment package of lambda has many external libraries:

Selectively only include the libraries that


Default settings for a Lambda function are a 3-second timeout and 128 MB of memory

Dead Letter Queue


Any Lambda function invoked asynchronously is retried twice before the event is discarded
If retries fail, use Dead Letter Queue to direct unprocessed events to SQS or SNS

X-ray


See traces of Lambda function which can allow you to see detailed level of tracing to your downstream services
Use it if you would like to find out how to increase performance
If hosted on an EC2 instance and unable to see X-Ray traces, make sure the X-Ray daemon is installed and ensure the IAM role attached to the instance has permission to upload data to X-Ray
To enable X-Ray, assign AWSXrayWriteOnlyAccess to the Lambda function so it has access to the X-Ray service

CloudTrail


Captures API calls and sends to S3 bucket
records what request was made, source IP, who made the request, when the request was made, etc.

Lambda@Edge


allows you to run code across aws locations globally without provisioning or managing servers to be triggered by Amazon Cloudfront requests
extension of Lambda, compute service that lets you execute functions that customize the content that CloudFront delivers.

Step Functions


allows you to visualize and test serverless apps in a series of steps
automatically triggers and tracks each step and stops when errors.
logs the state of each step so you can diagnose what is wrong

ALIAS with --routing-config


alias points to a single function version

when an alias is updated to point to a different function version, all requests instantly go to the updated version
this exposes you to potential instabilities


--routing-config helps with this by allowing you to point to two different versions of a Lambda function and dictate what percentage of incoming traffic is sent to each version
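
A sketch of the weighted split between two versions. `AdditionalVersionWeights` is the key used in the alias routing configuration; the version numbers, the 10% weight, and the `pick_version` helper that simulates the split are illustrative assumptions.

```python
import random


def routing_config(new_version, new_weight):
    """Send a fraction of traffic (0.0 to 1.0) to an additional version."""
    if not 0.0 <= new_weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    return {"AdditionalVersionWeights": {new_version: new_weight}}


def pick_version(primary_version, config, rng=random.random):
    """Simulate which version serves one request under the given weights."""
    roll = rng()
    for version, weight in config["AdditionalVersionWeights"].items():
        if roll < weight:
            return version
    return primary_version


# Hypothetical setup: version "1" is primary, version "2" gets 10% of requests.
config = routing_config("2", 0.1)
```

Raising the weight step by step and then repointing the alias gives a gradual rollout instead of an instant switch.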

RDS


General Information


RDS supports Transparent Data Encryption (TDE) to encrypt stored data on your DB instances running Microsoft SQL Server

API Gateway


API Stage - If customers need to switch to different new API within a certain amount of time, then use API stage to create 'v2'
stage variables - name-value pairs that you can define as config attributes associated with a deployment stage of an API

act like environment variables


API Frontend Interaction

Modify Method Request and Method Response


API Backend Interaction

Modify Integration Request and Integration Response


If need to interact with backend(DynamoDB), then must create integration request to forward incoming method request
For client to call your API, you must create a deployment and associate a stage to it
define Request and Response Data Mapping if one content type is JSON and other is XML
To control access to API gateway use AWS Cognito User Pool or Lambda Authorizers
Canary Release Deployment - api traffic separated to production release and canary release

updated api features only visible in canary
good for test coverage or performance


setting up RESTful API

an api gateway with a lambda function to process customer information
Expose GET method in API Gateway


To customize error response set up gateway response to API

Amazon Elastic Beanstalk


General Information


Configuration files can be in YAML or JSON and saved in .ebextensions directory

created and managed locally


If currently on t1.micro and want to change to m4.large, then use the Auto Scaling Group CLI command

When you create web server environment, Elastic Beanstalk creates one or more EC2 vm to run web apps on the platform you choose


if planning to deploy on worker role use cron.yaml
Runs on EC2 instances that have no persistent local storage
A custom AMI can improve provisioning times when instances are launched in your environment if you need to install a lot of software that isn't included in the standard AMIs

Application Lifecycle policy


Every time you upload a new version of your application, it creates a new application version; if you don't delete old ones, you will reach the application version limit
A lifecycle policy helps by deleting old versions, or deleting versions when the total limit has been exceeded

Custom Platforms


If you can't see any relevant environments in the Beanstalk service (i.e. Docker), then use custom platforms to create one from scratch

Deployment Options

All at once – Deploy the new version to all instances simultaneously. All instances in your environment are out of service for a short time while the deployment occurs.
Rolling – Deploy the new version in batches. Each batch is taken out of service during the deployment phase, reducing your environment's capacity by the number of instances in a batch.
Rolling with additional batch – Deploy the new version in batches, but first launch a new batch of instances to ensure full capacity during the deployment process.
Immutable – a temporary Auto Scaling group is launched outside of your environment with a separate set of instances.

old and new instances serve traffic until new instance pass health checks
then new instances are moved to your current Auto Scaling Environment, then temp Auto Scaling Group and instances are terminated

Blue/Green Deployments - deploy new version to a separate environment, then swap CNAMEs to redirect traffic to the new version instantly

Elastic Container Service


General Information


ECS - highly scalable container orchestration service that supports docker containers

General Security


Systems Manager Parameter Store - provides secure, hierarchical storage for configuration data management and secrets management

can store data such as passwords, db strings, and license codes as parameter values


Kinesis


General Information


Kinesis - ingest REAL TIME data, analyze, and persist streaming data
If a stream has multiple shards, ordering is not guaranteed across shards, only within a single shard
Server side encryption is a feature in Amazon Kinesis

Kinesis Analytics


query data in your stream
build streaming applications using SQL
can preprocess data with Lambda

Kinesis Firehose


delivers real-time streaming data to S3, Redshift, Elasticsearch, and Splunk
if need to transform data before sent to S3, use Lambda to transform

Encryption at Rest


Enabled server-side data encryption for Kinesis Firehose.

ONLY possible if you use Kinesis stream as your data source
Data now only stored in Kinesis stream


CloudFormation


CloudFormation makes system engineers' lives easier, whereas Elastic Beanstalk (sets up automatically) makes lives easier for developers
define all resources needed for deployment
If you want to deploy a Lambda function to multiple AWS accounts, use CloudFormation because it's infrastructure, not development
If a CloudFormation template has a huge list of resources, break it into smaller manageable templates, then use AWS::CloudFormation::Stack to reference the other templates
If you need to configure software such as NGINX on EC2 instances, use the cfn-init helper script

Route 53


Route 53 Weighted

allows you to associate multiple resources with one domain name or subdomain so that you can choose how much traffic is routed to each resource
good for load balancing and testing new versions of software
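
With weighted routing, each record's share of traffic is its weight divided by the sum of all weights. A small sketch of that arithmetic, with made-up record names and weights:

```python
def traffic_share(weights):
    """Map each record name to its fraction of traffic: weight / total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one record needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical records: the current release and a new version under test.
shares = traffic_share({"blue": 3, "green": 1})
```

Here `blue` receives three quarters of the requests and `green` one quarter, which is how a new software version can be tested on a small slice of traffic.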

MISC.


to compensate for network latency use

retries in application code
Exponential backoff algorithm

progressively longer waits between retries for consecutive error responses
Can help stagger the rate of API calls
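
A minimal sketch of retries with exponential backoff. The base delay, cap, retry count, and the random jitter added to each wait are illustrative choices, not values from the notes; `operation` stands for any call that may fail transiently.

```python
import random
import time


def backoff_delays(retries, base=0.1, cap=5.0):
    """Progressively longer waits: base, 2*base, 4*base, ... limited by cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retries(operation, retries=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call operation(), retrying on any exception with backoff between tries."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt, delay in enumerate(backoff_delays(retries, base, cap)):
        try:
            return operation()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the last error
            # Random jitter staggers callers that failed at the same moment.
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists so the waiting can be swapped out in tests; in normal use the default `time.sleep` applies.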


2 Ways to create Restful API


Lambda (used to host code) and API Gateway (used to access APIs and point to the right Lambda function)
EC2 (create the API on an EC2 instance) and Elastic Load Balancer (to do routing)

OpsWorks


OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments.

AWS Systems Manager Parameter Store


secure storage for configuration data management and secrets management

can store passwords, database strings, and license codes


Redshift


Data warehouse