Saphyel/2020 AWS Certified Solutions Architect Associate.md

## 2020 AWS Certified Solutions Architect Associate.md

      
            
          
    AWS Certified Solutions Architect Associate 2020

AWS Fundamentals


IAM?

EC2 is billed per second. t2.micro is free-tier eligible.
Restrict Security Groups to your IP
Timeout issues => usually Security Group issues
Security Groups can reference other Security Groups
Private IP (192.168.1.1) Public IP (1.2.5.6) Elastic IP (1.1.1.1)
SSH key permissions => chmod 0400 file
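
The `chmod 0400` step can also be done from a script. This is a minimal sketch, assuming a POSIX system; `restrict_key_permissions` is a hypothetical helper name, and the demo runs on a throwaway file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_key_permissions(path: str) -> int:
    """Make the file owner read-only (0400) and return the resulting mode bits."""
    os.chmod(path, 0o400)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demo on a temporary file standing in for a .pem key (placeholder content).
with tempfile.NamedTemporaryFile(delete=False, suffix=".pem") as fh:
    fh.write(b"placeholder key material")
    demo_key_path = fh.name

demo_mode = restrict_key_permissions(demo_key_path)
os.remove(demo_key_path)
```

SSH clients typically refuse private keys that are readable by other users, which is why the owner-only mode matters.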


| OS | Tool |
| --- | --- |
| Unix-like & Windows 10 | SSH |
| Legacy Windows | PuTTY |
| Any | EC2 Instance Connect |


EC2

EC2 User Data script executed at boot time
AMI for faster boot
AMI can be copied to different account/regions
EC2 requires VPC (AZ), EBS (Storage), Security Group (inbound/outbound), SSH


| EC2 Launch Mode | Description |
| --- | --- |
| On Demand | Pay for running it |
| Reserved | Pay for a contract |
| Spot instance | Bid for the instance |
| Dedicated host | Rent the machine |


| EC2 Instance Type | Useful for |
| --- | --- |
| T2/T3 | Burstable instances (optionally unlimited) |
| Medium | Web app |
| I/O | DB |
| RAM | Cache |
| CPU | Compute / DB |
| GPU | Machine learning / video rendering |


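| EC2 Placement Groups | Description |
| --- | --- |
| Partition | Instances spread across partitions (separate racks); can span multiple AZs |
| Cluster | Same AZ |
| Spread | Each instance on distinct hardware; can span different AZs |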


High Availability and scalability


Scalability
Vertical scalability

Increase the instance size. Good for RDS; referred to as scaling up and down.
Horizontal scalability

Increase the number of instances. Common in web apps; referred to as scaling out and in.


Load Balancer
Benefits


Spread load across multiple downstream instances
Exposes a single DNS name to your apps
Seamlessly handles failures
Provides SSL termination for your apps
Separates public from private traffic
Stickiness with cookies
High availability
Uses X.509 certificates, which you can manage in AWS Certificate Manager (ACM) or upload yourself

ALB


Multiple HTTP apps across machines (target groups)
Multiple apps on the same machine (containers)
Routing based on the path in the URL (ex.com/users)
Routing based on the hostname in the URL (users.ex.com)
Client IP is in the header X-Forwarded-For
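
Host-based and path-based routing can be pictured as an ordered rule list where the first match wins. The sketch below is only an illustration of that idea, not how an ALB is implemented; the rule table, the target group names and `pick_target_group` are all hypothetical, and prefix matching is simplified to `str.startswith`.

```python
from urllib.parse import urlsplit

# Hypothetical rules: (hostname or None, path prefix or None, target group).
RULES = [
    ("users.ex.com", None, "users-tg"),
    (None, "/users", "users-tg"),
    (None, "/orders", "orders-tg"),
]
DEFAULT_TARGET_GROUP = "default-tg"


def pick_target_group(url: str) -> str:
    """Return the target group of the first rule whose host and path both match."""
    parts = urlsplit(url)
    host = parts.hostname
    path = parts.path or "/"
    for rule_host, rule_prefix, target_group in RULES:
        if rule_host is not None and rule_host != host:
            continue
        if rule_prefix is not None and not path.startswith(rule_prefix):
            continue
        return target_group
    return DEFAULT_TARGET_GROUP
```

For example, `pick_target_group("http://ex.com/orders/7")` falls through the first two rules and lands on the `/orders` rule.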

NLB


Less latency than ALB
Forward TCP traffic
Support static IP or elastic IP
Directly see client IP

Stickiness


Can bring imbalance to the load across the backend instances
The same client is always redirected to the same instance behind the load balancer


Auto Scaling Group

Scales out up to the maximum number of machines, or scales in down to the minimum, depending on the load.
Scaling is triggered by CloudWatch alarms based on metrics.
You can use default metrics (CPU usage, requests, network in/out), custom metrics (number of connected users), or a schedule.
Default termination policy: first find the AZ with the most instances, then terminate the instance with the oldest launch configuration.
Cooldown prevents launching or terminating additional instances before the previous scaling activity takes effect (the default is 300 seconds; reducing it in the scale-in policy can be useful).
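
The termination rule above can be sketched in a few lines. This is a simplified illustration under stated assumptions: launch time stands in for "oldest", ties between AZs are broken alphabetically, and `Instance` and `pick_instance_to_terminate` are hypothetical names, not part of any AWS SDK.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    launched_at: int  # assumption: launch time as a Unix timestamp


def pick_instance_to_terminate(instances: List[Instance]) -> Instance:
    """Pick the oldest instance in the AZ that holds the most instances."""
    if not instances:
        raise ValueError("no instances to choose from")
    per_az = Counter(i.az for i in instances)
    # max() keeps the first maximum, so sorting makes ties alphabetical.
    busiest_az = max(sorted(per_az), key=per_az.get)
    candidates = [i for i in instances if i.az == busiest_az]
    return min(candidates, key=lambda i: i.launched_at)
```

Removing from the busiest AZ first keeps the group balanced across zones after a scale-in.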


EC2 Storage


Instance storage

Best I/O performance
Ephemeral storage (survives reboots only)
It's a physical drive
Only available on certain (typically larger) EC2 instance types


EBS

It's a network drive
Can be attached to only one instance at a time
It's locked to an AZ
Migrating an EBS volume to another AZ requires creating a snapshot
Snapshots use a lot of I/O
If the EC2 instance is terminated, the root EBS volume is deleted too by default
If disk I/O is high, increase the volume size (for gp2)
Billed by provisioned capacity
RAID 0: combines volumes to increase I/O, but if one disk fails you lose everything
RAID 1: increases fault tolerance by duplicating data
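
The "increase the volume size for gp2" tip follows from gp2 baseline IOPS growing with volume size. The sketch below assumes the commonly documented gp2 figures (3 IOPS per GiB, a floor of 100 and a ceiling of 16,000); those numbers are not in the notes, so check them against the current EBS documentation. `gp2_baseline_iops` is a hypothetical helper.

```python
GP2_IOPS_PER_GIB = 3     # assumption: documented gp2 ratio
GP2_MIN_IOPS = 100       # assumption: documented gp2 floor
GP2_MAX_IOPS = 16_000    # assumption: documented gp2 ceiling


def gp2_baseline_iops(size_gib: int) -> int:
    """Estimate the baseline IOPS of a gp2 volume from its size in GiB."""
    if size_gib < 1:
        raise ValueError("volume size must be at least 1 GiB")
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GIB * size_gib))
```

Under these assumptions, growing a volume from 100 GiB to 1,000 GiB raises the baseline from 300 to 3,000 IOPS.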


| EBS Type | Disk | Description |
| --- | --- | --- |
| GP2 | SSD | General purpose. Can be used as boot volume |
| IO1 | SSD | High performance. Can be used as boot volume |
| ST1 | HDD | Low cost |
| SC1 | HDD | Cheapest; bad at everything |


EFS

Can be mounted on many EC2 instances
Multi-AZ
Uses the NFSv4.1 protocol
Works with Linux AMIs only
Billed per use
Highly available and scalable; about 3x the price of gp2


AWS Fundamentals II

RDS
AWS offers OS patching, continuous backups, dashboards, read replicas, and Multi-AZ; there is no SSH access.
Read replicas are async, up to 5, and can be promoted.
Multi-AZ (disaster recovery) replication is sync and provides failover.
Backups are automatically enabled by AWS:
full snapshot of the DB, with logs captured in real time; retention is 7 days by default (up to 35)
Manual snapshots are kept until you delete them.
Encryption at rest; SSL for in-flight encryption of the data
To connect over SSL, use the SSL trust certificate.
RDS is usually deployed in a private subnet, and network security relies on Security Groups (which control communication).
IAM policies control who can manage the instance.
Log in to the DB with IAM users or a username/password.
You need to take care of inbound rules and in-DB users, and allow SSL.
RDS is slower and cheaper than Aurora.
Transparent Data Encryption (TDE) is only available for Oracle and SQL Server, and can be used on top of KMS.
IAM authentication works only on MySQL and PostgreSQL.
The auth token has a lifespan of 15 minutes, is generated using AWS credentials, and requires SSL.
Aurora
Is not open source
Cloud-optimised by AWS
Better performance; up to 15 replicas, and the replication process is faster
Instant failover
More expensive
Encryption at rest using KMS, encryption in flight like MySQL, IAM authentication.
Aurora Serverless
No need to choose an instance size; MySQL 5.6 only. The DB cluster starts, shuts down, and scales based on CPU/connections.
You can migrate from or to Aurora clusters.
Measured in Aurora Capacity Units (ACUs), billed in 5-minute increments.
Doesn't support all the features of a provisioned cluster.
Aurora global databases span multiple regions and enable DR. The DR region can also be used for lower-latency reads.
ElastiCache
It's like RDS, but for caches: write scaling using sharding and read scaling using read replicas
Helps make apps stateless.
The cache must have an invalidation strategy
Redis
It's an in-memory key-value store
Low latency
Has persistence by default
Good for: user sessions, leaderboards, distributed state, pub/sub messages
Lazy loading: all read data is cached; data can become stale in the cache.
Write-through: add/update data in the cache when it is written to the DB
Security
Redis has auth (user/password)
SSL in-flight encryption must be enabled and used
Memcached supports SASL
IAM auth is not supported
IAM policies are used for API security
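
The lazy loading and write-through strategies mentioned above can be compared with a plain dictionary as the cache. This is a toy sketch, not ElastiCache client code; `FakeDatabase`, `read_lazy` and `write_through` are hypothetical names.

```python
class FakeDatabase:
    """In-memory stand-in for a real database that counts its reads."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


def read_lazy(cache: dict, db: FakeDatabase, key):
    """Lazy loading: serve from the cache, fall back to the DB on a miss."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value  # only data that was read ends up cached
    return value


def write_through(cache: dict, db: FakeDatabase, key, value):
    """Write-through: update the DB and the cache in the same operation."""
    db.put(key, value)
    cache[key] = value
```

A write that bypasses `write_through` leaves the cached value stale until it is invalidated, which is the trade-off of lazy loading on its own.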

Route 53

It is a managed Domain Name System (DNS)
DNS is a collection of rules and records that help clients understand how to reach a server through a URL

A: hostname to IPv4
AAAA: hostname to IPv6
CNAME: hostname to hostname (non-root domains only)
Alias: hostname to AWS resource (free, with health checks)

Healthy: after 3 passed checks
Unhealthy: after 3 failed checks
Interval is 30 seconds, and about 15 health checkers will check the endpoint (so one check every 2 seconds on average)
Latency routing policy: latency is evaluated between the user and the designated AWS region
Geolocation routing policy: based on user location (requires a default policy for when no location matches)
Weighted routing policy: control the % of traffic per resource; can have health checks.
Simple routing policy: routes to a single resource, no health checks. If there are multiple values, the client chooses one.
Multi-value routing policy: returns up to 8 healthy records for each multi-value query.
Some domain registrars also come with DNS features
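
The "3 passed / 3 failed" health rule can be modelled as a small state machine. This sketch assumes the checks must be consecutive, which the notes do not state explicitly; `HealthTracker` is a hypothetical class, not a Route 53 API.

```python
class HealthTracker:
    """Flip the health state after `threshold` consecutive opposite results."""

    def __init__(self, threshold: int = 3, healthy: bool = True):
        self.threshold = threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, passed: bool) -> bool:
        """Record one check result and return the resulting health state."""
        if passed == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            if self._streak >= self.threshold:
                self.healthy = passed
                self._streak = 0
        return self.healthy
```

Requiring a streak avoids flapping when a single check fails because of a transient network problem.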

Classic Solutions

Elastic Beanstalk
Managed service with a configurable deployment strategy
It has 3 architecture models:
Single instance deployment: good for dev
LB + ASG: good for production and pre-production web apps
ASG only: good for production non-web apps

S3

Buckets
Names must be globally unique; buckets are defined at region level
Naming convention: no underscores, no uppercase, not an IP, 3-63 characters long, starts with a letter or number
Versioning is enabled at bucket level
Each new write of the same key creates a new version (think of it as 1, 2, 3)
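
The naming convention above translates into a short validator. This is a sketch of only the rules listed in these notes; the allowed character set (lowercase letters, digits, dots, hyphens) is an assumption added here, real S3 enforces further rules, and `is_valid_bucket_name` is a hypothetical helper.

```python
import ipaddress
import re

# Starts with a lowercase letter or digit, 3 to 63 characters in total.
# Assumption: only lowercase letters, digits, dots and hyphens are allowed.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the rules listed in the notes."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not formatted like an IP address
    return False
```

Running the check before calling the API gives a clearer error than a rejected create request.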
Objects
Keys are the full path
Values have a max size of 5TB; uploads of more than 5GB require multi-part upload
Tags - useful for security/lifecycle
Metadata - System/User metadata
Encryption
SSE-S3
Object is encrypted at server side.
Keys managed by S3
Must set the header: x-amz-server-side-encryption: AES256
SSE-KMS
Object is encrypted at server side.
Keys managed by KMS (User control + audit trail)
must set the header: x-amz-server-side-encryption: aws:kms
SSE-C
Object is encrypted at server side.
Keys outside of AWS
HTTPS must be used
encryption key must be provided for every request
Client side encryption
Client must encrypt before sending and decrypt after retrieving
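
The three server-side options differ mainly in the request headers sent with an upload. The sketch below builds those headers without calling AWS. The SSE-C header names are written from memory of the S3 REST API and are not in these notes, so treat them as assumptions to verify; `sse_headers` is a hypothetical helper.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str, customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the encryption headers for an S3 upload in the given mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        return {"x-amz-server-side-encryption": "aws:kms"}
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        key_b64 = base64.b64encode(customer_key).decode("ascii")
        md5_b64 = base64.b64encode(hashlib.md5(customer_key).digest()).decode("ascii")
        # Assumed header names: the key travels with every request, over HTTPS.
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": key_b64,
            "x-amz-server-side-encryption-customer-key-MD5": md5_b64,
        }
    raise ValueError(f"unknown mode: {mode}")
```

With SSE-C the key appears in the headers of every request, which is why HTTPS is mandatory for that mode.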
Security
Policies can grant public access, force objects to be encrypted at upload, and allow or deny a set of API actions
User-based security uses IAM policies
Resource-based security can be bucket policies or Object/Bucket Access Control Lists (ACLs)
S3 access logs can be stored in another S3 bucket
API calls can be logged in CloudTrail
MFA can be required to delete objects in versioned buckets
Signed URLs are valid only for a limited time
S3 websites
Host static websites
URL: my-bucket.s3-website.aws-region.amazonaws.com or my-bucket.s3-website-aws-region.amazonaws.com
Allow public reads to avoid 403 errors
CORS lets you limit which websites can request your files in S3, so other websites can't use them (reducing your AWS cost!)
Consistency Model
GET 404 -> PUT 200 -> GET 404 ~> GET 200
DELETE 200 -> GET 200
PUT 200 -> PUT 200 -> GET 200 (from the 1st)

Advanced S3

MFA-Delete
Requires: versioning enabled on the bucket; it can only be enabled by the bucket owner through the AWS CLI
Needed to: permanently delete an object version & suspend versioning
Not required for: enabling versioning, listing deleted versions
Bucket policies are evaluated before default encryption
Access logs record any request made to S3, from any account, allowed or denied, into another S3 bucket
Cross-region replication requires versioning to be enabled, and it's async
Pre-signed URLs: for downloads you can use the CLI, and for uploads you need the SDK
CloudFront
It's a CDN, improves read performance and cache content at the edge
can provide SSL encryption
supports RTMP protocol (videos/media)
Signed URLs need the SDK to generate the URL (for shared content they should last a few minutes; for private content they can last for years)
S3 Tiers
S3 Standard - GP
High availability and durability
S3 Standard-Infrequent Access - IA
Lower cost than GP (Good for backups)
S3 One Zone-Infrequent Access
Lower cost than IA; supports SSL in transit and encryption at rest
S3 Reduced Redundancy Storage (deprecated)
S3 Intelligent tiering
small monthly monitoring and auto-tiering fee
Glacier
Cost is storage per month + retrieval cost; each item is called an archive, and archives are stored in vaults
Lifecycle rules
Transition actions: Defines when objects are transitioned to another storage class
Expiration actions: S3 can delete expired items after a configured time.
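
A lifecycle rule combining a transition action and an expiration action might look like the dictionary below. The layout follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as far as I recall, but no AWS call is made here; the prefix, day counts and `build_lifecycle_rule` are hypothetical.

```python
def build_lifecycle_rule(prefix: str, ia_after_days: int,
                         glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one rule: move to IA, then to Glacier, then delete."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day thresholds must be strictly increasing")
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Transition actions: change the storage class as objects age.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Expiration action: delete the object after the configured time.
        "Expiration": {"Days": expire_after_days},
    }


lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
```

Keeping the thresholds strictly increasing avoids a rule that tries to archive an object after it has already expired.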
Snowball
Physical data transport solution that helps move TBs or PBs of data in or out of AWS
Data is encrypted with 256-bit keys managed by KMS
You need to request the device, install the client, transfer the data, and ship the device back
Snowball Edge adds computational capability (process data on the go with EC2 AMIs or Lambda functions)
Snowmobile is great for 10PB or more.
Storage Gateway
Bridge between on-premise data and cloud data in S3 (used in between app and S3)
File Gateway
accessible with NFS and SMB protocols
most recent data is cached
can be mounted on many servers
file access
Volume Gateway
accessible with the iSCSI protocol, backed by S3
backed by S3 and EBS volumes
most recent data is cached
Volume / Block storage
Tape Gateway
accessible with the iSCSI protocol, backed by S3
uses Virtual Tape Library backed by Glacier and S3
backup
Athena
Serverless service to analyse data directly on S3
Charged per query (SQL) and per amount of data scanned

Decoupling apps

SQS - Queue Model
Default retention is 4 days (up to 14)
Unlimited number of messages (body up to 256 KB)
Low latency
Messages can be duplicated or delivered out of order
Delay delivery of a message by up to 15 minutes (default 0s)
Consumers can poll up to 10 messages at a time (received messages become invisible to other consumers)
The visibility timeout is configurable
For messages that keep failing: create a queue, designate it as the Dead Letter Queue (DLQ), and apply a redrive policy
FIFO
Available if the queue name ends with ".fifo"
Allows de-duplication; messages are sent exactly once
Message groups need only an extra tag (the message group ID)
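
The visibility timeout and the dead letter queue are easier to follow with a toy in-memory model. This is a sketch of the behaviour only, not the SQS API; `TinyQueue` and its parameters are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class TinyQueue:
    """In-memory model of a queue with a visibility timeout and a DLQ."""

    def __init__(self, visibility_timeout: int = 30, max_receive_count: int = 3):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self._messages = []      # dicts with body, receives, visible_at
        self.dead_letters = []   # bodies that were received too many times

    def send(self, body: str, now: int = 0) -> None:
        self._messages.append({"body": body, "receives": 0, "visible_at": now})

    def receive(self, now: int):
        """Return the next visible message, or None if nothing is visible."""
        for msg in list(self._messages):
            if msg["visible_at"] > now:
                continue  # still invisible: another consumer is working on it
            if msg["receives"] >= self.max_receive_count:
                self._messages.remove(msg)
                self.dead_letters.append(msg["body"])
                continue
            msg["receives"] += 1
            msg["visible_at"] = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg) -> None:
        """A consumer deletes the message once it has been processed."""
        self._messages.remove(msg)
```

If a consumer never calls `delete`, the message reappears after the timeout and ends up in `dead_letters` once it has been received `max_receive_count` times.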
SNS - Pub/Sub model
Producer sends message to one topic
To publish you need the SDK: create a topic, create a subscription, and publish to the topic
SNS + SQS for Fan Out (No data loss)
Kinesis - Real time streaming model
alternative to kafka
great for logs, iot, metrics, big data
Kinesis Streams
Data retention is 1 day by default (up to 7 days)
Data can be replayed
Once data is inserted it cannot be deleted (immutability)
Billing is per shard
One stream can have many shards; records are ordered per shard
Kinesis API
Put records
Messages sent get a sequence number
try to prevent hot partition
same key go to same partition
use batching to reduce cost
avoid hot shard
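
The "same key goes to the same partition" rule comes from hashing the partition key. The sketch below hashes the key with MD5 into a 128-bit integer and assumes the shards split that range evenly; real streams can have uneven hash ranges, and `shard_for_key` is a hypothetical helper.

```python
import hashlib


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index via its 128-bit MD5 hash."""
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")  # 0 .. 2**128 - 1
    range_size = 2**128 // shard_count        # assumption: evenly split ranges
    return min(hash_key // range_size, shard_count - 1)
```

A key that dominates the traffic always lands on one shard, which is how a hot shard appears; a higher-cardinality key spreads the load.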
Security
Use IAM policies
Encryption in flight with https and at rest with KMS
Firehose
near real time
fully managed (no UI)
automatic scaling
pay for the amount of data going through (and conversion format)
load into Redshift / S3 / ES / Splunk
Amazon MQ
Apache ActiveMQ
doesn't scale
has queue feature (SQS) and topic feature (SNS)
runs on dedicated machine

Serverless

Configuration
Timeout: 3 seconds by default (up to 900 seconds)
Environment variables (4 KB max)
Allocated memory (128 MB to 3 GB)
Can be deployed within a VPC
An IAM execution role must be attached to the Lambda function
Disk capacity in /tmp is 512 MB
Concurrency limit is 1000
Deployment package size is 50 MB compressed and 250 MB uncompressed
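
The limits above can be checked before deploying. This is a sketch that only encodes the numbers from these notes (3 GB is taken as 3 * 1024 MB, and the exact documented ceiling may differ); `validate_lambda_config` is a hypothetical helper, not part of any AWS SDK.

```python
from typing import Dict, List, Optional

MAX_TIMEOUT_S = 900
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 3 * 1024   # assumption: "3 GB" from the notes
MAX_ENV_BYTES = 4 * 1024   # total size of all environment variables


def validate_lambda_config(timeout_s: int = 3, memory_mb: int = 128,
                           env: Optional[Dict[str, str]] = None) -> List[str]:
    """Return a list of problems; an empty list means the config fits the limits."""
    problems = []
    if not 1 <= timeout_s <= MAX_TIMEOUT_S:
        problems.append(f"timeout must be between 1 and {MAX_TIMEOUT_S} seconds")
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        problems.append(f"memory must be between {MIN_MEMORY_MB} and {MAX_MEMORY_MB} MB")
    env_bytes = sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
                    for k, v in (env or {}).items())
    if env_bytes > MAX_ENV_BYTES:
        problems.append(f"environment variables exceed {MAX_ENV_BYTES} bytes")
    return problems
```

Calling it with no arguments checks the defaults from the notes (3 seconds, 128 MB, no environment variables).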
DynamoDB

Serverless II


Databases


Monitoring and audit


Security and encryption


VPC


Other services

Here's a quick cheat-sheet to remember all these services:
CodeCommit: service where you can store your code. Similar service is GitHub
CodeBuild: build and testing service in your CICD pipelines
CodeDeploy: deploy the packaged code onto EC2 and AWS Lambda
CodePipeline: orchestrate the actions of your CICD pipelines (build stages, manual approvals, many deploys, etc)
CloudFormation: Infrastructure as Code for AWS. Declarative way to manage, create and update resources.
ECS (Elastic Container Service): Docker container management system on AWS. Helps with creating micro-services.
ECR (Elastic Container Registry): Docker images repository on AWS. Docker Images can be pushed and pulled from there
Step Functions: Orchestrate / Coordinate Lambda functions and ECS containers into a workflow
SWF (Simple Workflow Service): Old way of orchestrating a big workflow.
EMR (Elastic Map Reduce): Big Data / Hadoop / Spark clusters on AWS, deployed on EC2 for you
Glue: ETL (Extract Transform Load) service on AWS
OpsWorks: managed Chef & Puppet on AWS
ElasticTranscoder: managed media (video, music) converter service into various optimized formats
Organizations: hierarchy and centralized management of multiple AWS accounts
Workspaces: Virtual Desktop on Demand in the Cloud. Replaces traditional on-premise VDI infrastructure
AppSync: GraphQL as a service on AWS
SSO (Single Sign On): One login managed by AWS to log in to various business SAML 2.0-compatible applications (office 365 etc)