AWS Certified Solutions Architect Associate 2020

AWS Fundamentals

IAM?
  • EC2 is billed per second. t2.micro is free-tier eligible.
  • Restrict Security Group inbound rules to your own IP
  • Connection timeouts usually point to a Security Group issue
  • Security Groups can reference other Security Groups
  • Private IP (192.168.1.1), Public IP (1.2.5.6), Elastic IP (1.1.1.1)
  • SSH key permissions => chmod 0400 file
OS | Tool
*nix & Windows 10 | SSH
Legacy Windows | PuTTY
Any | EC2 Instance Connect
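
The SSH key permission step (chmod 0400) can also be scripted. A minimal sketch, assuming a POSIX system; the function name `restrict_key_permissions` is hypothetical and only used for illustration:

```python
import os
import stat


def restrict_key_permissions(path):
    """Make a private key readable by its owner only (same as chmod 0400)."""
    os.chmod(path, 0o400)
    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)
```

SSH clients typically refuse keys that other users can read, which is why the notes insist on this step.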
EC2
  • EC2 User Data script executed at boot time
  • A custom AMI (with software pre-installed) gives a faster boot
  • AMIs can be copied to other accounts/regions
  • EC2 requires a VPC (AZ), EBS (storage), a Security Group (inbound/outbound rules) and an SSH key pair
EC2 Launch Mode | Description
On Demand | Pay for running it
Reserved | Pay for a contract
Spot instance | Bid for the instance
Dedicated host | Rent the physical machine
EC2 Instance Type | Useful for
T2/T3 | Burstable instances (optionally unlimited)
Medium | Web app
I/O | DB
RAM | Cache
CPU | Compute / DB
GPU | Machine learning / video rendering
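EC2 Placement Groups | Description
Partition | Instances spread across partitions (separate racks); can span multiple AZs
Cluster | Same AZ (low latency)
Spread | Distinct hardware; can span different AZs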

High Availability and scalability

Scalability
Vertical scalability

Increase the size of the instance. It's good for RDS and is referred to as scaling up and down.

Horizontal scalability

Increase the number of instances. It's common in web apps and is referred to as scaling out and in.

Load Balancer
Benefits
  • Spread load across multiple downstream instances
  • Exposes a single point of access (DNS name) to your apps
  • Seamlessly handle failures
  • Provide SSL termination for your apps
  • Separate public from private traffic
  • Stickiness with cookies
  • High availability
  • Uses X.509 certificates, which you can manage in AWS Certificate Manager (ACM) or upload yourself
ALB
  • Multiple http apps across machines (target group)
  • Multiple app on the same machine (containers)
  • Based on route in URL (ex.com/users)
  • Based on hostname in URL (users.ex.com)
  • Client IP is in the header X-Forwarded-For
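
The path-based and host-based rules above, plus reading the client IP, can be sketched in a few lines. This is only an illustration of the idea, not how an ALB is configured; the target group names (`users-tg`, `default-tg`) and both function names are hypothetical:

```python
def pick_target_group(host, path):
    """Return a target group name for a request (hypothetical rules)."""
    # Host-based rule: users.ex.com -> users target group.
    if host == "users.ex.com":
        return "users-tg"
    # Path-based rule: ex.com/users -> users target group.
    if path == "/users" or path.startswith("/users/"):
        return "users-tg"
    return "default-tg"


def client_ip(headers):
    """Read the original client IP from X-Forwarded-For (left-most entry)."""
    forwarded = headers.get("X-Forwarded-For", "")
    return forwarded.split(",")[0].strip() or None
```

Behind an ALB the backend sees the load balancer's address as the peer, so the header is the only place the original client IP is available.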
NLB
  • Less latency than ALB
  • Forward TCP traffic
  • Support static IP or elastic IP
  • Directly see client IP
Stickiness
  • Can cause load imbalance across the backend instances
  • Always routes a given client to the same backend instance
Auto Scaling Group
  • Scales out (up to the maximum number of instances) or scales in (down to the minimum number) depending on the load. CloudWatch alarms on metrics trigger the scaling.
  • You can use default metrics (CPU usage, requests, network in/out), custom metrics (e.g. number of connected users) or a schedule
  • Default termination process: first find the AZ with the most instances, then terminate the instance with the oldest launch configuration.
  • The cooldown period prevents launching or terminating additional instances right after a scaling activity (the default is 300 secs; it is good to reduce it in the scale-in policy)
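
The termination process described above can be sketched as a two-step selection. This is a simplified model, assuming "oldest" means the oldest launch configuration; the dict fields (`id`, `az`, `launch_config_age`) are hypothetical:

```python
from collections import Counter


def pick_instance_to_terminate(instances):
    """Pick one instance id following the simplified default policy.

    Each instance is a dict with 'id', 'az' and 'launch_config_age'
    (hypothetical fields; a higher age means an older launch configuration).
    """
    # Step 1: find the AZ that currently holds the most instances.
    az_counts = Counter(i["az"] for i in instances)
    busiest_az = az_counts.most_common(1)[0][0]
    # Step 2: within that AZ, choose the oldest launch configuration.
    candidates = [i for i in instances if i["az"] == busiest_az]
    return max(candidates, key=lambda i: i["launch_config_age"])["id"]
```

Picking from the busiest AZ first keeps the group balanced across AZs after a scale-in.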

EC2 Storage

Instance storage
  • Best IO
  • Ephemeral storage (survives reboots only)
  • It's a physical drive
  • Only available on some (larger) EC2 instance types
EBS
  • It's a network drive
  • Can be attached to only one instance at a time
  • It's locked to an AZ
  • Migrating an EBS volume to another AZ requires creating a snapshot
  • Taking a snapshot uses a lot of I/O
  • If the EC2 instance gets terminated, the root EBS volume is deleted too by default
  • If disk I/O is high, increase the volume size (for gp2)
  • Billed by provisioned capacity
  • RAID 0: combines volumes to increase I/O, but if one disk fails you lose everything
  • RAID 1: increases fault tolerance by mirroring the data
EBS Type | Disk | Description
gp2 | SSD | General purpose. Can be used as boot volume
io1 | SSD | High performance. Can be used as boot volume
st1 | HDD | Low cost
sc1 | HDD | Cheapest, lowest performance
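
Increasing the volume size helps with I/O on gp2 because baseline IOPS grow with size. A minimal sketch of that relationship, assuming the documented gp2 baseline of 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000 (burst credits are ignored here):

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, between 100 and 16,000."""
    return max(100, min(3 * size_gib, 16_000))
```

For example, a 500 GiB volume gets a baseline of 1,500 IOPS, while anything under about 34 GiB stays at the 100 IOPS floor.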
EFS
  • Can be mounted on many EC2 instances
  • Multi-AZ
  • Uses the NFSv4.1 protocol
  • Linux AMIs only
  • Billed per use
  • Highly available, scalable, expensive (about 3x gp2)

AWS Fundamentals II

RDS
  • AWS handles OS patching, continuous backups, dashboards, read replicas and Multi-AZ. There is no SSH access.
  • Read replicas: asynchronous replication, up to 5, can be promoted to a standalone DB
  • Multi-AZ (disaster recovery): synchronous replication with automatic failover

Automated backups are enabled by default: a full snapshot of the DB plus logs captured in real time. Retention is 7 days by default (up to 35).

Manual snapshots are kept until you delete them.

Encryption at rest is available, and SSL provides in-flight encryption of the data. To connect over SSL, use the SSL trust certificate.

RDS is usually deployed in a private subnet, and network security relies on Security Groups (which control who can communicate with the instance). IAM policies control who can manage the instance. Logging in to the DB uses a login/password or IAM users. You need to take care of the inbound rules, the in-DB users and allowing SSL.

RDS is slower and cheaper than Aurora.

Transparent Data Encryption (TDE) is only available for Oracle and SQL Server and can be used on top of KMS. IAM authentication works only on MySQL and PostgreSQL: the auth token is generated with AWS credentials, has a lifespan of 15 mins and requires SSL.

Aurora

  • Not open source, AWS cloud optimised
  • Better performance than RDS
  • Up to 15 replicas, and the replication process is faster
  • Instant failover
  • More expensive than RDS

Encryption at rest using KMS, encryption in flight like MySQL, IAM authentication.

Aurora Serverless
  • No need to choose an instance size
  • MySQL 5.6 only
  • The DB cluster starts, shuts down and scales based on CPU/connections
  • You can migrate from or to provisioned Aurora clusters
  • Measured in Aurora Capacity Units (ACU), billed in 5-min increments
  • Doesn't support all the features of a provisioned cluster

Aurora global databases span multiple regions and enable disaster recovery (DR). The DR region can also be used for lower-latency reads.

ElastiCache

It's a managed service like RDS, but for caches. Write scaling uses sharding and read scaling uses read replicas. It helps make apps stateless. A cache must have an invalidation strategy.

Redis

  • In-memory key-value store with low latency
  • Has persistence by default
  • Good for: user sessions, leaderboards, distributed state, pub/sub messaging
  • Lazy loading: all read data is cached; data can become stale in the cache
  • Write-through: add/update data in the cache whenever it is written to the DB
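
The two caching strategies can be compared with a small in-memory model. This is a minimal sketch, assuming plain dicts stand in for both the database and Redis; the class and method names are hypothetical:

```python
class CachedStore:
    """A dict-backed stand-in for a database fronted by a cache."""

    def __init__(self):
        self.db = {}
        self.cache = {}

    def read_lazy(self, key):
        """Lazy loading: fill the cache only on a cache miss."""
        if key in self.cache:
            return self.cache[key]          # cache hit
        value = self.db.get(key)            # cache miss: go to the database
        if value is not None:
            self.cache[key] = value
        return value

    def write_db_only(self, key, value):
        """Write without touching the cache: cached reads may go stale."""
        self.db[key] = value

    def write_through(self, key, value):
        """Write-through: update the cache whenever the database is written."""
        self.db[key] = value
        self.cache[key] = value
```

With lazy loading alone, a value that was cached before `write_db_only` keeps being served until it is invalidated, which is the staleness the notes warn about; `write_through` avoids that at the cost of a cache write on every update.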

Security
  • Redis has AUTH (password/token); SSL in flight must be enabled and used
  • Memcached supports SASL
  • IAM auth is not supported; IAM policies are only used for AWS API-level security


Route 53

It is a managed Domain Name System (DNS): a collection of rules and records that helps clients understand how to reach a server through its URL.

  • A: hostname to IPv4
  • AAAA: hostname to IPv6
  • CNAME: hostname to hostname (non-root domains only)
  • Alias: hostname to AWS resource (free, supports health checks)

Health checks
  • Healthy: after 3 passed checks
  • Unhealthy: after 3 failed checks
  • Interval is 30 secs and about 15 health checkers check the endpoint (so one request every 2 secs on average)

  • Latency routing policy: latency is evaluated between the user and the designated AWS region
  • Geolocation routing policy: based on user location (requires a default record for when no location matches)
  • Weighted routing policy: controls the % of requests sent to each endpoint; can have health checks
  • Simple routing policy: routes to a single resource, no health checks. If multiple values are returned, the client chooses one.
  • Multi-value routing policy: returns up to 8 healthy records for each multi-value query
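
Weighted routing with health checks can be sketched as a weighted random choice over the healthy records. This is an illustration of the idea only, not Route 53's implementation; the record format and the function name are hypothetical:

```python
import random


def weighted_pick(records, rng=random):
    """Pick one record name; records maps name -> (weight, healthy)."""
    # Unhealthy or zero-weight records are skipped.
    healthy = {name: w for name, (w, ok) in records.items() if ok and w > 0}
    if not healthy:
        return None
    names = list(healthy)
    return rng.choices(names, weights=[healthy[n] for n in names], k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is handy when testing a traffic split.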

Some domain registrars also come with DNS features.


Classic Solutions

Elastic Beanstalk
Managed service with a configurable deployment strategy. It has 3 architecture models:
  • Single instance deployment: good for dev
  • LB + ASG: good for production and pre-production web apps
  • ASG only: good for production non-web apps


S3

Buckets
  • Names are globally unique, but buckets are defined at region level
  • Naming convention: no underscores, no uppercase, not an IP, 3-63 characters long, starts with a letter or number
  • Versioning is enabled at bucket level
  • Each overwrite of a key creates a new version (conceptually 1, 2, 3)
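
The naming convention can be checked with a short validator. A minimal sketch covering only the rules listed in these notes, plus the AWS rule that a name must also end with a letter or number; `is_valid_bucket_name` is a hypothetical helper, not an AWS API:

```python
import ipaddress
import re

# 3-63 characters; lowercase letters, digits, dots and hyphens;
# must start and end with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name):
    """Return True if the name passes the basic S3 naming rules."""
    if not _BUCKET_RE.match(name):
        return False
    try:
        ipaddress.IPv4Address(name)   # names formatted like an IP are rejected
        return False
    except ValueError:
        return True
```

The check is local only: a name that passes can still be rejected by S3 if another account already owns it.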

Objects
  • Keys are the full path
  • Values have a max size of 5TB; uploads larger than 5GB require multi-part upload
  • Tags: useful for security/lifecycle
  • Metadata: system/user metadata

Encryption
  • SSE-S3: object is encrypted server side, keys managed by S3. Must set the header x-amz-server-side-encryption: AES256
  • SSE-KMS: object is encrypted server side, keys managed by KMS (user control + audit trail). Must set the header x-amz-server-side-encryption: aws:kms
  • SSE-C: object is encrypted server side with keys managed outside of AWS. HTTPS must be used and the encryption key must be provided with every request
  • Client-side encryption: the client must encrypt before sending and decrypt after retrieving
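
The server-side encryption headers can be summarised in a small helper. A minimal sketch for SSE-S3 (header value AES256) and SSE-KMS only; the function name `sse_headers` is hypothetical, and SSE-C is left out because it needs the customer key in additional headers:

```python
def sse_headers(mode, kms_key_id=None):
    """Return the request header(s) for SSE-S3 or SSE-KMS uploads."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific KMS key instead of the default one.
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported mode: {mode}")
```

In practice an SDK sets these headers for you; seeing them spelled out helps when writing bucket policies that deny uploads without the header.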

Security
  • Bucket policies can grant public access, force objects to be encrypted at upload, and allow or deny a set of API actions
  • User based: IAM policies
  • Resource based: bucket policies or Object/Bucket Access Control Lists (ACL)
  • S3 access logs can be stored in another S3 bucket
  • API calls can be logged in CloudTrail
  • MFA can be required for deleting object versions in versioned buckets
  • Signed URLs are valid only for a limited time

S3 websites
  • Host static websites
  • URL: my-bucket.s3-website.aws-region.amazonaws.com or my-bucket.s3-website-aws-region.amazonaws.com
  • Allow public reads to avoid 403 errors
  • If you request data from another S3 bucket you need to enable CORS; it limits which websites can request your files (reducing your AWS cost!)

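Consistency Model
  • GET 404 -> PUT 200 -> GET 404 ~> GET 200 (eventually consistent after an earlier 404)
  • DELETE 200 -> GET 200 (the object may still be readable for a short time)
  • PUT 200 -> PUT 200 -> GET 200 (may return the result of the 1st PUT)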


Advance S3

MFA-Delete
  • Requires: versioning enabled on the bucket; only the bucket owner can enable it, using the AWS CLI
  • MFA is needed to: permanently delete an object version, suspend versioning
  • MFA is not needed for: enabling versioning, listing deleted versions

  • Bucket policies are evaluated before default encryption
  • Access logs: log any request made to S3, from any account, allowed or denied, into another S3 bucket
  • Cross-region replication requires versioning to be enabled and is asynchronous
  • Pre-signed URLs: for downloads you can use the CLI; for uploads you need the SDK

CloudFront
  • It's a CDN: improves read performance by caching content at the edge
  • Can provide SSL encryption
  • Supports the RTMP protocol (videos/media)
  • Signed URLs need the SDK to generate the URL (shared content should last a few mins, private content can last years)

S3 Tiers
  • S3 Standard (General Purpose, GP): high availability and durability
  • S3 Standard-Infrequent Access (IA): lower cost than GP (good for backups)
  • S3 One Zone-Infrequent Access: lower cost than IA; supports SSL in transit and encryption at rest
  • S3 Reduced Redundancy Storage (deprecated)
  • S3 Intelligent-Tiering: small monthly monitoring and auto-tiering fee
  • Glacier: cost is storage/month + retrieval cost; each item is called an archive, and archives are stored in vaults

Lifecycle rules
  • Transition actions: define when objects are transitioned to another storage class
  • Expiration actions: S3 can delete expired objects after a configured time

Snowball
  • Physical data transport solution that helps move TBs or PBs of data in or out of AWS
  • Uses KMS 256-bit encryption
  • You need to request the device, install the client and ship the device back
  • Snowball Edge adds computational capability (process on the go with EC2 AMIs or Lambda functions)
  • Snowmobile is great for 10PB or more

Storage Gateway
Bridge between on-premise data and cloud data in S3 (sits between the app and S3)
  • File Gateway: accessible with the NFS and SMB protocols; most recent data is cached; can be mounted on many servers; used for file access
  • Volume Gateway: accessible with the iSCSI protocol; backed by S3 and EBS snapshots; most recent data is cached; used for volume / block storage
  • Tape Gateway: accessible with the iSCSI protocol; uses a Virtual Tape Library backed by Glacier and S3; used for backups

Athena
  • Serverless service to analyse data directly in S3 using SQL
  • Charged per query and per amount of data scanned


Decoupling apps

SQS - Queue Model

  • Default retention is 4 days (up to 14)
  • Unlimited number of messages (body up to 256KB)
  • Low latency
  • Messages can be duplicated or out of order
  • Delivery of a message can be delayed by up to 15 mins (default 0s)
  • Consumers can poll up to 10 messages at a time (a received message becomes invisible to other consumers)
  • The visibility timeout is configurable
  • Dead Letter Queue: create a queue, designate it as the DLQ and apply a redrive policy
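
The polling and visibility timeout behaviour can be modelled in memory. This is a toy sketch, not the SQS API: `TinyQueue` and its methods are hypothetical, and time is passed in explicitly as `now` (in seconds) to keep the example deterministic:

```python
class TinyQueue:
    """In-memory model of an SQS-style queue (not the real service)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # id -> body
        self._invisible_until = {}   # id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = body
        return self._next_id

    def receive(self, now, max_messages=10):
        """Return up to 10 visible messages and hide them for the timeout."""
        batch = []
        for msg_id, body in self._messages.items():
            if len(batch) == min(max_messages, 10):
                break
            if self._invisible_until.get(msg_id, 0) <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                batch.append((msg_id, body))
        return batch

    def delete(self, msg_id):
        """Consumers delete a message once it has been processed."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)
```

A message that is received but never deleted becomes visible again once the timeout passes, which is one way duplicates reach consumers.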

FIFO
  • Available if the queue name ends with ".fifo"
  • Allows de-duplication, so messages are delivered once
  • Message groups only need an extra tag (the message group ID)

SNS - Pub/Sub model

  • The producer sends each message to one topic
  • To publish you need the SDK: create a topic, create a subscription and publish to the topic
  • SNS + SQS for fan out (no data loss)
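
The SNS + SQS fan-out pattern can be sketched with one topic pushing a copy of each message to every subscribed queue. A minimal in-memory sketch; `Topic` and the queue names are hypothetical stand-ins for the real services:

```python
from collections import deque


class Topic:
    """In-memory stand-in for an SNS topic with queue subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        # Fan out: every subscribed queue receives its own copy.
        for queue in self.subscribers:
            queue.append(message)


# Two independent consumers, each reading from its own queue.
fraud_queue, shipping_queue = deque(), deque()
topic = Topic()
topic.subscribe(fraud_queue)
topic.subscribe(shipping_queue)
topic.publish("order-1")
```

Because each consumer has its own queue, a slow or failing consumer does not make the others miss messages.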

Kinesis - Real time streaming model

Alternative to Kafka; great for logs, IoT, metrics and big data.

Kinesis Streams
  • Data retention is 1 day (up to 7 days)
  • Data can be replayed
  • Once data is inserted it cannot be deleted (immutability)
  • Billing is per shard
  • One stream can have many shards; records are ordered per shard

Kinesis API
  • Put records: each message sent gets a sequence number
  • The same partition key always goes to the same partition (shard), so choose keys that avoid a hot partition / hot shard
  • Use batching to reduce cost
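
Why the same key always lands on the same shard can be shown with a hash-based mapping. Kinesis maps partition keys to 128-bit integers with MD5; the sketch below additionally assumes the hash space is split evenly between shards, which is a simplification, and `shard_for_key` is a hypothetical helper:

```python
import hashlib


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index via its MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")      # 128-bit integer
    # Assumption: the 128-bit hash space is split evenly between shards.
    range_size = 2 ** 128 // shard_count
    return min(hash_value // range_size, shard_count - 1)
```

If most records share one partition key they all map to one shard, which is the hot shard problem; a key with many distinct values spreads the load.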

Security
  • Use IAM policies
  • Encryption in flight with HTTPS and at rest with KMS

Firehose
  • Near real time
  • Fully managed (no UI)
  • Automatic scaling
  • Pay for the amount of data going through (and for format conversion)
  • Loads data into Redshift / S3 / Elasticsearch / Splunk

Amazon MQ
  • Managed Apache ActiveMQ
  • Doesn't scale as much as SQS / SNS
  • Has both a queue feature (like SQS) and a topic feature (like SNS)
  • Runs on a dedicated machine


Serverless

Lambda configuration
  • Timeout: 3 secs by default (up to 900 secs)
  • Env vars: 4KB max
  • Allocated memory: 128MB to 3GB
  • Can be deployed within a VPC
  • An IAM execution role must be attached to the Lambda function
  • Disk capacity in /tmp is 512MB
  • Concurrency limit is 1000
  • Deployment size is 50MB compressed and 250MB uncompressed
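
The limits above can be turned into a quick sanity check for a function's settings. A minimal sketch using the numbers as listed in these notes (which may differ from current AWS limits); `check_lambda_config` is a hypothetical helper, and "3GB" is modelled as 3 * 1024 MB:

```python
def check_lambda_config(timeout_s=3, memory_mb=128, env_vars=None):
    """Return a list of problems found, using the limits from the notes."""
    problems = []
    if not 1 <= timeout_s <= 900:
        problems.append("timeout must be between 1 and 900 seconds")
    if not 128 <= memory_mb <= 3 * 1024:
        problems.append("memory must be between 128 MB and 3 GB")
    # Total size of all environment variable names and values, in bytes.
    env_size = sum(
        len(k.encode()) + len(v.encode()) for k, v in (env_vars or {}).items()
    )
    if env_size > 4 * 1024:
        problems.append("environment variables exceed 4 KB")
    return problems
```

An empty list means the settings fit within the listed limits; the defaults match the 3-second timeout and 128MB minimum from the notes.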

DynamoDB


Serverless II


Databases


Monitoring and audit


Security and encryption


VPC


Other services

Here's a quick cheat-sheet to remember all these services:

CodeCommit: service where you can store your code. Similar service is GitHub

CodeBuild: build and testing service in your CICD pipelines

CodeDeploy: deploy the packaged code onto EC2 and AWS Lambda

CodePipeline: orchestrate the actions of your CICD pipelines (build stages, manual approvals, many deploys, etc)

CloudFormation: Infrastructure as Code for AWS. Declarative way to manage, create and update resources.

ECS (Elastic Container Service): Docker container management system on AWS. Helps with creating micro-services.

ECR (Elastic Container Registry): Docker images repository on AWS. Docker Images can be pushed and pulled from there

Step Functions: Orchestrate / Coordinate Lambda functions and ECS containers into a workflow

SWF (Simple Workflow Service): Old way of orchestrating a big workflow.

EMR (Elastic Map Reduce): Big Data / Hadoop / Spark clusters on AWS, deployed on EC2 for you

Glue: ETL (Extract Transform Load) service on AWS

OpsWorks: managed Chef & Puppet on AWS

ElasticTranscoder: managed media (video, music) converter service into various optimized formats

Organizations: hierarchy and centralized management of multiple AWS accounts

Workspaces: Virtual Desktop on Demand in the Cloud. Replaces traditional on-premise VDI infrastructure

AppSync: GraphQL as a service on AWS

SSO (Single Sign On): One login managed by AWS to log in to various business SAML 2.0-compatible applications (office 365 etc)
