IAM?
- Billed per second. t2.micro is eligible for the free tier.
- Restrict Security Groups to your IP
- Timeout issues => Security Group issues
- Security Groups can refer to other Security Groups
- Private IP (192.168.1.1), Public IP (1.2.5.6), Elastic IP (1.1.1.1)
- SSH key permissions => chmod 0400 file
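
As a minimal sketch of the key-permission step, the same `chmod 0400` can be applied from Python with the standard library. The temporary file below is only a stand-in for a real `.pem` key, and the function name is hypothetical.

```python
import os
import stat
import tempfile

def restrict_key_permissions(path: str) -> int:
    """Make the key file readable by its owner only (equivalent to chmod 0400)."""
    os.chmod(path, 0o400)
    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)

# Demo on a throwaway file standing in for a real .pem key (hypothetical).
with tempfile.NamedTemporaryFile(suffix=".pem", delete=False) as tmp:
    demo_key = tmp.name
mode = restrict_key_permissions(demo_key)
os.chmod(demo_key, 0o600)  # restore write permission before cleaning up
os.remove(demo_key)
```

SSH clients typically refuse a private key that is readable by other users, which is why the owner-read-only mode matters.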
OS | Tool |
---|---|
*nix & Windows 10 | SSH |
Legacy Windows | PuTTY |
Any | EC2 Instance Connect |
EC2
- EC2 User Data script executed at boot time
- AMI for faster boot
- AMI can be copied to different account/regions
- EC2 requires VPC (AZ), EBS (Storage), Security Group (inbound/outbound), SSH
EC2 Launch Mode | Description |
---|---|
On Demand | Pay for running it |
Reserved | Pay for a contract |
Spot instance | Bid for the instance |
Dedicated host | Rent the machine |
EC2 Instance Type | Useful for |
---|---|
T2/T3 | Burstable instances (with optional unlimited mode) |
Medium | Web app |
I/O | DB |
RAM | Cache |
CPU | Compute / DB |
GPU | Machine learning / video rendering |
EC2 Placement Groups | Description |
---|---|
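Partition | Instances spread across partitions (separate racks); can span AZs |
Cluster | Same AZ (low latency) |
Spread | Distinct hardware; can span different AZs |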
Scalability
- Vertical: increase the instance size. Good for RDS; referred to as scaling up and down.
- Horizontal: change the number of instances. Common in web apps; referred to as scaling out and in.
Load Balancer
- Spread load across multiple downstream instances
- Exposes a single DNS name (point of access) to your apps
- Seamlessly handle failures
- Provide SSL termination for your apps
- Separate public from private traffic
- Stickiness with cookies
- High availability
- Uses X.509 certificates, which you can manage in AWS Certificate Manager (ACM) or upload yourself
- ALB: multiple HTTP apps across machines (target groups)
- ALB: multiple apps on the same machine (containers)
- ALB: routing based on the route in the URL (ex.com/users)
- ALB: routing based on the hostname in the URL (users.ex.com)
- ALB: the client IP is in the X-Forwarded-For header
- NLB: lower latency than ALB
- NLB: forwards TCP traffic
- NLB: supports a static IP or an Elastic IP
- NLB: the backend sees the client IP directly
- Stickiness: may bring imbalance to the load over the backend instances
- Stickiness: always redirects a client to the same instance behind the load balancer
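
Because an ALB terminates the client connection, the backend sees the load balancer's address on the socket and has to read the original client IP from `X-Forwarded-For`. The following is a minimal sketch of that lookup; the function name and the fallback behaviour are assumptions for illustration.

```python
from typing import Mapping

def client_ip_from_headers(headers: Mapping[str, str], peer_ip: str) -> str:
    """Return the original client IP for a request that came through a load balancer.

    X-Forwarded-For holds a comma-separated chain; the left-most entry is the
    address reported for the original client. Fall back to the socket peer
    address when the header is absent.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")
    first = forwarded.split(",")[0].strip()
    return first or peer_ip
```

The left-most entry can be set by the client itself, so it is best treated as informational unless the request is known to have arrived through a trusted load balancer.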
Auto Scaling Group
- Scales out up to the maximum number of machines, or scales in down to the minimum, depending on load; CloudWatch alarms based on metrics trigger the scaling
- You can use default metrics (CPU usage, requests, network in/out), custom metrics (e.g. number of connected users), or a schedule
- Termination process: first find the AZ with the most instances, then terminate the instance with the oldest launch configuration
- Cooldown prevents launching or terminating additional instances before the previous scaling activity takes effect (default 300 secs; it can be worth reducing it in the scale-in policy)
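
The termination process above can be sketched in a few lines. This is a simplified model, not the Auto Scaling API: the `Instance` class, the `launch_order` field, and the alphabetical tie-break between AZs are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    launch_order: int  # lower means an older launch configuration (hypothetical field)

def pick_instance_to_terminate(instances: Sequence[Instance]) -> Instance:
    """Pick the AZ with the most instances, then the oldest instance within it."""
    if not instances:
        raise ValueError("no instances to terminate")
    per_az = Counter(instance.az for instance in instances)
    # max() keeps the first maximum, so ties resolve alphabetically (assumption).
    busiest_az = max(sorted(per_az), key=lambda az: per_az[az])
    candidates = [instance for instance in instances if instance.az == busiest_az]
    return min(candidates, key=lambda instance: instance.launch_order)
```

For example, with two instances in `us-east-1a` and one in `us-east-1b`, the older of the two `us-east-1a` instances is selected, which keeps the group balanced across AZs.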
Instance storage
- Best IO
- Ephemeral storage (survives reboots only)
- It's a physical drive
- Only available on some (typically larger) EC2 instance types
EBS
- It's a network drive
- Can be attached to only one instance at a time
- It's locked to an AZ
- Migrating an EBS volume to another AZ requires creating a snapshot
- Snapshots use a lot of IO
- If the EC2 instance gets terminated, so is the root EBS volume by default
- If disk IO is high, increase the volume size (for GP2, IOPS scale with size)
- Billed by provisioned capacity
- RAID 0: combines volumes to increase IO, but if one disk fails you lose everything
- RAID 1: Increases fault tolerance duplicating data
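
The reason a bigger GP2 volume helps with IO is that its baseline IOPS grow with its size. A small sketch, assuming the documented gp2 figures of 3 IOPS per GiB with a floor of 100 and a cap of 16,000 (the function name is hypothetical, and burst credits are ignored):

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000."""
    if size_gib < 1:
        raise ValueError("volume size must be at least 1 GiB")
    return max(100, min(3 * size_gib, 16_000))
```

Under these assumptions a 500 GiB volume gets a baseline of 1,500 IOPS, and growing the volume stops helping once the cap is reached.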
EBS Type | Disk | Description |
---|---|---|
GP2 | SSD | General purpose. Can be used as boot volume |
IO1 | SSD | High performance. Can be used as boot volume |
ST1 | HDD | Low cost |
SC1 | HDD | Cheapest; lowest performance |
EFS
- Can be mounted on many EC2 instances
- Multi AZ
- Uses the NFSv4.1 protocol
- Works with Linux AMIs only
- Billed per use
- Highly available, scalable, and expensive (about 3x the price of GP2)
RDS
- AWS offers OS patching, continuous backups, dashboards, read replicas and Multi AZ; there is no SSH access
- Read replicas are async, up to 5, and can be promoted to their own DB
- Multi AZ (disaster recovery) replication is sync and provides failover
- Backups are enabled automatically by AWS: full snapshots of the DB plus transaction logs in real time, with retention of 7 days by default (up to 35)
- Manual snapshots are kept until you delete them
- Encryption at rest, and SSL for in-flight encryption of the data; to connect over SSL, use the SSL trust certificate
- RDS is usually deployed in a private subnet, and its network security relies on Security Groups
- IAM policies control who can manage the instance; logging in to the DB uses login/password or IAM users
- You need to take care of inbound rules, in-DB users, and allowing SSL
- RDS is slower and cheaper than Aurora
- Transparent Data Encryption (TDE) is only available for Oracle and SQL Server and can be used on top of KMS
- IAM authentication works only on MySQL and PostgreSQL; the token lasts 15 mins, is generated from AWS credentials, and requires SSL
Aurora
- Not open source; optimised for the AWS cloud, with better performance than RDS
- Up to 15 replicas, and the replication process is faster
- Instant failover
- More expensive than RDS
- Encryption at rest using KMS, encryption in flight like MySQL, IAM authentication
- Aurora Serverless: no need to choose an instance size; MySQL 5.6 only; the DB cluster starts, shuts down and scales based on CPU/connections; you can migrate from or to Aurora clusters; capacity is measured in Aurora Capacity Units (ACU), billed in 5-min increments; doesn't support all the features of a provisioned cluster
- Aurora global databases span multiple regions and enable DR; the DR region can also be used for lower-latency reads
ElastiCache
- It's managed like RDS
- Write scaling uses sharding and read scaling uses read replicas
- Helps keep apps stateless
- The cache must have an invalidation strategy
Redis
- It's an in-memory key-value store with low latency
- Has persistence
- Good for: user sessions, leaderboards, distributed state, pub/sub messages
- Lazy loading: all read data is cached; data can become stale in the cache
- Write through: add/update data in the cache whenever it is written to the DB
- Security: Redis has AUTH (user/password); SSL in flight must be enabled and used
- Memcached supports SASL
- IAM auth is not supported; IAM policies are only used for API security
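
The lazy loading and write-through strategies mentioned for Redis can be illustrated with a toy model. Plain dictionaries stand in for Redis and the database here, and the class and method names are hypothetical; this is a sketch of the idea, not of any ElastiCache client API.

```python
from typing import Dict, Optional

class CachedStore:
    """Toy cache in front of a database; plain dicts stand in for Redis and the DB."""

    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}

    def read_lazy(self, key: str) -> Optional[str]:
        # Lazy loading: only populate the cache on a miss.
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write_through(self, key: str, value: str) -> None:
        # Write through: update the DB and the cache together so reads stay fresh.
        self.db[key] = value
        self.cache[key] = value

    def write_db_only(self, key: str, value: str) -> None:
        # Without write-through (or invalidation), the cached copy goes stale.
        self.db[key] = value
```

Writing with `write_db_only` after a key has been read leaves the old value in the cache, which is the staleness problem an invalidation strategy has to address.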
Route 53
- It is a managed Domain Name System (DNS): a collection of rules and records that helps clients understand how to reach a server through its URL
- A: URL to IPv4
- AAAA: URL to IPv6
- CNAME: URL to URL (non-root domains only)
- Alias: URL to AWS resource (free and health checks)
Health checks
- Healthy: after 3 passed checks
- Unhealthy: after 3 failed checks
- Interval is 30 secs, and about 15 health checkers check the endpoint (so one check every 2 secs on average)
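
The threshold rule above (three consecutive passes or failures before the state flips) can be sketched as a small state machine. The class name and the choice to start in the healthy state are assumptions for illustration.

```python
class HealthTracker:
    """Flip state only after `threshold` consecutive results of the opposite kind."""

    def __init__(self, threshold: int = 3, healthy: bool = True) -> None:
        self.threshold = threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, check_passed: bool) -> bool:
        if check_passed == self.healthy:
            # The result agrees with the current state, so any streak is broken.
            self._streak = 0
        else:
            self._streak += 1
            if self._streak >= self.threshold:
                self.healthy = check_passed
                self._streak = 0
        return self.healthy
```

A single failed check in the middle of passing ones resets the streak, so an endpoint is not marked unhealthy because of one blip.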
Routing policies
- Latency: latency is evaluated between the user and the designated AWS region
- Geolocation: based on the user's location (requires a default policy for when no location matches)
- Weighted: controls the % of requests sent to each resource; can have health checks
- Simple: redirects to a single resource, no health checks; if multiple values are returned, the client chooses one
- Multi Value: returns up to 8 healthy records for each multi-value query
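
As a rough illustration of the weighted policy, each record is chosen with probability equal to its weight divided by the sum of all weights. The sketch below models that selection locally; the record names and the function are hypothetical and do not call Route 53.

```python
import random
from typing import Dict, Optional

def pick_weighted(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Choose a record with probability weight / sum(weights)."""
    rng = rng or random.Random()
    # Records with weight 0 never receive traffic.
    records = [name for name, weight in weights.items() if weight > 0]
    if not records:
        raise ValueError("at least one record needs a positive weight")
    return rng.choices(records, weights=[weights[name] for name in records], k=1)[0]
```

With weights such as `{"blue": 70, "green": 30}`, roughly 70% of the picks go to `blue` over many queries, which is how a gradual traffic shift between two versions can be controlled.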
Some domain registrars also come with DNS features
Elastic Beanstalk
- Managed service with a configurable deployment strategy
- It has 3 architecture models:
  - Single instance deployment: good for dev
  - LB + ASG: good for production and pre-production web apps
  - ASG only: good for production non-web apps
S3 Buckets
- Bucket names are globally unique, and buckets are defined at region level
- Naming convention: no underscore, no uppercase, not an IP, 3-63 characters long, starts with a letter or number
- Versioning is enabled at bucket level
- Each overwrite of a key creates a new version (think 1, 2, 3; the actual version IDs are opaque strings)
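
The naming convention above translates directly into a small validator. This sketch checks only the rules listed in these notes (the full S3 rules contain further restrictions), and the function name is hypothetical.

```python
import re

# Four dot-separated groups of digits look like an IPv4 address.
_IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")
# Lowercase letters, digits, dots and hyphens; must start with a letter or digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9.-]*")

def is_valid_bucket_name(name: str) -> bool:
    """Check only the rules from the notes: length, case, underscore, IP, first char."""
    if not 3 <= len(name) <= 63:
        return False
    if _IP_LIKE.fullmatch(name):
        return False
    return _ALLOWED.fullmatch(name) is not None
```

For instance, `my-bucket` passes, while `My_Bucket`, `192.168.1.1` and `ab` are rejected.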
Objects
- Keys are the full path
- Values have a max size of 5 TB; uploads of more than 5 GB require multi-part upload
- Tags: useful for security/lifecycle
- Metadata: system/user metadata
Encryption
- SSE-S3: object is encrypted server side with keys managed by S3; must set the header x-amz-server-side-encryption: AES256
- SSE-KMS: object is encrypted server side with keys managed by KMS (user control + audit trail); must set the header x-amz-server-side-encryption: aws:kms
- SSE-C: object is encrypted server side with keys managed outside of AWS; HTTPS must be used, and the encryption key must be provided with every request
- Client side encryption: the client must encrypt before sending and decrypt after retrieving
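
The two header values for SSE-S3 and SSE-KMS can be captured in a small helper that builds the headers for an upload request. The function name and the `mode` labels are hypothetical; the optional key-id header is included on the assumption that a specific KMS key should be pinned.

```python
from typing import Dict, Optional

def sse_headers(mode: str, kms_key_id: Optional[str] = None) -> Dict[str, str]:
    """Build the request headers for SSE-S3 or SSE-KMS uploads."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific KMS key instead of the default one.
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported mode: {mode}")
```

SSE-C and client-side encryption are left out of the sketch because they involve key material supplied by the caller rather than a fixed header value.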
Security
- Bucket policies can grant public access, force objects to be encrypted at upload, and allow or deny a set of API calls
- User based: uses IAM policies
- Resource based: bucket policies, or Object/Bucket Access Control Lists (ACL)
- S3 access logs can be stored in another S3 bucket
- API calls can be logged in CloudTrail
- MFA can be required for deleting object versions in versioned buckets
- Signed URLs are valid only for a limited time
S3 websites
- Host static websites
- URL: my-bucket.s3-website.aws-region.amazonaws.com or my-bucket.s3-website-aws-region.amazonaws.com
- Allow public reads to avoid 403 errors
- CORS controls which other origins may request data from your bucket, so other websites can't use your content (reducing your AWS cost!)
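Consistency Model
- GET 404 -> PUT 200 -> GET 404 ~> GET 200: if a key was requested before it existed, a read right after the first write may still return 404 for a while
- DELETE 200 -> GET 200: a read right after a delete may still return the object
- PUT 200 -> PUT 200 -> GET 200 (from the 1st): a read right after an overwrite may return the previous version
- These sequences describe S3's earlier eventual-consistency behaviour; since December 2020 S3 provides strong read-after-write consistency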
MFA-Delete
- Requires: versioning enabled on the bucket, the AWS CLI, and being the bucket owner
- Needed to: permanently delete an object version and suspend versioning
- Not required for: enabling versioning, listing deleted versions
- Default encryption: bucket policies are evaluated before default encryption
- Access logs: any request made to S3, from any account, allowed or denied, can be logged into another S3 bucket
- Cross-region replication: versioning must be enabled, and replication is async
- Pre-signed URLs: for downloads you can use the CLI, and for uploads you need the SDK
CloudFront
- It's a CDN: improves read performance by caching content at the edge
- Can provide SSL encryption
- Supports the RTMP protocol (videos/media)
- Signed URLs need the SDK to generate the URL (for shared content they should last a few minutes; for private content they can last years)
S3 Tiers
- S3 Standard (general purpose, GP): high availability and durability
- S3 Standard-Infrequent Access (IA): lower cost than GP (good for backups)
- S3 One Zone-Infrequent Access: lower cost than IA; supports SSL in transit and encryption at rest
- S3 Reduced Redundancy Storage (deprecated)
- S3 Intelligent-Tiering: small monthly monitoring and auto-tiering fee
- Glacier: cost is storage per month plus retrieval; each item is called an archive, and archives are stored in vaults
Lifecycle rules
- Transition actions: define when objects are transitioned to another storage class
- Expiration actions: S3 deletes expired objects after a configured time
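
To make the two kinds of action concrete, here is a toy model that picks the action an object of a given age would be subject to. The `Rule` class, the action labels, and the day thresholds are illustrative assumptions and do not mirror the S3 lifecycle configuration format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Rule:
    after_days: int
    action: str  # e.g. "transition:STANDARD_IA" or "expire" (labels are illustrative)

def action_for_age(age_days: int, rules: Sequence[Rule]) -> Optional[str]:
    """Return the action of the latest rule whose age threshold has been reached."""
    due = [rule for rule in rules if age_days >= rule.after_days]
    if not due:
        return None
    return max(due, key=lambda rule: rule.after_days).action
```

With rules at 30, 90 and 365 days, a 45-day-old object falls under the first transition, and a 400-day-old object is due for expiration.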
Snowball
- Physical data transport solution that helps move TBs or PBs of data in or out of AWS
- Uses KMS 256-bit encryption
- You need to request the device, install the client, and ship the device back
- Snowball Edge adds computational capability (process on the go with EC2 AMIs or Lambda functions)
- Snowmobile is great for 10 PB or more
Storage Gateway
- Bridge between on-premises data and cloud data in S3 (sits between the app and S3)
- File Gateway: file access over the NFS and SMB protocols; most recent data is cached; can be mounted on many servers
- Volume Gateway: volume / block storage over the iSCSI protocol; backed by S3 and EBS snapshots; most recent data is cached
- Tape Gateway: backup over the iSCSI protocol; uses a Virtual Tape Library backed by S3 and Glacier
Athena
- Serverless service to analyse data directly in S3
- Uses SQL; charged per query, based on the data scanned
SQS - Queue Model
- Default retention is 4 days (up to 14)
- Unlimited number of messages (body up to 256 KB)
- Low latency
- Messages can be duplicated or arrive out of order
- Messages can be delayed up to 15 mins (default 0 s)
- A consumer can poll up to 10 messages at a time; polled messages become invisible to other consumers for a configurable visibility timeout
- For failed messages, create a queue, designate it as the Dead Letter Queue (DLQ), and apply a redrive policy on the source queue
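
The visibility timeout is easier to see in a small simulation. The class below is an in-memory stand-in, not the SQS API: the names are hypothetical, time is passed in explicitly as `now`, and a 30-second timeout is assumed as the default.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout: float = 30.0) -> None:
        self.visibility_timeout = visibility_timeout
        self._visible: Deque[Tuple[int, str]] = deque()
        self._in_flight: Dict[int, Tuple[str, float]] = {}  # id -> (body, visible_again_at)
        self._next_id = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self._visible.append((self._next_id, body))
        return self._next_id

    def receive(self, now: float, max_messages: int = 10) -> List[Tuple[int, str]]:
        # Messages whose timeout expired without a delete become visible again.
        for msg_id, (body, deadline) in sorted(self._in_flight.items()):
            if now >= deadline:
                del self._in_flight[msg_id]
                self._visible.append((msg_id, body))
        batch: List[Tuple[int, str]] = []
        while self._visible and len(batch) < min(max_messages, 10):
            msg_id, body = self._visible.popleft()
            self._in_flight[msg_id] = (body, now + self.visibility_timeout)
            batch.append((msg_id, body))
        return batch

    def delete(self, msg_id: int) -> None:
        # The consumer deletes a message once it has been processed.
        self._in_flight.pop(msg_id, None)
```

A message that is received but never deleted reappears once the timeout has passed, which is why consumers should be idempotent and why repeated failures are usually routed to a DLQ.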
FIFO queues
- Available if the queue name ends with ".fifo"
- Allow de-duplication, and messages are delivered exactly once
- Message groups only need an extra tag on each message
SNS - Pub/Sub model
- A producer sends each message to one topic
- To publish with the SDK: create a topic, create a subscription, and publish to the topic
- SNS + SQS gives Fan Out: every subscribed queue receives the message (no data loss)
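
The Fan Out pattern can be sketched with an in-memory topic that copies each published message into every subscribed queue. The class and queue names are hypothetical, and plain lists stand in for SQS queues.

```python
from typing import Dict, List

class ToyTopic:
    """In-memory stand-in for a topic that copies each message to every subscriber queue."""

    def __init__(self) -> None:
        self._queues: Dict[str, List[str]] = {}

    def subscribe(self, queue_name: str) -> List[str]:
        # Each subscriber gets its own queue, so consumers do not compete for messages.
        return self._queues.setdefault(queue_name, [])

    def publish(self, message: str) -> int:
        for queue in self._queues.values():
            queue.append(message)
        # Report how many queues received a copy.
        return len(self._queues)
```

Because each service reads from its own queue, a slow or failing consumer does not cause the others to miss the message.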
Kinesis - Real time streaming model
- Alternative to Kafka; great for logs, IoT, metrics, big data
Kinesis Streams
- Data retention is 1 day (up to 7 days)
- Data can be replayed
- Once data is inserted it cannot be deleted (immutability)
- Billing is per shard
- One stream can have many shards; records are ordered per shard
Kinesis API
- Put records: each message sent gets a sequence number
- The same partition key always goes to the same shard, so choose keys that avoid a hot shard (hot partition)
- Use batching to reduce cost
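
Kinesis maps a partition key to a shard by hashing it with MD5 into a 128-bit integer and finding the shard whose hash key range contains that value. The sketch below assumes the hash space is split into equal-sized ranges, one per shard, which is a simplification; the function name is hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer

def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming equal-sized hash ranges."""
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Scale the 128-bit hash down to an index in [0, shard_count).
    return hash_key * shard_count // HASH_SPACE
```

Since the mapping depends only on the key, a key that dominates the traffic sends all of its records to one shard; spreading load therefore means choosing keys with many distinct values.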
Security
- Use IAM policies
- Encryption in flight with HTTPS and at rest with KMS
Firehose
- Near real time
- Fully managed (no UI)
- Automatic scaling
- Pay for the amount of data going through (and for format conversion)
- Loads into Redshift / S3 / Elasticsearch / Splunk
Amazon MQ
- Managed Apache ActiveMQ
- Doesn't scale as well as SQS/SNS
- Has a queue feature (like SQS) and a topic feature (like SNS)
- Runs on a dedicated machine
Lambda configuration
- Timeout: 3 secs by default (up to 900 secs)
- Env vars: 4 KB max in total
- Allocated memory: 128 MB to 3 GB
- Can be deployed within a VPC
- An IAM execution role must be attached to the Lambda function
- Disk capacity in /tmp is 512 MB
- Concurrency limit is 1000
- Deployment size is 50 MB compressed and 250 MB uncompressed
DynamoDB
Here's a quick cheat-sheet to remember all these services:
- CodeCommit: service where you can store your code. A similar service is GitHub
- CodeBuild: build and testing service in your CICD pipelines
- CodeDeploy: deploy the packaged code onto EC2 and AWS Lambda
- CodePipeline: orchestrate the actions of your CICD pipelines (build stages, manual approvals, many deploys, etc)
- CloudFormation: Infrastructure as Code for AWS. Declarative way to manage, create and update resources.
- ECS (Elastic Container Service): Docker container management system on AWS. Helps with creating micro-services.
- ECR (Elastic Container Registry): Docker image repository on AWS. Docker images can be pushed to and pulled from there
- Step Functions: Orchestrate / Coordinate Lambda functions and ECS containers into a workflow
- SWF (Simple Workflow Service): Old way of orchestrating a big workflow.
- EMR (Elastic Map Reduce): Big Data / Hadoop / Spark clusters on AWS, deployed on EC2 for you
- Glue: ETL (Extract Transform Load) service on AWS
- OpsWorks: managed Chef & Puppet on AWS
- ElasticTranscoder: managed service that converts media (video, music) into various optimized formats
- Organizations: hierarchy and centralized management of multiple AWS accounts
- Workspaces: Virtual Desktop on Demand in the Cloud. Replaces traditional on-premises VDI infrastructure
- AppSync: GraphQL as a service on AWS
- SSO (Single Sign On): One login managed by AWS to log in to various business SAML 2.0-compatible applications (Office 365 etc)