#AWS #SolutionArchitect

1. Introduction
2. S3 and Glacier Storage
3. EC2 and EBS
4. AWS VPC
5. ELB, CloudWatch, Auto Scaling
6. Identity and Access Management
7. Databases and AWS
8. SQS, SWF, SNS
9. Amazon Route 53
10. Amazon ElastiCache
11. Other Key Services
12. Security on AWS
13. AWS Risk and Compliance
14. Architecture Best Practices
15. Miscellaneous

1. Introduction to AWS

  • Cloud deployment models: 1) All-In 2) Hybrid
  • Each Region is a separate geographic area. Each region has multiple isolated locations known as Availability Zones.
  • Resources are not replicated across regions unless organizations choose to do so.
  • AZs are physically separated within a typical metropolitan region and located in lower-risk flood plains, connected through low-latency links.
  • AWS shares security responsibilities with organizations. AWS manages the underlying infrastructure and the organization can secure anything that it deploys in AWS.
  • Partial list of certifications and standards with which AWS complies:
    • Service Organization Control (SOC) 1, SOC 2, SOC 3
    • Federal Information Security Management Act (FISMA)
    • Payment Card Industry Data Security Standard (PCI DSS)
    • International Organization for Standardization (ISO) 9001, ISO 27001 and ISO 27018
  • Organizations have complete control over VPC, including selection of IP address range, creation of subnets, configuration of route tables and network gateways.
  • AWS Direct Connect allows organizations to establish a dedicated network connection from their data centers to AWS.
  • AWS Route 53 is a highly available and scalable Domain Name System (DNS) web service.
  • AWS Storage Gateway is a service that connects an on-premises software application with cloud-based storage.
  • AWS RedShift is a data warehouse service.
  • AWS Lambda is a zero-administration compute platform for back-end web developers that runs code on AWS cloud and provides a fine-grained pricing structure.
  • AWS Elastic Beanstalk lets developers simply upload application code; the service automatically handles all details including resource provisioning, load balancing, auto scaling and monitoring.

2. S3 and Glacier Storage

Amazon S3

  • S3 is object storage with a simple web interface
  • You pay only for the storage that you actually use
  • If "Requester Pays" is enabled
    • requester pays for data download from bucket
    • owner always pays for storing data
    • anonymous access to that bucket is disabled
  • "Transfer Acceleration"
    • is used to accelerate data uploads into an S3 bucket.
    • uses CloudFront's globally distributed AWS Edge location to route to S3 bucket over an optimized network path.
    • data transfer application must use one of the following two types of endpoints to access the bucket for faster data transfer: <bucket-name>.s3-accelerate.amazonaws.com or <bucket-name>.s3-accelerate.dualstack.amazonaws.com for the “dual-stack” endpoint.
    • You are charged only if there was a benefit in transfer times
  • Nearly any application running on AWS uses S3 directly or indirectly. For example:
    • durable target storage for Kinesis, Elastic MapReduce
    • storage for EBS and RDS snapshots
    • data staging or loading storage for RedShift and DynamoDB.
  • Use cases for S3: backup and archive, big data analytics, static web hosting, disaster recovery.
  • S3 lifecycle policies can help data automatically migrate to the appropriate storage class without application code modification.
  • Object storage vs Block Storage and File storage
    • Block storage (EBS) operates at a lower level - the raw storage device level - like a Storage Area Network (SAN). Block storage is accessed over a network using protocols such as iSCSI or Fibre Channel.
    • File storage (EFS) operates at a higher level - the operating system level - like Network Attached Storage (NAS). File storage is accessed over a network using protocols such as Common Internet File System (CIFS) or Network File System (NFS).
    • S3 is a cloud object storage; independent of server and accessed over the internet.
    • EBS provides block level storage for EC2 instances; EFS provides network attached file storage using NFS v4 protocol.
  • Objects are entities or files stored in S3
    • can store virtually any kind of data in any format.
    • Size up to 5 TB.
    • Consists of data (the file itself) and metadata (data about the file). The data portion is opaque to S3; AWS doesn't know or care what type of data you are storing.
    • 2 types of metadata: system and user.
    • System metadata is used by S3 itself; like date last modified, object size, MD5 digest.
    • User metadata is optional, can be used to tag data with attributes meaningful to you.
    • Each object is identified by a key. A key must be unique within a single bucket, but different buckets can contain objects with the same key.
    • Combination of bucket, key and optional version ID uniquely identifies an S3 object.
    • Each object can be addressed by a unique URL.
    • In the URL http://mybucket.s3.amazonaws.com/data/reports/finance.doc; mybucket is the bucket name and data/reports/finance.doc is the object key
    • AWS S3 provides 2 styles of paths for accessing S3 objects: virtual-hosted-style (http://mybucket.s3.amazonaws.com/data/reports/finance.doc) and path-style (http://s3.amazonaws.com/mybucket/data/reports/finance.doc)
    • With the virtual hosted style your files sit at the root of the bucket's domain, which is handy for applications that look for certain things, such as a favicon, at the domain root
    • Objects stored in a bucket will never leave the region in which they are stored unless you move them to another region or enable cross-region replication
    • S3 API is simple and includes a handful of operations (see the boto3 sketch after this list)
      • create/delete a bucket
      • write an object
      • read an object
      • delete an object
      • list keys in a bucket
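
A minimal boto3 sketch of those operations, assuming configured AWS credentials; the bucket name is hypothetical (bucket names must be globally unique):

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket-1234"  # hypothetical, must be globally unique

# Create a bucket (outside us-east-1 a LocationConstraint is required)
s3.create_bucket(Bucket=bucket,
                 CreateBucketConfiguration={"LocationConstraint": "eu-west-1"})

# Write, read, list and delete objects
s3.put_object(Bucket=bucket, Key="data/reports/finance.doc", Body=b"hello")
obj = s3.get_object(Bucket=bucket, Key="data/reports/finance.doc")
print(obj["Body"].read())

for item in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(item["Key"])

s3.delete_object(Bucket=bucket, Key="data/reports/finance.doc")
s3.delete_bucket(Bucket=bucket)  # a bucket must be empty before deletion
```
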
  • Bucket: Objects reside in a container called Bucket
    • simple flat folder with no file system hierarchy; no sub-buckets.
    • Can hold an unlimited number of objects.
    • bucket names are global and must be unique across all AWS accounts.
    • Up to 100 buckets per account by default.
    • Created in a specific Region that you choose, giving control over where data is stored.
    • You cannot create nested buckets
    • Bucket names cannot be changed after they have been created
  • Each object is identified by a unique user-specified key(filename)
  • Cannot mount a bucket, open a file, install an OS or run a database in S3
  • Objects are automatically replicated across multiple AZs in a region.
  • Use HTTPS for S3 API requests to ensure secure access.
  • Durability and availability are slightly different. Durability asks "will my data be there in the future?" while availability asks "can I access my data right now?"
  • Availability: 99.99% (Standard), 99.9% (IA), 99.5% (OneZone IA), 99.99% (RRS)
  • Durability: 11 nines (Standard), 11 nines (IA), 11 nines (OneZone IA), 99.99% (RRS)
  • If you don't need this high durability, you can opt for Reduced Redundancy Storage (RRS) at a lower cost. RRS offers 99.99% durability.
  • Accidental deletion or overwrite can be prevented using - versioning, MFA delete and cross-region replication.
  • S3 is eventually consistent; following operations may return old data
    • PUT new data to an existing key, then GET
    • DELETE an object, then GET
  • But offers read-after-write consistency for new object PUTs.
  • Access control
    • By default, only you can access the buckets and objects you create.
    • By default a bucket, its objects, and related sub-resources are all private
    • By default only a resource owner can access a bucket
    • The resource owner refers to the AWS account that creates the resource
    • With IAM the account owner rather than the IAM user is the owner
    • S3 provides both
      • coarse-grained access controls (S3 ACLs)
      • fine-grained access control (S3 bucket policies, IAM policies, query-string authentication)
    • ACLs provide coarse-grained permissions at the object or bucket level; a legacy approach, not recommended.
    • The recommended approach is to use bucket policies.
    • Using a bucket policy you can specify: who can access the bucket, from where (CIDR block or IP address) and when (time of day).
    • Bucket policies include an explicit reference to an IAM Principal. This principal can be associated with a different AWS account, so S3 bucket policies allow you to grant cross-account access to S3 resources.
    • You can use the AWS Policy Generator to create a bucket policy for your Amazon S3 bucket
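
As a small illustration, a bucket policy allowing reads only from a given IP range might be applied with boto3 like this; the bucket name and CIDR are hypothetical:

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical policy: allow GetObject only from a given CIDR range
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowGetFromOfficeCidr",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-example-bucket-1234/*",
        "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
}
s3.put_bucket_policy(Bucket="my-example-bucket-1234", Policy=json.dumps(policy))
```
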
  • A common use case of S3 storage is static website hosting. Since each S3 object has an URL, it is easy to convert a bucket into a website.
    • Create a bucket with the same name as desired website hostname.
    • Upload static files to the bucket.
    • Make all files public.
    • Enable static website hosting for the bucket. This includes specifying an Index document and an Error document.
    • The website will now be available at the S3 website URL: <bucket-name>.s3-website-<region>.amazonaws.com
    • Create a friendly DNS name in your own domain for the website using DNS CNAME or an Amazon Route 53 alias that resolves to the Amazon S3 website URL.
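
A sketch of the "enable static website hosting" step with boto3; the bucket name and document names are placeholders:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_website(
    Bucket="www.example.com",  # bucket name matches the desired hostname
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},  # served for directory requests
        "ErrorDocument": {"Key": "error.html"},     # served on 4xx errors
    },
)
```
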
  • Prefixes and delimiters
    • used to organize, browse and retrieve objects in a bucket hierarchically.
    • ex: reports/2018/jan/sales.pdf
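
For example, listing with a Prefix and the "/" Delimiter makes the flat keyspace behave like a folder hierarchy; bucket and prefix are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="my-example-bucket-1234",
                          Prefix="reports/2018/", Delimiter="/")

# "Folders" one level below the prefix come back as CommonPrefixes
for p in resp.get("CommonPrefixes", []):
    print(p["Prefix"])        # e.g. reports/2018/jan/
for obj in resp.get("Contents", []):
    print(obj["Key"])         # objects directly under reports/2018/
```
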
  • Storage classes
    • Classes
      • Standard: For general purpose, S3 standard is the place to start.
      • Standard - Infrequent access (Standard-IA)
        • Same durability, low latency and high throughput as S3 standard, but is designed for long-lived less frequently accessed data.
        • Has a lower per GB-month storage cost than Standard, but adds a minimum object size (128 KB), minimum duration (30 days) and a per-GB retrieval cost
        • Best suited for infrequently accessed data that is stored for longer than 30 days
      • Reduced Redundancy Storage (RRS)
        • lower durability (4 nines, 99.99%)
        • Best suited for derived data that can be easily reproduced such as image thumbnail.
        • Should not be used for critical data.
      • Glacier
        • It takes 3 to 5 hours to retrieve data from Glacier; the object is copied to S3 RRS. The original object remains in Glacier until explicitly deleted.
        • Allows you to retrieve up to 5% of the S3 data stored in Glacier for free each month; restores beyond the daily restore allowance incur cost.
  • Object Lifecycle Management
    • Using S3 lifecycle management rules, you can significantly reduce your storage costs by automatically transitioning data from one storage class to another or even automatically deleting objects after a period of time (see the sketch below).
    • Lifecycle rules are attached to the bucket and can apply to all objects in the bucket or only to objects specified by a prefix.
    • Bucket versioning is required only for lifecycle rules that act on noncurrent object versions.
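
A minimal lifecycle configuration sketch in boto3, assuming a hypothetical bucket and a reports/ prefix: objects transition to Standard-IA after 30 days, to Glacier after 90, and expire after a year:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket-1234",
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-old-reports",
        "Filter": {"Prefix": "reports/"},   # rule applies only to this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 365},        # delete after one year
    }]},
)
```
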
  • Encryption
    • For data in transit, use the SSL API endpoints.
    • For data at rest
      • Server-Side Encryption (SSE): S3 encrypts data at the object level as it writes it to disks in its data centers and decrypts it when you access it. Keys: SSE-S3 (S3-managed keys), SSE-KMS (AWS KMS keys), SSE-C (customer-provided keys)
      • All SSE use 256-bit AES (Advanced Encryption Standard)
      • Client Side Encryption(CSE) : encrypt data before sending to S3
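
A short sketch of requesting SSE per object with boto3; the bucket name and the KMS key alias are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: Amazon-managed keys (256-bit AES)
s3.put_object(Bucket="my-example-bucket-1234", Key="secret.txt",
              Body=b"data", ServerSideEncryption="AES256")

# SSE-KMS: encrypt with a KMS key you manage (hypothetical key alias)
s3.put_object(Bucket="my-example-bucket-1234", Key="secret-kms.txt",
              Body=b"data", ServerSideEncryption="aws:kms",
              SSEKMSKeyId="alias/my-app-key")
```
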
  • Versioning
    • protects data from accidental or malicious deletes
    • Object can be restored by referencing the version ID in addition to bucket and object key.
    • versioning is turned on at bucket level
    • Buckets can be in one of three states: unversioned (the default), versioning-enabled, or versioning-suspended. Once you version-enable a bucket, it can never return to an unversioned state. You can, however, suspend versioning on that bucket.
    • When you try to delete an object with versioning enabled a DELETE marker is placed on the object
    • You can delete the DELETE marker and the object will be available again
    • Objects that existed before enabling versioning will have a version ID of NULL
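
Enabling versioning and reading back a specific version might look like this in boto3; the bucket, key and version ID are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Turn versioning on at the bucket level (or "Suspended" to suspend it)
s3.put_bucket_versioning(
    Bucket="my-example-bucket-1234",
    VersioningConfiguration={"Status": "Enabled"},
)

# Restore/read an old version by referencing its version ID
obj = s3.get_object(Bucket="my-example-bucket-1234",
                    Key="report.doc", VersionId="EXAMPLE-VERSION-ID")
```
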
  • MFA Delete
    • Add another layer of data protection on top of bucket versioning.
    • Requires additional authentication in order to permanently delete an object version or change the versioning state of a bucket
    • Can only be enabled by root account.
  • Pre-signed URL
    • object owner can share objects by creating a pre-signed URL, using their own security credentials.
    • valid only for specified duration.
    • useful to protect against "content scraping" of web content such as media files stored in S3
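
Generating a pre-signed GET URL with boto3, valid for one hour; bucket and key are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-example-bucket-1234", "Key": "media/video.mp4"},
    ExpiresIn=3600,  # the URL expires after one hour
)
print(url)  # share this URL; it embeds a signature from your credentials
```
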
  • Multi-part upload API
    • supports upload or copy of large objects.
    • Multipart upload is a 3-step process: initiation, uploading the parts, and completion (or abort)
    • Once all parts are uploaded, S3 assembles the parts to create the object.
    • you should use multipart upload for objects larger than 100 MB, and it is required for objects larger than 5 GB
    • You can set a lifecycle policy on a bucket to abort incomplete uploads after a specified number of days.
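
A sketch of the three multipart steps using the low-level boto3 API (in practice the higher-level upload_file handles this automatically); bucket and key are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket-1234", "big/archive.bin"

# 1. Initiation
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

# 2. Upload the parts (every part except the last must be at least 5 MB)
parts = []
for i, chunk in enumerate([b"a" * 5 * 1024 * 1024, b"b" * 1024], start=1):
    resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                          PartNumber=i, Body=chunk)
    parts.append({"PartNumber": i, "ETag": resp["ETag"]})

# 3. Completion -- S3 assembles the parts into a single object
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                             MultipartUpload={"Parts": parts})
```
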
  • Range GETs
    • GET only a portion of an object from S3 or Glacier; use Range HTTP header in GET request
    • Useful for dealing with large objects when you have poor connectivity, or for downloading a known portion of a Glacier archive.
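
For example, fetching only the first kilobyte of an object via the Range header (names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")
# Fetch only bytes 0-1023 of the object
resp = s3.get_object(Bucket="my-example-bucket-1234", Key="big/archive.bin",
                     Range="bytes=0-1023")
print(resp["Body"].read())
```
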
  • Cross-region replication
    • asynchronously replicates all objects in a source bucket in one region to a target bucket in another region.
    • All metadata and ACLs associated with objects are also replicated.
    • If enabled on an existing bucket, only new objects will be replicated
    • To enable cross-region replication
      • enable versioning for both source and target bucket
      • use IAM policy to give S3 permission to replicate objects
  • Access Log
    • S3 server access logs can be enabled to track requests to an S3 bucket.
    • Disabled by default, can be enabled at bucket level.
    • Must specify target bucket to store the logs.
  • Event notification
    • Set at bucket level
    • Can notify when new objects are created, objects are removed or when S3 detects an RRS object was lost.
    • possible event notifications: SNS, SQS, Lambda function
  • S3 will scale automatically to support very high request rates, automatically re-partitioning your buckets as needed.
  • For GET intensive use cases (like static website hosting) use CloudFront distribution as a caching layer in front of S3.
  • For very read-intensive use cases with high request rates, performance and scalability of S3 can be improved by using randomness in the namespace by including a hash prefix to key names.
  • Where is S3 data stored?
    • Stored in min 3 AZs in the region where S3 is created.
    • Only for One-Zone IA stored redundantly in single AZ
  • Query in Place
    • S3 Select
      • an Amazon S3 feature that makes it easy to retrieve specific data from the contents of an object using simple SQL expressions without having to retrieve the entire object (see the sketch after this list).
      • You can use S3 Select to retrieve a subset of data using SQL clauses, like SELECT and WHERE, from objects stored in CSV, JSON, or Apache Parquet format
      • Also works with objects that are compressed with GZIP or BZIP2 (for CSV and JSON objects only), and server-side encrypted objects
    • Amazon Athena
      • Query service that makes it easy to analyze data in Amazon S3 using standard SQL queries.
      • You don’t even need to load your data into Athena; it works directly with data stored in any S3 storage class
    • RedShift Spectrum
      • A feature of Amazon Redshift that enables you to run queries against exabytes of unstructured data in Amazon S3 with no loading or ETL required.
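
A hedged S3 Select sketch in boto3, assuming a hypothetical CSV object with a header row; matching rows stream back as Records events:

```python
import boto3

s3 = boto3.client("s3")
resp = s3.select_object_content(
    Bucket="my-example-bucket-1234",
    Key="reports/sales.csv",
    ExpressionType="SQL",
    Expression=("SELECT s.region, s.amount FROM s3object s "
                "WHERE CAST(s.amount AS FLOAT) > 1000"),
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},  # header names columns
    OutputSerialization={"CSV": {}},
)
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode())
```
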
  • S3 Intelligent-Tiering
    • S3 storage class for data with unknown access patterns or changing access patterns that are difficult to learn.
    • delivers automatic cost savings by moving objects between two access tiers when access patterns change
    • One tier is optimized for frequent access and the other lower-cost tier is designed for infrequent access.
  • Storage class analysis: With Storage Class Analysis, you can analyze storage access patterns and transition the right data to the right storage class. This new S3 feature automatically identifies infrequent access patterns to help you transition storage to S3 Standard-IA
  • S3 inventory
    • S3 Inventory provides a CSV, ORC, or Parquet file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or prefix.
    • use S3 inventory to verify encryption and replication status of your objects to meet business, compliance, and regulatory needs
  • Batch Operations: S3 Batch Operations is a feature that customers can use to automate the execution, management, and auditing of specific S3 API requests or AWS Lambda functions across many objects stored in Amazon S3 at scale.
  • Object Lock:
    • a new Amazon S3 feature that blocks object version deletion during a customer-defined retention period so that you can enforce retention policies as an added layer of data protection or for regulatory compliance
    • S3 Object Lock protection is maintained regardless of which storage class the object resides in and throughout S3 Lifecycle transitions between storage classes
    • enables you to store objects using a "Write Once Read Many" (WORM) model.
  • Data transfer charges
    • No charge for data transferred between EC2 and S3 in the same region
    • Data transfer into S3 is free of charge
    • Data transferred to other regions is charged

Amazon Glacier

  • Suitable for "cold data".
  • In most cases data stored in Amazon Glacier consists of large TAR (Tape Archive) or ZIP files.
  • Designed for 11 nines of durability of objects over a given year.
  • Can be used both as a storage class for S3 and as an independent archival storage service.
  • Archives
    • data stored in archives
    • can contain up to 40 TB of data; unlimited number of archives.
    • assigned a system-generated unique archive ID at time of creation.
    • are automatically encrypted, and are immutable - cannot be modified after creation.
  • Vaults
    • container for archives, max 1,000 vaults per account.
    • once locked, a vault policy cannot be changed.
  • Vault Lock Policy can be used to enforce compliance control at vault level; like Write Once Read Many(WORM)
  • Vault Access Policy
    • a resource-based policy that you can attach directly to your S3 Glacier vault (the resource) to specify who has access to the vault and what actions they can perform on it
    • A Vault Lock policy can be made immutable and provides stronger enforcement of your compliance controls than a Vault Access Policy
  • Up to 5% of data can be retrieved free each month; you are charged beyond that.
  • Retrieval Options
    • Expedited: for archives up to 250 MB, retrieval time of 1-5 min (most expensive)
    • Standard: retrieval time of 3-5 hours
    • Bulk: retrieval time of 5-12 hours (cheapest, use for large quantities of data)
  • Following retrieval you have 24 hours to download your data
  • Uploading archives is synchronous
  • Downloading archives is asynchronous
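
A sketch of an asynchronous archive retrieval with boto3; the vault name and archive ID are placeholders, and the job completes hours later depending on the tier:

```python
import boto3

glacier = boto3.client("glacier")

# Kick off an asynchronous archive retrieval (Standard tier: 3-5 hours)
job = glacier.initiate_job(
    vaultName="my-vault",
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",  # the system-generated archive ID
        "Tier": "Standard",                 # or "Expedited" / "Bulk"
    },
)

# Poll the job; once complete, the output is downloadable for ~24 hours
status = glacier.describe_job(vaultName="my-vault", jobId=job["jobId"])
print(status["StatusCode"])  # "InProgress" until the retrieval finishes
```
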
  • Retrieval policies
    • let you define your own data retrieval limits
    • Retrieval policies apply to Standard retrievals.
    • Types
      • Free Tier: keep retrievals within your daily free tier allowance and incur no data retrieval cost
      • Max Retrieval Rate: control the peak retrieval rate by specifying a data retrieval limit with a bytes-per-hour maximum
      • No Retrieval Limit: all valid data retrieval requests are accepted; data retrieval costs vary based on your usage
    • With the Free Tier and Max Retrieval Rate policies, Glacier will not accept retrieval requests that would exceed the retrieval limits you defined.
  • Glacier Select: a feature that allows you to run queries on your data stored in Amazon S3 Glacier, without the need to restore the entire object to a hotter tier like Amazon S3
  • AWS recommends combining lots of small objects into an archive (e.g. a ZIP file) before uploading
  • There is a charge if you delete data within 90 days
  • Glacier vs S3
    • Glacier supports 40 TB archives while S3 supports 5 TB objects
    • Glacier archives are identified by system-generated archive IDs, while in S3 the user provides the object key.
    • Glacier archives are automatically encrypted, while encryption at rest is optional in S3

3. EC2 and EBS

EC2

  • There are 2 concepts key to launching AWS instances
    • Amount of virtual H/W allocated to the instance: Instance Type
    • S/W installed on the instance: AMI
  • Instance types vary along the following dimensions: 1) virtual CPUs 2) Memory 3) Network performance 4) Storage - size and type
  • m4 family provides a balance of all dimensions.
  • Instance families (5)
    • General Purpose
    • compute optimized: for heavy processing like batch processing
    • memory optimized: cache, in-memory database etc.
    • storage optimized: for workload requiring fast SSD storage
    • Accelerated computing(GPU based) instances: for graphic and general purpose GPU compute workloads
  • Each of these families have multiple instance types like
    • General Purpose (T2, M3, M4)
    • compute optimized (C3, C4)
    • memory optimized: (X1, R3, R4)
    • storage optimized: (I3, D2)
    • Accelerated computing(GPU based) - P2, G3, F1
  • For workloads requiring greater network performance, many instances type support enhanced networking. Enabling enhanced networking on an instance involves ensuring correct drivers are installed and modifying an instance attribute.
  • Supports enhanced networking capabilities using SR-IOV. Pre-requisite for enhanced networking
    • Instances must be launched from a HVM AMI
    • Instances must be launched in a VPC
  • Features of enhanced networking
    • More packets per second
    • low latency
    • less jitter
  • AMI defines aspects of software state at instance launch including:
    • template of root volume for instance (OS, application or system software, initial state of patches)
    • Launch permissions that control which AWS accounts can use the AMI to launch instances
    • A block device mapping that specifies the volumes to attach to the instance when it’s launched
  • Sources of AMI:
    • Published by AWS
    • AWS marketplace
    • Generated from existing instances
    • Upload virtual servers.
  • Instances can be accessed over the web using
    • Public DNS
      • generated automatically and cannot be specified by the user
      • persists only while the instance is running
      • cannot be transferred to another instance
    • Public IP
      • assigned from addresses reserved by AWS and cannot be specified by the user
      • persists only while the instance is running
      • cannot be transferred to another instance
    • Elastic IP
      • Has to be allocated and then assigned to an instance
      • Persists until customer releases it and not tied to the lifetime of an individual instance
      • can be transferred between instances, can be shared externally without coupling clients to a particular instance
  • EC2 uses public key cryptography to encrypt/decrypt login information.
  • The 2 keys together are called a key pair; AWS stores the public key and the private key (.pem file) is kept by the customer.
  • Store the private key securely.
  • IAM Role
    • IAM roles can be attached, modified, or replaced at any time
    • Only one IAM role can be attached to an EC2 instance at a time
    • IAM roles are universal and can be used in any region
  • when an EC2 instance is stopped/started
    • the EIP remains associated with the instance
    • the underlying host computer will be changed
  • Security Group
    • controls traffic based on port, protocol, and source/destination.
    • Have different capabilities depending on whether they are associated with Amazon VPC or Amazon EC2-Classic
      • EC2-Classic security group: controls only incoming instance traffic
      • VPC security group: controls both outgoing and incoming instance traffic
    • is default deny; it doesn't allow any traffic that is not explicitly allowed
    • when an instance is associated with multiple security groups, the rules are aggregated and all traffic allowed by each is allowed.
    • is a stateful firewall; an outgoing message is remembered so that the response is allowed without an explicit inbound rule.
    • applied at instance level, as opposed to a traditional on-premises firewall. So instead of having to breach a single perimeter to access all instances, an attacker would have to breach security for each instance (see the sketch below).
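
A minimal sketch of the default-deny model: a new security group allows nothing in until explicit allow rules are added; the VPC ID is hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")
sg = ec2.create_security_group(
    GroupName="web-sg", Description="Allow HTTP/HTTPS in",
    VpcId="vpc-0abc1234")  # hypothetical VPC ID

# Nothing is allowed inbound yet; add explicit allow rules only
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
)
```
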
  • Bootstrapping
    • The process of providing code to be run on an instance at launch is called bootstrapping.
    • One of the parameters when an instance is launched is a string value called UserData. It is stored with the instance and is not encrypted, so it is important not to include any secrets such as passwords.
    • Userdata can be script to perform tasks like
      • Applying patch and update on OS
      • installing application software
      • enrolling in a directory service
      • installing Chef or Puppet and assigning the instance a role so the configuration management software can configure the instance
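
Passing such a bootstrap script as UserData when launching an instance might look like this; the AMI ID and key pair name are hypothetical, and boto3 base64-encodes UserData for you:

```python
import boto3

ec2 = boto3.client("ec2")

# Runs as root at first boot; stored unencrypted -- never put secrets here
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
"""

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI
    InstanceType="t2.micro",
    MinCount=1, MaxCount=1,
    KeyName="my-key-pair",            # hypothetical key pair
    UserData=user_data,
)
```
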
  • VM export/import
    • allows import of VMs from existing environment as an EC2 instance and export them to on-premise environment.
    • Only imported EC2 instances can be exported; instances launched within AWS from AMIs cannot be exported.
    • Can import existing virtual machines as
      • EC2 instances
      • AMIs
  • Instance Metadata is data (like instance id, instance type, security groups) about the instance that can be used to control or manage running instance. Available at http://169.254.169.254/latest/meta-data/
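
For example, from inside an instance the metadata can be read over plain HTTP; this sketch assumes the older IMDSv1 access style (IMDSv2 additionally requires a session token):

```python
import urllib.request

BASE = "http://169.254.169.254/latest/meta-data/"

# Works only from inside an EC2 instance (link-local address)
for field in ("instance-id", "instance-type", "security-groups"):
    with urllib.request.urlopen(BASE + field, timeout=2) as resp:
        print(field, "=", resp.read().decode())
```
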
  • The following aspects of an instance can be modified after launch
    • Instance Type: stop the instance, change the instance type and restart the instance
    • Security Groups: If an instance is running in a VPC, you can change the associated security groups while the instance is running. For EC2-Classic, associated security groups cannot be changed after launch.
  • Accidental termination of instances can be prevented by enabling Termination Protection. With this, an instance cannot be terminated until Termination Protection is disabled.
  • Pricing options
    • On-Demand:
      • most flexible without any up-front commitment, least cost effective.
      • This is the price per hour published in AWS website.
    • Reserved:
      • Are of 2 types: i) Standard Reserved (commitment of 1 or 3 years) ii) Scheduled Reserved (accrue charges hourly but billed monthly, 1-year commitment)
      • Needs commitment of 1-3 years
      • when purchasing, the customer specifies instance type and AZ. Cost is based on term commitment and payment option
      • Saves up to 75% off the on-demand hourly rate.
      • Standard reserved has 3 payment options: All upfront, partial upfront, no upfront
    • Spot:
      • for workloads that are not time critical and are tolerant to interruption; most cost efficient.
      • cannot use encrypted volumes
  • Reservation can be modified in one or more following ways
    • switch AZ within the same region
    • change between EC2-VPC and EC2-Classic
    • change instance size within the same instance type (Linux only, not possible on Windows)
    • Instance size changes are allowed only for Linux and not for Windows. Linux RIs cannot be changed to RedHat or SUSE.
  • Spot instances will run until
    • customer terminates
    • spot price goes above customer's bid price
    • not enough capacity to meet demand
  • when AWS needs to terminate a Spot instance, a 2 minute warning is provided.
  • Tenancy options
    • Shared: default; single host machine can have instances from different customers.
    • Dedicated instance: instances run on hardware that is dedicated to a single customer.
    • Dedicated host: instances run on a specific hardware dedicated to a single customer.
  • Placement Groups
    • logical grouping of instances within a single AZ.
    • applications can use a low-latency 10 Gbps network.
    • to fully use this, choose instance types that support enhanced networking and 10 Gbps network performance.
    • Types:
      • Cluster: cluster placement group is a logical grouping of instances within a single Availability Zone
      • Spread: spreads instances across underlying hardware (can span AZs)
    • Instances within a placement group can communicate with each other using private or public IP addresses. Best performance is achieved when using private IP addresses
    • Spread placement groups are not supported for Dedicated Instances or Dedicated Hosts
    • Need to provision the number of instances you need at one time
    • Cannot merge placement groups
    • Cannot move existing instances into a placement group
  • Monitoring
    • EC2 status checks are performed every minute and each returns a pass or a fail status
    • System status checks detect (StatusCheckFailed_System) problems with your instance that require AWS involvement to repair
    • Instance status checks (StatusCheckFailed_Instance) detect problems that require your involvement to repair
    • Status checks are built into Amazon EC2, so they cannot be disabled or deleted
  • Instance store or Ephemeral storage
    • provides temporary block level storage to EC2 instances
    • located on disks that are physically attached to host
    • cost included in cost of EC2 instance, so very cost effective
    • Data in instance store is lost when
      • underlying disk drive fails
      • instance stops (data persists if the instance reboots)
      • instance terminates
  • EC2 billing
    • Instance usages are billed for any time your instances are in a "running" state.
    • If you no longer wish to be charged for your instance, you must "stop" or "terminate" the instance to avoid being billed for additional instance usage.
  • EC2 Compute Unit (ECU): Amazon EC2 uses a variety of measures to provide each instance with a consistent and predictable amount of CPU capacity. In order to make it easy for developers to compare CPU capacity between different instance types, we have defined an Amazon EC2 Compute Unit. The amount of CPU that is allocated to a particular instance is expressed in terms of these EC2 Compute Units.
  • IP addresses release
    • Private IPv4 address remains associated with the network interface when the instance is stopped and restarted, and is released when the instance is terminated.
    • Instance's public IP address is released when it is stopped or terminated. Your stopped instance receives a new public IP address when it is restarted.
    • Instance's public IP address is released when you associate an Elastic IP address with it. When you disassociate the Elastic IP address from your instance, it receives a new public IP address.
    • On Reboot, both Private and Public IP address is retained.
  • You don't need an Elastic IP address for every instance. Private and public IP addresses are adequate for many applications where you do not need a long-lived internet-routable endpoint. Compute clusters, web crawling, and backend services are all examples of applications that typically do not require Elastic IP addresses.
  • Windows EC2 instances do not work with EFS
  • Data transfer across regions: Data between instances in different regions is charged (in and out)
  • Data transfer within region: Regional Data Transfer rates apply if at least one of the following is true, but are only charged once for a given instance even if both are true:
    • The other instance is in a different Availability Zone, regardless of which type of address is used
    • Public or Elastic IP addresses are used, regardless of which Availability Zone the other instance is in
  • All EC2 instances are assigned a Private IP, but Public IP is assigned only for instances in public subnets(VPC)
  • Eth0 is the primary network interface and cannot be moved or detached. By default Eth0 is the only Elastic Network Interface (ENI) created with an EC2 instance when launched
  • Non-root volumes can be encrypted
  • Root volumes can be encrypted if the instance is launched from an encrypted AMI

EBS

  • A volume is automatically replicated within the same AZ [and not across AZs]
  • Multiple volumes can be attached to a single EC2 instance, but a single volume can be attached to only one EC2 instance at a time.
  • EBS volumes must be in the same AZ as the instances they are attached to
  • Termination protection is turned off by default and must be manually enabled (keeps the volume/data when the instance is terminated)
  • Root EBS volumes are deleted on termination by default
  • Extra non-boot volumes are not deleted on termination by default
  • The root device is created under /dev/sda1 or /dev/xvda
  • Throughput optimized EBS volumes cannot be a boot volume
  • Each instance that you launch has an associated root device volume, either an Amazon EBS volume or an instance store volume
  • When rebooting the instance, data will not be lost for either type (instance store and EBS)
  • Volume categories
    • SSD(Solid State Disk) backed volumes: optimized for transactional workloads involving frequent read/write operations with small I/O size, where key performance attribute is IOPS. Use cases: database workload, boot volumes and workload that need high IOPS.
    • HDD(Hard Disk Drive) backed volumes: optimized for large streaming workloads where throughput is key performance attribute and not IOPS. Use cases: big-data workloads, large I/O sizes and sequential I/O patterns.
  • Types of volume
    • General purpose SSD (Solid State Disc)
      • Performance: 3 IOPS per GB, capping at 10,000 IOPS
      • billed based on space provisioned, regardless of data actually stored.
      • use cases: small/medium database, dev and test environment
    • Provisioned IOPS SSD
      • size: 4 GB to 16 TB;
      • while provisioning, besides the size you specify the desired number of IOPS, up to the lower of 30 times the volume size in GB or 20,000 IOPS.
      • Pricing is based on the size of the volume and the IOPS reserved.
      • use cases: large RDBMS/NoSQL database, critical business applications
    • Cold HDD
      • Use cases: throughput oriented for large volume of data, less frequently accessed
    • Throughput Optimized HDD
      • Use cases: Big Data, Streaming, Logs
    • Magnetic [old generation; not used much]
      • lowest performance, lowest cost
      • size: 1 GB to 1 TB; performance: 100 IOPS average
      • billed based on space provisioned, regardless of data actually stored.
      • use cases: sequential reads, workload where data is accessed infrequently
  • EBS-optimized instance
    • uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
    • When EBS-optimized instance is used, you pay an additional hourly charge
    • Available for select instance types
  • EBS volumes are resizable; with Elastic Volumes the size can be changed even while the volume is attached to a running instance (previous-generation volumes required stopping the instance first).
  • Volumes that are created from encrypted snapshots are automatically encrypted
  • Snapshots
    • incremental backups, changes since most recent snapshots are saved.
    • snapshot data is stored in S3; action of taking snapshot is free, only storage cost for data.
    • Deleting a snapshot removes only the data not needed by any other snapshot
    • when a snapshot is requested, the point-in-time snapshot is created immediately and the volume can continue to be used. The snapshot may remain in pending status until all modified blocks are transferred to S3.
    • while snapshots are stored in S3, they are in AWS-controlled storage and not in your account's S3 buckets. This means you cannot manipulate them like other S3 objects.
    • snapshots are constrained to the region in which they are created; new volumes can be created from them in the same region. If a snapshot is needed in another region, a copy can be created in the target region.
    • When a volume is created from a snapshot, the volume is created immediately but data is loaded lazily. Volumes can be accessed upon creation, and if requested data has not yet been restored, it is restored upon first request. Best practice is to initialize a volume created from a snapshot by accessing all the blocks in the volume.
    • snapshots can be used to increase the size of an EBS volume. Take a snapshot of the volume, create a new volume of the desired size from the snapshot, and then replace the original volume with the new volume.
    • snapshots taken from encrypted volumes are automatically encrypted, as are volumes created from encrypted snapshots.
    • Snapshots can only be accessed through the EC2 APIs
    • EBS volumes are AZ specific but snapshots are region specific
    • Snapshots can be taken of non-root EBS volumes while running
    • To take a consistent snapshot, writes must be stopped (paused) until the snapshot is complete – if that's not possible the volume needs to be detached, or if it's an EBS root volume the instance must be stopped
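
A sketch of creating a snapshot and copying it to another region; the volume ID and regions are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Point-in-time, incremental snapshot of a volume
snap = ec2.create_snapshot(VolumeId="vol-0abc1234",  # hypothetical volume
                           Description="nightly backup")

# Snapshots are regional; copy to another region if a volume is needed there
ec2_west = boto3.client("ec2", region_name="us-west-2")
ec2_west.copy_snapshot(SourceRegion="us-east-1",
                       SourceSnapshotId=snap["SnapshotId"])
```
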
  • If DeleteOnTermination flag is set to true, volume should be detached before instance is terminated. This volume can then be attached to another instance and data recovered.
  • Security: EBS encryption enables data at rest security by encrypting your data using Amazon-managed keys, or keys you create and manage using the AWS Key Management Service (KMS). The encryption occurs on the servers that host EC2 instances, providing encryption of data as it moves between EC2 instances and EBS storage.
  • Encryption
    • Snapshots of encrypted volumes are encrypted automatically
    • You can share snapshots, but if they’re encrypted it must be with a custom CMK key
    • You cannot make encrypted snapshots public

4. Amazon VPC

  • custom-defined virtual network within the AWS cloud.
  • Default limit of VPCs that a customer may have in a region is 5.
  • Customers control various aspects of the VPC, including selection of the IP address range, creation of subnets, configuration of route tables, network gateways and security settings.
  • You can create multiple VPCs within the same Region
  • While creating a VPC, you must specify the IPv4 address range by choosing a CIDR block; the address range cannot be changed once the VPC is created.
  • VPC address range may be as large as /16 and as small as /28.
  • VPC service was released after EC2, because of this there are 2 different network platforms available: EC2-classic and EC2-VPC.
  • AWS account that support EC2-VPC has a default VPC created in every Regions with default subnet created in each AZ.
  • VPC consists of following components: 1) Subnets 2) Route Table 3) DHCP 4) Security group 5) Network ACLs
  • VPC has following optional components: 1) Internet Gateway 2) EIP 3) Elastic Network Interface 4) Endpoints 5) Peering 6) NAT instance and NAT Gateway 7) Virtual Private Gateway(VPG), Customer Gateway (CGW) and Virtual Private Network (VPN)
  • Subnets
    • A segment of VPC's IP address where you can launch resources like EC2 instances, RDS databases.
    • AWS reserves the first 4 IP addresses and the last IP address of each subnet for internal networking purposes.
    • Subnets reside inside an AZ and cannot span AZs.
    • Smallest subnet that you can create is /28 (16 IP addresses)
    • Types of subnet
      • Public: associated route table directs traffic to the VPC's IGW
      • Private: associated route table does not direct traffic to the VPC's IGW
      • VPN-only: associated route table directs traffic to the VPC's VPG (Virtual Private Gateway) and not the IGW.
    • Default VPCs contain 1 public subnet in each AZ within the region, with netmask of /20.
    • The default VPC has all-public subnets
  • VPC Wizard
    • VPC with a single public subnet
    • VPC with Public and Private subnets
    • VPC with Public, Private subnets and Hardware VPN access
    • VPC with Private subnet only and Hardware VPN access
  • Egress-only Internet Gateway: A stateful gateway to provide egress only access for IPv6 traffic from the VPC to the Internet
  • VPC Flow Logs
    • Flow Logs capture information about the IP traffic going to and from network interfaces in a VPC
    • Flow log data is stored using Amazon CloudWatch Logs
    • Flow logs can be created at the following levels: VPC, Subnet, Network Interface
  • Route Tables
    • Logical construct within a VPC that contains a set of rules determining where network traffic is directed.
    • Contains a default route called the local route that enables communication within the VPC; this route cannot be modified or removed.
    • Each VPC has an implicit router and a main route table; you can create additional custom route tables in your VPC.
    • If you don't associate a subnet with a route table, the subnet uses the main route table.
    • Each route in a routing table specifies a destination CIDR and a target; for example traffic destined for the external corporate network 172.16.0.0/12 is targeted for the virtual private gateway
    • AWS uses the most specific route that matches the traffic to determine how to route the traffic.
    • Each subnet can only be associated with 1 route table.
  • Internet Gateway
    • Enables communication between instances in VPC and internet.
    • 1:1 relation between VPC and IGW
    • To create a public subnet with internet access
      • attach an IGW to the VPC
      • create a subnet route table rule to send all non-local traffic to IGW.
      • configure NACLs and security group to allow relevant traffic to flow to and from the instance
    • When traffic is sent from an instance to the internet, the IGW translates the private address to the instance's public IP address (or EIP). When the instance receives traffic from the internet, the IGW translates the destination address (public IP) to the instance's private IP address and forwards the traffic to the VPC instance.
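
A hedged boto3 sketch of the public-subnet recipe above: create and attach an IGW, then route all non-local traffic to it; VPC and subnet IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Create an IGW and attach it to the VPC (1:1 relationship)
igw = ec2.create_internet_gateway()["InternetGateway"]
ec2.attach_internet_gateway(InternetGatewayId=igw["InternetGatewayId"],
                            VpcId="vpc-0abc1234")  # hypothetical VPC

# Route all non-local traffic to the IGW, making the subnet public
rt = ec2.create_route_table(VpcId="vpc-0abc1234")["RouteTable"]
ec2.create_route(RouteTableId=rt["RouteTableId"],
                 DestinationCidrBlock="0.0.0.0/0",
                 GatewayId=igw["InternetGatewayId"])
ec2.associate_route_table(RouteTableId=rt["RouteTableId"],
                          SubnetId="subnet-0def5678")  # hypothetical subnet
```
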
  • Dynamic Host Configuration Protocol (DHCP) Option Sets
    • DHCP provides a standard for passing configuration information to hosts on TCP/IP network. Options field of a DHCP message contains the configuration parameters.
    • Allows you to direct EC2 host name assignment to your own resources.
    • Some of these parameters are the domain name and domain name server.
    • DHCP option sets are associated with your AWS account so that you can use them across all of your VPCs.
    • When a VPC is created, automatically a set of DHCP options are created and associated with the VPC.This set includes two options: domain-name-servers=AmazonProvidedDNS and domain-name=domain-name-for-your-region.
    • When you launch an instance into a VPC, we provide the instance with a private DNS hostname, and a public DNS hostname if the instance receives a public IPv4 address. If domain-name-servers in your DHCP options is set to AmazonProvidedDNS, the public DNS hostname takes the form ec2-public-ipv4-address.compute-1.amazonaws.com for the us-east-1 region, and ec2-public-ipv4-address.region.compute.amazonaws.com for other regions.
    • Each VPC has exactly 1 DHCP option set assigned to it.
  • Elastic IP Addresses(EIPs)
    • AWS maintains a pool of static, public IP addresses in each region and makes them available to be associated with resources within a VPC.
    • EIPs are region specific (an EIP in one region cannot be assigned to an instance in a VPC in another region)
    • You must first allocate an EIP within a VPC and then assign it to an instance.
    • One-to-one relationship between EIPs and network interfaces.
    • You can move EIPs from one instance to another, either in the same VPC or a different VPC within the same region.
  • Elastic Network Interfaces(ENIs)
    • Virtual network interface that can be attached to an instance in a VPC.
    • Associate with a subnet upon creation.
    • Can have 1 public IP address and multiple private IP addresses. In case of multiple private IP addresses, one is primary.
    • Assigning a second network interface to an instance via an ENI allows it to be dual-homed (have network presence in different subnets)
    • ENIs help create a management network, use network and security appliances in a VPC, create dual-homed instances, or create a low-budget, high-availability solution.
    • You can attach a network interface to an EC2 instance in the following ways:
      • When it's running (hot attach)
      • When it's stopped (warm attach)
      • When the instance is being launched (cold attach).
  • Endpoints
    • Enables you to create a private connection between a VPC and another AWS service without requiring access over the internet or a NAT instance, VPN connection or AWS Direct Connect.
    • Can create multiple endpoints for a single service.
    • Two types of endpoint
      • Interface Endpoint: Interface Endpoints are Elastic Network Interfaces (ENI) with private IP addresses.
      • Gateway Endpoint: a gateway endpoint is a gateway target for a specific route in the route table. Amazon S3 and DynamoDB are the only services supported by gateway endpoints
  • Peering
    • A peering connection is a networking connection between 2 VPCs that enables instances in either VPC to communicate with each other as if they belong to the same network.
    • Peering is a one-to-one relationship between VPCs; two VPCs cannot have two peering agreements between them.
    • Peering connections do not support transitive routing.
    • Cannot create a peering connection between VPCs that have matching or overlapping CIDR blocks
    • Cannot have more than 1 peering connection between the same 2 VPCs at the same time.
    • Cannot have overlapping IP address ranges
    • Does support DNS; peers can communicate using their DNS names.
  • Security Groups
    • Virtual stateful firewall that controls inbound and outbound network traffic to EC2 instances and AWS resources.
    • All EC2 instances must be launched into a security group.
    • Up to 500 security groups per VPC; up to 50 inbound and 50 outbound rules per security group.
    • Allow rules can be specified, but not deny rules. This is an important difference between security groups and NACLs.
    • security groups are stateful; responses to allowed inbound traffic are allowed to flow outbound regardless of outbound rules.
    • By default, no inbound traffic is allowed until you add an inbound rule, while new security groups have an outbound rule that allows all traffic.
    • You can change the security groups associated with an instance after launch, and the change takes effect immediately.
  • Network Access Control List(NACL)
    • Stateless firewall at subnet level
    • Applied to all instances in a subnet
    • Allows both allow and deny rules
    • AWS processes rules in number order (starting with the lowest numbered rule) when deciding whether to allow traffic.
    • Comparison of Security Group and NACL
      • SG at instance level; NACL at subnet level, applied to all instances in a subnet.
      • SG supports only allow rules; NACL supports both allow and deny rules
      • SG is stateful; NACL is stateless
      • SG evaluates all rules before deciding whether to allow traffic; NACL rules are processed in numbered order starting with the lowest number.
  • Network Address Translation(NAT) instances and NAT Gateways
    • Instances launched in private subnets cannot communicate with the internet through an IGW. AWS provides NAT instances and NAT Gateways to allow instances in private subnets to gain internet access.
    • A NAT Gateway provides better availability, higher bandwidth and needs less administration than NAT instances.
    • Difference with Internet Gateway(IGW)
      • IGW allows instances in public subnet to access internet and vice versa. Two way communication.
      • NAT allows instances in a private subnet to access the internet, but not the other way round. One-way communication.
    • Setting up NAT gateway
      • create a public subnet to host the NAT Gateway. The route table of that subnet should contain a route to the IGW
      • The route table of the private subnet (containing the instances that need internet access) should have a route to the NAT Gateway. See the sketch below.
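
A sketch of that setup in boto3, with subnet and route table IDs as placeholders: allocate an EIP, create the NAT Gateway in the public subnet, and point the private subnet's default route at it:

```python
import boto3

ec2 = boto3.client("ec2")

# A NAT Gateway lives in a PUBLIC subnet and needs an Elastic IP
eip = ec2.allocate_address(Domain="vpc")
nat = ec2.create_nat_gateway(SubnetId="subnet-0abc1234",      # public subnet
                             AllocationId=eip["AllocationId"])

# Private subnet's route table sends internet-bound traffic to the NAT Gateway
ec2.create_route(RouteTableId="rtb-0def5678",                 # private route table
                 DestinationCidrBlock="0.0.0.0/0",
                 NatGatewayId=nat["NatGateway"]["NatGatewayId"])
```
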
  • Virtual Private Gateway(VPG), Customer Gateway(CGW) and Virtual Private Network(VPN)
    • VPG is the VPN connector on the AWS side of the VPN connection between two networks.
    • CGW is the physical device or software application on the customer side of the VPN connection.
    • The last step is to create the VPN tunnel; the tunnel is established after traffic is generated from the customer's side of the VPN connection.
    • VPN tunnel has to be initiated from the CGW to VPG
    • VPN connection consists of 2 tunnels for high availability to the VPC
  • IPsec is the security protocol supported by AWS

5. Elastic Load Balancing, CloudWatch, Auto Scaling

Elastic Load Balancing

  • Distributes traffic across a group of EC2 instances in one or more AZs.
  • Supports routing and load balancing of HTTP, HTTPS, TCP and SSL traffic to EC2 instances.
  • Provides a stable single CNAME entry point for DNS configuration.
  • Supports both internet-facing and internal application-facing load balancing.
  • Supports health checks for EC2 instances to ensure traffic is not routed to unhealthy or failing instances.
  • Seamlessly integrates with Auto Scaling service to automatically scale EC2 instances behind the load balancer.
  • Best practice is to reference a load balancer by its DNS name, instead of IP address.
  • Only 1 subnet per AZ can be enabled for each ELB
  • ELB in VPC supports IPv4 addresses only. ELB in EC2-classic support both IPv4 and IPv6 addresses.
  • ELB consists of 2 components: 1) Load Balancers, which monitor traffic and handle incoming requests. 2) Controller Service, which monitors the load balancers, adding and removing capacity as needed and verifying that load balancers are functioning properly.
  • Type of load balancer
    • Classic: takes routing decisions at either transport layer (TCP/SSL) or application layer (HTTP/HTTPS). Requires fixed relationship between load balancer port and container instance port. For example, it is possible to map the load balancer port 80 to the container instance port 3030 and the load balancer port 4040 to the container instance port 4040. However, it is not possible to map the load balancer port 80 to port 3030 on one container instance and port 4040 on another container instance.
    • Application:
      • takes routing decisions at the application layer (HTTP/HTTPS). It is context aware and takes into consideration application behaviors like content type, cookie data, user location etc.
      • Supports dynamic host port mapping.
      • layer 7 load balancer
      • WebSockets and Secure WebSockets support is available
      • Request tracing is enabled by default
      • Application Load Balancer support HTTPS termination, must install an SSL certificate on your load balancer.
      • You can configure rules for each of the listeners you configure for the load balancer. A rule includes a condition and a corresponding action if the condition is satisfied. The condition can be a URL path of a service (e.g. /img) and the action is forward.
      • Cross-zone load balancing is already enabled by default in Application Load Balancer.
      • Supports 3 types of redirection: HTTP to HTTP, HTTP to HTTPS, and HTTPS to HTTPS
      • If you need flexible application management and TLS termination use this.
      • For ALB at least 2 subnets must be specified
    • Network:
      • layer 4 load balancer
      • For NLB only one subnet must be specified (recommended to add at least 2)
      • takes routing decisions at the transport layer (TCP/SSL) based on network variables like IP address and destination port. It is context-less and doesn't take application behavior into consideration.
      • Supports dynamic host port mapping.
      • If extreme performance and static IP is needed for your application then use this.
  • Type of load balancer
    • Internet-facing: Has a publicly resolvable DNS name. Requests come from client outside VPC.
    • Internal: Doesn't have a publicly resolvable DNS name. In a multi-tier application, use load balancers between the tiers of the application. Use internal load balancers to route traffic from clients within the VPC.
    • Https load balancer:
      • Enables traffic encryption between load balancer and clients that initiate HTTPS sessions.
      • To use SSL, need to install an SSL certificate on the load balancer that it uses to terminate the connection and then decrypt requests from clients before sending requests to back-end EC2 instances.
      • ELB doesn't support Server Name Indication (SNI) on the load balancer. So if you want to host multiple websites on a fleet of EC2 instances behind an ELB with a single SSL certificate, you will need to add a Subject Alternative Name (SAN) for each website to the certificate to avoid site users seeing a warning message when the site is accessed.
  • Listeners
    • Every load balancer should have 1 or more listeners configured that look for connection requests.
    • Every listener is configured with a protocol and a port for front-end connection and a protocol and port for the back-end connection
    • ELB supports following protocols: HTTP, HTTPS, SSL, TCP
  • Idle Connection Timeout
    • Load balancer maintains 2 connections; 1 with the client and another with back-end instances.
    • For each connection, the load balancer manages an idle timeout that is triggered when no data is sent over the connection for a specified period of time. By default it's 60 seconds.
    • If an HTTP request doesn't complete within this time, the load balancer closes the connection, even if data is still being transferred.
    • Default timeout can be changed for lengthy operations, like file uploads.
    • For HTTP and HTTPS listeners, its recommended to enable keep-alive option for EC2 instances. This allows load balancer to reuse connections to back-end instances.
    • To ensure load balancer is responsible for closing connections to your back-end instances, keep-alive time should be greater than idle timeout setting on load balancer.
  • Cross zone load balancing
    • Enable cross-zone load balancing to ensure that traffic is evenly routed across all back-end instances.
    • Reduces the need to maintain equivalent numbers of instances in each AZ
    • However, it is still recommended to maintain approximately equal numbers of instances in each AZ for higher fault tolerance.
  • Connection Draining
    • Enable connection draining to ensure that load balancer stops sending requests to instances that are deregistering or unhealthy, while keeping existing connections open. This ensures that in-flight requests to these instances are completed.
    • The timeout value can be set between 1 and 3,600 seconds (default is 300 seconds) for the load balancer to keep connections alive before reporting instances as deregistered.
    • when an EC2 instance registered with ELB using connection draining is deregistered or unhealthy
      • keep connection open to complete in-flight requests
      • redirect requests to a user-defined error page like "Under construction"
  • Proxy Protocol
    • In case SSL or TCP is used for both front-end and back-end connections, ELB forwards the request to back-end instances without changing request headers.
    • If Proxy Protocol is enabled, a human-readable header is added to the request header and sent to the back-end instance as part of the request.
  • Sticky session
    • Ensures that all requests from a user during a session are sent to same instance.
    • 2 requirements to configure sticky sessions for Classic Load Balancer
      • HTTP/HTTPS load balancer
      • At least 1 healthy instance in each AZ
  • Health Checks
    • A health check is a connection attempt, a ping, or a page that is checked periodically.
    • ELB does health checks using either the default health check configuration or a health check configuration defined by you.
    • You can set the time interval, the amount of time to wait for a response, and a threshold for the number of consecutive health check failures before an instance is marked unhealthy.
    • Instances that are healthy at the time of the health check have the status InService.
    • Instances that are unhealthy at the time of the health check have the status OutOfService.
    • Target Group
      • Target groups are a logical grouping of targets (EC2 instances or ECS)
      • Targets are the endpoints and can be EC2 instances, ECS containers, or IP addresses
      • Target groups can exist independently from the ALB
      • Target groups can have up to 1000 targets
      • A single target can be in multiple target groups
      • Only one protocol and one port can be defined per target group
      • The target type in a target group can be an EC2 instance ID or IP address (must be a valid private IP from an existing subnet)
      • You cannot use public IP addresses as targets
      • You cannot use instance IDs and IP address targets within the same target group
      • A target group can only be associated with one load balancer
      • You can only use Auto Scaling with the load balancer if using instance IDs in your target group
      • You cannot mix different types within a target group (EC2, ECS, IP)
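      • A minimal boto3 sketch of creating a target group and registering an instance (all names and IDs are hypothetical):
            import boto3

            elbv2 = boto3.client('elbv2')
            # One protocol and one port per target group.
            tg = elbv2.create_target_group(
                Name='web-targets',
                Protocol='HTTP',
                Port=80,
                VpcId='vpc-0123456789abcdef0',
                TargetType='instance',  # or 'ip'; types cannot be mixed
                HealthCheckProtocol='HTTP',
                HealthCheckPath='/health',
            )
            elbv2.register_targets(
                TargetGroupArn=tg['TargetGroups'][0]['TargetGroupArn'],
                Targets=[{'Id': 'i-0123456789abcdef0'}],
            )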

CloudWatch

  • Used to monitor AWS resources and applications in real time.
  • Has a basic level of monitoring at no cost and a more detailed level of monitoring for an additional cost.
  • Collect and track metrics, create alarms to send notifications, and make changes to monitored resources based on rules you define.
  • Basic monitoring sends data points to CloudWatch every 5 minutes for a limited number of preselected metrics at no charge. Basic monitoring supports hypervisor-visible metrics like CPU utilization.
  • Detailed monitoring sends data points to CloudWatch every minute and allows data aggregation, for an additional charge.
  • Basic monitoring is enabled by default; detailed monitoring needs to be explicitly enabled.
  • CloudWatch doesn't aggregate data across regions, but does aggregate across AZs within a region.
  • Each AWS account has a limit of 5,000 alarms, and metrics data is retained for two weeks by default.
  • Available metrics
    • CPU utilization
    • Network utilization
    • Disk performance
    • Disk read/write
  • Custom metrics using Perl or other scripts, published via the API (see the sketch below)
    • memory utilization
    • disk swap utilization
    • disk space utilization
    • page file utilization
    • log collection
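  • A minimal boto3 sketch of publishing one custom data point (the namespace, metric name, and instance ID are hypothetical):
        import boto3

        # Publish a single custom metric value to CloudWatch.
        cloudwatch = boto3.client('cloudwatch')
        cloudwatch.put_metric_data(
            Namespace='Custom/System',
            MetricData=[{
                'MetricName': 'MemoryUtilization',
                'Dimensions': [{'Name': 'InstanceId',
                                'Value': 'i-0123456789abcdef0'}],
                'Value': 62.5,
                'Unit': 'Percent',
            }],
        )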
  • Where to store CloudWatch logs?
    • CloudWatch Logs
    • S3 or Glacier
    • Centralized logging system like Splunk
  • CloudWatch has a feature called CloudWatch Logs. CloudWatch Logs can receive logs from CloudTrail, filter them based on keywords/phrases, and take some action.
  • CloudWatch Logs Insights is an interactive, pay-as-you-go, and integrated log analytics capability for CloudWatch Logs. It helps developers, operators, and systems engineers understand, improve, and debug their applications, by allowing them to search and visualize their logs.
  • CloudWatch retains metric data as follows:
    • Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.
    • Data points with a period of 60 seconds (1 minute) are available for 15 days
    • Data points with a period of 300 seconds (5 minutes) are available for 63 days
    • Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months)
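  • A minimal boto3 sketch of an alarm that fires when average CPU stays above 80% for two consecutive 5-minute periods (names and the SNS topic ARN are hypothetical):
        import boto3

        cloudwatch = boto3.client('cloudwatch')
        cloudwatch.put_metric_alarm(
            AlarmName='high-cpu',
            Namespace='AWS/EC2',
            MetricName='CPUUtilization',
            Dimensions=[{'Name': 'InstanceId',
                         'Value': 'i-0123456789abcdef0'}],
            Statistic='Average',
            Period=300,              # 5-minute evaluation period
            EvaluationPeriods=2,     # two consecutive periods
            Threshold=80.0,
            ComparisonOperator='GreaterThanThreshold',
            AlarmActions=['arn:aws:sns:us-east-1:123456789012:ops-alerts'],
        )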

Auto Scaling

  • Service that allows you to scale EC2 capacity automatically by scaling out and scaling in.
  • Auto Scaling plans
    • Manual scaling: the most basic form; you specify the change in maximum, minimum, or desired capacity.
    • Scheduled scaling: scaling actions are performed automatically as a function of time and date.
    • Automatic scaling: scaling actions are performed automatically based on parameters defined in a scaling policy.
  • Components
    • Launch configuration: the template that Auto Scaling uses to create new instances.
    • Auto Scaling group: a collection of EC2 instances managed by the Auto Scaling service.
    • Scaling policy: a set of instructions that tells Auto Scaling whether to scale out (launch new EC2 instances) or scale in (terminate instances).
  • Launch configuration
    • Contains
      • configuration name
      • AMI
      • EC2 instance type
      • security group
      • key pair
    • Each Auto Scaling group can have only one launch configuration at a time.
    • Default limit is 100 launch configurations per region.
  • Auto scaling group
    • Contains
      • Minimum size
      • Maximum size
      • Desired capacity
      • Load balancer
  • Scaling Policy
    • a set of instructions that tells Auto Scaling when to scale out and scale in
    • more than one scaling policy can be associated with an Auto Scaling group
    • best practice is to scale out quickly and scale in slowly
  • EC2 Auto Scaling vs Auto Scaling
    • Use AWS Auto Scaling if you want more guidance on defining your application scaling plan, or if you want to scale multiple resources beyond EC2, such as Amazon DynamoDB tables and indexes, or Amazon ECS tasks. At this time, to use AWS Auto Scaling, you must create your applications via AWS CloudFormation or AWS Elastic Beanstalk.
    • Use Amazon EC2 Auto Scaling if you only need to scale Amazon EC2 Auto Scaling groups (ASGs), or just want to maintain the health of your EC2 fleet.
  • Fleet Management and dynamic scaling
    • Fleet management refers to the functionality that automatically replaces unhealthy instances and maintains your fleet at the desired capacity.
    • The dynamic scaling capabilities of Amazon EC2 Auto Scaling refer to the functionality that automatically increases or decreases capacity based on load or other metrics. For example, if your CPU spikes above 80% (and you have an alarm set up), Amazon EC2 Auto Scaling can add a new instance dynamically.
  • Target Tracking
    • Target tracking is a new type of scaling policy that you can use to set up dynamic scaling for your application in just a few simple steps. For example, you can configure target tracking to keep CPU utilization for your fleet of web servers at 50%. From there, Amazon EC2 Auto Scaling launches or terminates EC2 instances as required to keep the average CPU utilization at 50%.
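    • A minimal boto3 sketch of the 50% CPU example above (group and policy names are hypothetical):
          import boto3

          # Target tracking: Auto Scaling adds/removes instances to hold
          # the group's average CPU utilization near 50%.
          autoscaling = boto3.client('autoscaling')
          autoscaling.put_scaling_policy(
              AutoScalingGroupName='web-asg',
              PolicyName='keep-cpu-at-50',
              PolicyType='TargetTrackingScaling',
              TargetTrackingConfiguration={
                  'PredefinedMetricSpecification': {
                      'PredefinedMetricType': 'ASGAverageCPUUtilization',
                  },
                  'TargetValue': 50.0,
              },
          )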
  • Auto scaling can not span AWS Regions
  • EC2 Auto Scaling groups optimize for the case when all your instance types are the same. You can use the AttachInstances API to attach instances of different types to an Auto Scaling group, and you can also update your launch configuration so that any new instances in the group will be launched with a different instance type. However, this will not affect any of the existing instances.
  • Lifecycle Hooks let you take action before an instance goes into service or before it gets terminated. This can be especially useful if you are not baking your software environment into an Amazon Machine Image (AMI). For example, launch hooks can perform software configuration on an instance to ensure that it’s fully prepared to handle traffic before Amazon EC2 Auto Scaling proceeds to connect it to your load balancer. Terminate hooks can be useful for collecting important data from an instance before it goes away
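    • A minimal boto3 sketch of registering a launch hook (names are hypothetical):
          import boto3

          # Pause newly launched instances until configuration completes
          # (signalled via CompleteLifecycleAction) or the timeout expires.
          autoscaling = boto3.client('autoscaling')
          autoscaling.put_lifecycle_hook(
              LifecycleHookName='configure-before-traffic',
              AutoScalingGroupName='web-asg',
              LifecycleTransition='autoscaling:EC2_INSTANCE_LAUNCHING',
              HeartbeatTimeout=300,     # seconds to wait for completion
              DefaultResult='ABANDON',  # terminate if the hook times out
          )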
  • If you are using Elastic Load Balancing (ELB) with your group, you should select an ELB health check. If you’re not using ELB with your group, you should select the EC2 health check.
  • When an impaired instance fails a health check, Amazon EC2 Auto Scaling first automatically terminates it and replaces it with a new one.
  • Use termination policy to control which instances Amazon EC2 Auto Scaling terminates when scaling in.
  • You can define Instance Protection which stops Auto Scaling from scaling in and terminating the instances
  • It is recommended to create a scale-in event for each scale-out event created
  • Auto Scaling rebalances by launching new EC2 instances in the AZs that have fewer instances first, only then will it start terminating instances in AZs that had more instances
  • Auto Scaling may go over the maximum number of instances by 10% temporarily for the purposes of rebalancing
  • Unlike AZ rebalancing, termination of unhealthy instances happens first, then Auto Scaling attempts to launch new instances to replace terminated instances
  • The cooldown period is a configurable setting for your Auto Scaling group that helps to ensure that it doesn’t launch or terminate additional instances before the previous scaling activity takes effect
  • The warm-up period is the period of time in which a newly created EC2 instance launched by ASG using step scaling is not considered toward the ASG metrics

6. Identity and Access Management

  • Service that controls how people and programs are allowed to manipulate your AWS infrastructure.
  • What IAM is not
    • It's not an identity store/authorization system for your applications. The permissions assigned are permissions to manipulate AWS infrastructure, not permissions within your application.
    • For on-premise applications migrated to AWS, the existing user repository and authentication/authorization mechanisms should be used.
    • IAM is not operating system identity management.
  • AWS Directory Service provides directory service that can work on its own or integrate with on-premise Active Directory.
  • For mobile apps, Amazon Cognito can be used for identity management.
  • Principals
    • an IAM entity that is allowed to interact with AWS resources
    • can be permanent or temporary
    • 3 types of principals
      • Root User
      • IAM user
      • roles/temporary security tokens: a) Amazon EC2 roles, b) cross-account access, c) federation
    • Root user is created when an AWS account is created.
    • IAM users
      • are persistent identities set up through the IAM service to represent individual people or applications.
      • are persistent in that there is no expiration period.
    • Roles/Temporary security tokens
      • roles are used to grant specific privileges to actors for a set duration of time.
      • AWS provides a temporary security token to the actor, which the actor uses to access AWS resources.
      • the token lifetime ranges from 15 minutes to 36 hours.
      • enables a number of use cases: Amazon EC2 roles, cross-account access, federation
    • Amazon EC2 roles
      • used to grant permissions to application running on an EC2 instance.
    • Cross Account Access roles
      • IAM roles that grant access to AWS resources to IAM users in other AWS accounts. These accounts can be owned by your company or by third parties like customers and suppliers.
    • Federation IAM role
      • Applicable for cases where organizations already have an identity repository and would like to use it instead of duplicating the repository in AWS.
      • Similarly web applications can use identities existing in Facebook, Google.
      • AWS IAM can integrate with two different types of outside Identity Providers (IdPs) for federation: a) OpenID Connect (OIDC) to integrate with web identities such as Facebook and Google, b) SAML 2.0 to integrate with enterprise directories such as LDAP or Active Directory.
  • Authentication
    • 3 ways that IAM authenticates a principal: a) user name/password, b) access key, c) access key/session token
    • User Name/Password
      • Used by IAM Users
      • Inputs: user name, password
    • Access key
      • Used by Application code to access AWS Infrastructure using AWS API/SDK
      • Inputs: Access Key ID, Access Secret Key
    • Access key/Session Token
      • Used by IAM User or Application under an assumed role having temporary security token
      • Inputs: Access Key ID, Access Secret Key, Session token
  • Authorization
    • Done by defining specific privileges in policies and associating those policies with principals.
    • Policy is a JSON document that fully defines set of permissions to access AWS resources.
    • Policy contains following
      • Effect: Allow or Deny
      • Resource: Amazon Resource Name (ARN) of the resource
      • Action
      • Condition
    • Sample policy
          {
           "Version": "2012-10-17",
           "Statement": [
             {
               "Effect": "Allow",
               "Action": ["s3:ListBucket"],
               "Resource": ["arn:aws:s3:::<BUCKET-NAME>"],
               "Condition": {"IpAddress": {"aws:SourceIp": "192.168.0.1"}}
             },
             {
               "Effect": "Allow",
               "Action": [
                 "s3:PutObject",
                 "s3:GetObject"
               ],
               "Resource": ["arn:aws:s3:::<BUCKET-NAME>/*"],
               "Condition": {"IpAddress": {"aws:SourceIp": "192.168.0.1"}}
             }
           ]
          }
    
    • Associating policies with principals
      • User Policy: policies that exist only in the context of the IAM user to which they are attached.
      • Managed Policy: policies that exist independently of any IAM user; the same policy can be associated with many users or groups. There are a number of predefined managed policies found on the Policies tab of the IAM page.
    • Associating policies with Groups
      • Group Policy: policies that exist only in the context of a group.
      • Managed Policy: can be associated with many groups.
  • Rotating Keys
    • The security risk of any credential increases with the age of the credential.
    • Best practice is to rotate the access keys associated with IAM users (see the sketch below).
    • IAM allows two active access keys at a time, which makes rotation without downtime possible.
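    • A minimal boto3 sketch of a rotation, assuming a hypothetical user and old key ID:
          import boto3

          iam = boto3.client('iam')
          # Create the second key while the old one is still active.
          new_key = iam.create_access_key(UserName='deploy-bot')['AccessKey']
          # ...deploy new_key['AccessKeyId'] / new_key['SecretAccessKey'],
          # verify the application works, then retire the old key...
          iam.update_access_key(UserName='deploy-bot',
                                AccessKeyId='AKIAOLDKEYEXAMPLE',
                                Status='Inactive')
          iam.delete_access_key(UserName='deploy-bot',
                                AccessKeyId='AKIAOLDKEYEXAMPLE')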
  • IAM is universal (global) and does not apply to regions
  • IAM is eventually consistent
  • IAM replicates data across multiple data centres around the world
  • Power user access allows all permissions except the management of groups and users in IAM
  • Security Token Service
    • a web service that enables you to request temporary, limited-privilege credentials for IAM users or for users that you authenticate (federated users)
    • Temporary security credentials work almost identically to long-term access key credentials that IAM users can use, with the following differences:
      • Temporary security credentials are short-term
      • They can be configured to last anywhere from a few minutes to several hours
      • After the credentials expire, AWS no longer recognizes them or allows any kind of access to API requests made with them
      • Temporary security credentials are not stored with the user but are generated dynamically and provided to the user when requested
      • When (or even before) the temporary security credentials expire, the user can request new credentials, as long as the user requesting them still has permission to do so
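    • A minimal boto3 sketch of requesting temporary credentials via AssumeRole (the role ARN and session name are hypothetical):
          import boto3

          sts = boto3.client('sts')
          creds = sts.assume_role(
              RoleArn='arn:aws:iam::123456789012:role/ReadOnlyAudit',
              RoleSessionName='audit-session',
              DurationSeconds=3600,
          )['Credentials']

          # The temporary access key, secret key, and session token
          # authenticate subsequent API calls until they expire.
          s3 = boto3.client(
              's3',
              aws_access_key_id=creds['AccessKeyId'],
              aws_secret_access_key=creds['SecretAccessKey'],
              aws_session_token=creds['SessionToken'],
          )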

7. Databases and AWS

  • A relational database can be either
    • Online Transaction Processing (OLTP)
    • Online Analytic Processing (OLAP)
  • OLTP transactions occur frequently and are relatively simple; OLAP transactions occur less frequently and are much more complex.

Amazon Relational Database Service (RDS)

  • Exposes a database endpoint to which client software can connect and execute SQL.
  • RDS supports six popular RDBMS engines.
  • DB Instances
    • An isolated database environment deployed in your private network segments in the cloud.
    • RDS provides an API that can be used to create and manage one or more DB Instances.
    • Each instance runs and manages a commercial or open-source engine on your behalf.
    • A DB instance can contain multiple different databases.
    • Create a new DB Instance by calling the CreateDBInstance API, and change or resize it using ModifyDBInstance.
    • Compute and memory resources of a DB Instance are determined by its DB Instance Class.
    • The DB Instance Class can be changed, and RDS will migrate the data to a larger or smaller instance class.
    • Many features and common configuration settings are exposed and managed using
      • DB Parameter groups:
        • a container for engine configuration values that can be applied to one or more DB instances. You can change the DB parameter group for an existing instance, but a reboot is needed.
        • A default DB parameter group is created if you create a DB instance without specifying a customer-created DB parameter group. You cannot modify the parameter settings of a default DB parameter group; you must create your own DB parameter group to change parameter settings from their default value.
        • When you change a dynamic parameter and save the DB parameter group, the change is applied immediately regardless of the Apply Immediately setting. When you change a static parameter and save the DB parameter group, the parameter change will take effect after you manually reboot the DB instance.
      • DB Option groups: Some DB engines offer additional features that make it easier to manage data and databases, and to provide additional security for your database. Amazon RDS uses option groups to enable and configure these features
    • When you modify a DB instance, Amazon RDS will reboot the instance if both of the following are true:
      • You change the DB instance class.
      • You specify a custom parameter group.
  • AWS Data Migration Service
    • Gives a graphical interface that simplifies migration of both schema and data between databases.
    • Also helps convert databases from one database engine to another.
  • Database engines
    • RDS supports six database engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
    • Licensing: for commercial DB engines like Oracle and MS SQL Server, AWS provides two licensing options
      • License Included: License is held by AWS and included in RDS instance price
      • Bring Your Own License(BYOL): Customer provides required license.
    • Amazon Aurora
      • Delivers up to five times the performance of MySQL without requiring changes to most existing web applications.
      • When you create an Aurora instance, a DB cluster is created. A DB cluster has one or more instances and includes a cluster volume that manages the data for those instances.
      • Primary Instance: each DB cluster has one primary instance. This is the main instance, which supports both read and write workloads.
      • Replica Instance: a secondary instance that supports only read operations. A DB cluster can have up to 15 replica instances in addition to the primary instance.
      • Failover is automatically handled by Amazon Aurora so that your applications can resume database operations as quickly as possible without manual administrative intervention.
        • If you have an Amazon Aurora Replica in the same or a different Availability Zone, when failing over, Amazon Aurora flips the canonical name record (CNAME) for your DB Instance to point at the healthy replica, which in turn is promoted to become the new primary. Start-to-finish, failover typically completes within 30 seconds.
        • If you do not have an Amazon Aurora Replica (i.e. single instance), Aurora will first attempt to create a new DB Instance in the same Availability Zone as the original instance. If unable to do so, Aurora will attempt to create a new DB Instance in a different Availability Zone. From start to finish, failover typically completes in under 15 minutes.
  • Storage Options
    • Built using Amazon Elastic Block Store (EBS). Storage types:
      • Magnetic
      • General purpose SSD
      • Provisioned IOPS (SSD)
  • Backup and Recovery
    • Provides two mechanisms for backup
      • Automated
      • Manual
    • Recovery Point Objective (RPO): max period of data loss acceptable in case of failure
    • Recovery Time Objective (RTO): max period of downtime acceptable to recover from backup and resume processing.
    • RPO is generally measured in minutes while RTO is measured in hours or days (based on application criticality).
    • Automated backup
      • RDS continuously tracks changes and backs up the database.
      • Creates a storage volume snapshot of the DB instance.
      • Backs up the entire DB instance, not just individual databases.
      • By default, backups are retained for 1 day; this can be increased up to 35 days. This period is called the backup retention period.
      • When a DB instance is deleted, its automated backup snapshots are also deleted and cannot be recovered.
      • Backups occur daily during a configurable 30-minute maintenance window, called the backup window.
    • Manual backup
      • Done manually at any time.
      • While automated backups are deleted after the retention period is over, manual backups are kept until explicitly deleted.
      • Manual backups remain even if the DB instance is deleted.
    • Recovery
      • A DB snapshot cannot be restored to an existing DB Instance.
      • A new DB instance is created when you restore.
      • Only the default DB parameter and security groups are associated with the restored instance; you need to associate any custom DB parameter and security groups yourself.
  • Multi-AZ deployment
    • Synchronous replication helps to minimize RPO and fast failover to minimize RTO.
    • You place a secondary copy of the database in another AZ for disaster recovery purposes.
    • Multi-AZ deployment is available for all RDS database engines.
    • RDS will automatically fail over to the standby instance in case of failure; the DNS name remains the same, and RDS changes the CNAME to point to the standby.
    • Multi-AZ deployments are for disaster recovery, not for enhancing performance.
  • DB Subnet Groups
    • A DB subnet group is a collection of subnets (typically private) that you create in a VPC and that you then designate for your DB instances.
    • Each DB subnet group should have subnets in at least two Availability Zones in a given region.
    • Amazon RDS uses that DB subnet group and your preferred Availability Zone to select a subnet and an IP address within that subnet to associate with your DB instance.
    • If the primary DB instance of a Multi-AZ deployment fails, Amazon RDS can promote the corresponding standby and subsequently create a new standby using an IP address of the subnet in one of the other Availability Zones.
  • To improve performance using multiple DB instances, use read replicas or other DB caching technologies such as Amazon ElastiCache
  • Scaling up and out
  • As the number of transactions increases in an RDBMS, scaling up to a larger machine allows it to process more reads/writes.
  • Scaling out (horizontal scalability) is also possible, but often more difficult.
  • Vertical Scalability
    • Select a different DB Instance class.
    • AWS RDS automates the migration of data to the new DB Instance class with only a short disruption.
  • Horizontal Scalability with partitioning
    • Partitioning, or sharding, a large relational database into multiple instances or shards is a common technique.
    • Sharding, however, needs additional logic in the application layer to decide how to route database requests to the correct shard.
  • Horizontal Scalability with read replicas
    • Read replicas help offload read transactions from the primary database.
    • Useful for read-heavy database workloads.
    • Read replicas are supported in RDS for MySQL, PostgreSQL, MariaDB, and Aurora.
    • Updates to the primary (source) DB instance are asynchronously copied to read replicas.
    • One or more replicas can be created within a single AWS Region or across multiple AWS Regions (see the sketch below).
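    • A minimal boto3 sketch of creating a read replica (the instance identifiers are hypothetical):
          import boto3

          # Creates an asynchronously replicated read-only copy of the
          # source DB instance.
          rds = boto3.client('rds')
          rds.create_db_instance_read_replica(
              DBInstanceIdentifier='mydb-replica-1',
              SourceDBInstanceIdentifier='mydb',
          )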
  • Security
    • Deploy RDS DB Instances into a private subnet within an Amazon VPC that limits network access.
    • Before deploying DB Instance, create a DB Subnet Group that predefines which subnets are available for RDS deployments.
    • Limit inbound traffic using security groups.
    • Data in transit can be secured using SSL
    • Data at rest can be secured using encryption: AWS Key Management Service (KMS) or Transparent Data Encryption (TDE)

Amazon Redshift

  • Redshift is a relational database designed for OLAP scenarios.
  • Optimized for high-performance analysis and reporting of very large datasets.
  • Gives fast querying capabilities over structured data using standard SQL commands.
  • Based on PostgreSQL, so existing client applications should work with minimal changes.
  • Clusters and Nodes
  • The key component of Redshift is a cluster, which has a leader node and multiple compute nodes. Client applications interact directly with the leader node using a standard JDBC or ODBC connection; the leader node in turn coordinates query execution with the compute nodes. The disk storage of a compute node is divided into a number of slices; the number of slices per node varies from 2 to 16. All nodes participate in parallel query execution, working on data that is distributed as evenly as possible across the slices.
  • There are six node types; each has a different mix of CPU, memory, and storage.
  • The six node types are grouped into two categories
    • Dense compute: supports up to 326 TB using fast SSDs
    • Dense storage: supports up to 2 PB using magnetic disks
  • Each cluster contains one or more databases.
  • Table data is distributed across the compute nodes in the cluster based on a distribution strategy that you specify.
  • Redshift allows you to resize a cluster to add storage and compute capacity. When a resize operation is performed, Redshift creates a new cluster and migrates data from the old cluster to the new one. During a resize operation, the database becomes read-only until the operation is complete.
  • Table Design
    • Each cluster can support one or more databases, and each database can contain many tables.
    • One of the performance optimizations used by Redshift is data compression. When loading data, Redshift automatically samples the data and selects the best compression scheme for each column.
    • Distribution Strategy
      • The distribution style hints to Redshift how data should be partitioned to best meet the query pattern.
      • 3 distribution styles
        • Even: data distributed across slices in a uniform fashion.
        • Key: rows are distributed based on key value
        • All: Full copy of entire table is distributed to every node.
    • For bulk loading of data, use the COPY command. Data can be exported out of Redshift using the UNLOAD command.
  • Row store vs Column store
    • At a basic level, row stores are great for transaction processing. Column stores are great for highly analytical query models. Row stores have the ability to write data very quickly, whereas a column store is awesome at aggregating large volumes of data for a subset of columns.
    • In online transaction processing (OLTP) applications, most transactions involve frequently reading and writing all of the values for entire records, typically one record or a small number of records at a time. As a result, row-wise storage is optimal for OLTP databases.
    • While OLTP transactions typically involve most or all of the columns in a row for a small number of records, data warehouse queries commonly read only a few columns for a very large number of rows. You can save memory by retrieving blocks only for the columns you actually need for a query.
    • Amazon Redshift is a column-oriented, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Amazon Redshift achieves efficient storage and optimum query performance through a combination of massively parallel processing, columnar data storage, and very efficient, targeted data compression encoding schemes.

DynamoDB

  • Provides consistent performance levels by automatically distributing the data and traffic for a table over multiple partitions.
  • When you create a DynamoDB table, you are required to provision a certain amount of read and write capacity to handle your expected workloads.
  • As demand changes over time, you can adjust the read and write capacity after a table has been created, and DynamoDB will add or remove infrastructure and adjust internal partitions.
  • All table data is stored on high performance SSD disk drives.
  • Provides high availability and durability by replicating data across multiple AZs within a Region.
  • Table Design
    • The basic components of the data model are tables, items, and attributes.
      • An item is similar to a record, and an attribute is similar to a column, in an RDBMS.
    • You don't need to define all attribute names and types in advance. Individual items can have any number of attributes.
    • Each attribute in an item is a name/value pair. An attribute can be single-valued or a multi-valued set. A multi-valued attribute is a set; duplicate values are not allowed.
    • Application can connect to DynamoDB service endpoint and submit read/write request in JSON format.
  • Data Type
    • 3 major categories of data types: Scalar, Set, and Document
    • Scalar: represents exactly one value
      • String
      • Number
      • Boolean
      • Binary
      • Null
    • Set Data type
      • String set
      • Number set
      • Binary set
    • Document Data type
      • List: an ordered list of attributes of different data types
      • Map: an unordered list of key/value pairs
    • Primary key
      • Partition key:
        • Made of 1 attribute
        • builds an unordered hash index on primary key attribute
        • Must be String, Number or Binary
      • Partition and Sort key:
        • Made of 2 attributes; Partition key and Sort(or Range) key.
        • Each item in the table is uniquely identified by the combination of partition key and sort key. Two items can have the same partition key, but their sort keys must be different.
    • Secondary Indexes: you can optionally define one or more secondary indexes if the table was created with partition and sort keys. These allow querying the data using alternate keys in addition to the primary key. 2 types of secondary indexes
      • Global Secondary Index:
        • Index with Partition and Sort key different from that of table.
        • There can be many global secondary indexes.
        • You can create or delete global secondary index on a table at any time.
      • A global secondary index lets you query over the entire table, across all partitions.
      • Local Secondary Index:
        • Index with Partition key same as Primary key of table, but different Sort key.
        • Can only be created when the table is being created.
        • Up to five local secondary indexes are possible per table.
      • A local secondary index lets you query over a single partition, as specified by the hash key value in the query
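    • A minimal boto3 sketch of creating a table with a partition key and sort key (the Orders table and its attributes are hypothetical):
          import boto3

          dynamodb = boto3.client('dynamodb')
          dynamodb.create_table(
              TableName='Orders',
              AttributeDefinitions=[
                  {'AttributeName': 'CustomerId', 'AttributeType': 'S'},
                  {'AttributeName': 'OrderDate', 'AttributeType': 'S'},
              ],
              KeySchema=[
                  {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},   # partition key
                  {'AttributeName': 'OrderDate', 'KeyType': 'RANGE'},   # sort key
              ],
              ProvisionedThroughput={'ReadCapacityUnits': 5,
                                     'WriteCapacityUnits': 5},
          )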
  • Eventual Consistency
    • Eventually Consistent Read: Result may not reflect result of a recently completed write operation
    • Strongly Consistent Read: Returns most up-to-date data that includes all recent write operations. May be less available in case of network delay or outage.
  • Searching Items
    • Query
      • Each Query requires a partition key attribute name and a distinct value to search. You can optionally provide a sort key value and use a comparison operator to refine the search results
      • Results are automatically sorted by the sort key and are limited to 1 MB.
      • Performance is better than a Scan.
    • Scan
      • Reads each item in a table or a secondary index.
      • Each result can return up to 1 MB of data. If the result set for a Query or a Scan exceeds 1 MB, you can page through the results in 1 MB increments.
      • A Scan operation results in a full scan of the entire table or secondary index, then filters out values to provide the desired result (see the sketch below).
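    • A minimal boto3 sketch contrasting the two, against the hypothetical Orders table above:
          import boto3
          from boto3.dynamodb.conditions import Attr, Key

          table = boto3.resource('dynamodb').Table('Orders')

          # Query: the partition key is required; the sort key condition
          # narrows the result set.
          resp = table.query(
              KeyConditionExpression=Key('CustomerId').eq('c-42') &
                                     Key('OrderDate').begins_with('2021-'),
          )

          # Scan: reads every item, then filters; far more expensive
          # on large tables.
          resp = table.scan(FilterExpression=Attr('Status').eq('SHIPPED'))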
  • DynamoDB Streams
    • A requirement of some applications is to keep track of recent changes and then perform some kind of processing on the changed records. DynamoDB Streams provides a list of item modifications over the last 24-hour period.
  • Scaling and Partitioning
    • An Amazon DynamoDB table can scale horizontally through the use of partitions to meet the storage and performance requirements of your application.
    • When a table is created, Amazon DynamoDB configures the table’s partitions based on the desired read and write capacity
    • Amazon DynamoDB stores items for a single table across multiple partitions. Amazon DynamoDB decides which partition to store the item in based on the partition key. The partition key is used to distribute the new item among all of the available partitions, and items with the same partition key will be stored on the same partition.
    • As storage or capacity requirements change, Amazon DynamoDB can split a partition to accommodate more data or higher provisioned request rates. After a partition is split, however, it cannot be merged back together.
    • To maximize Amazon DynamoDB throughput, create tables with a partition key that has a large number of distinct values and ensure that the values are requested fairly uniformly. Adding a random element that can be calculated or hashed is one common technique to improve partition distribution.
  • Item level access control
    • Using Fine Grained Access Control (FGAC)
    • Using Per-Client Embedded Token

8. SQS, SWF, SNS

Simple Queue Service(SQS)

  • A fast, reliable, scalable and fully managed message queueing service.
  • Although most times each message will be delivered to your application exactly once, you should design your system to be idempotent.
  • Standard Queue
    • Default queue type; supports a nearly unlimited number of transactions per second (TPS) per action.
    • Supports at-least-once message delivery; however, more than one copy of a message might be delivered, and messages can arrive out of order.
    • Best-effort ordering ensures that messages are generally delivered in the same order as they are sent.
  • FIFO Queue
    • Order of messages is maintained
    • No duplicate delivery of messages
    • But supports only a limited number of transactions per second
  • Dead letter queue
    • useful for debugging your application or messaging system because they let you isolate problematic messages to determine why their processing doesn't succeed
    • You must first create a normal standard or FIFO queue before designating it a dead-letter queue
    • The dead-letter queue of a FIFO queue must also be a FIFO queue. Similarly, the dead-letter queue of a standard queue must also be a standard queue
  • Standard queues don't guarantee First In, First Out (FIFO) delivery of messages; the order can vary.
  • If your system needs message ordering, add sequence information to each message.
  • Delay Queues
    • If you create a delay queue, any message sent to that queue will not be visible to any consumer for the duration of the delay period.
    • Existing queues can be turned into delay queues.
    • Default value for DelaySeconds is 0. Range is 0 to 900 seconds (15 minutes).
  • Visibility Timeouts
    • Hides a message after it is retrieved from the queue. If a consumer retrieves a message, that message will not be visible to other consumers for the duration of the visibility timeout.
    • When a message is in the queue, but is neither delayed nor in a visibility timeout, it is considered to be "in flight".
    • Supports up to 12 hours of visibility timeout.
  • Message lifecycle
    • Always remember that messages in an SQS queue continue to exist even after an EC2 instance has processed them, until you delete the message.
  • Message Unique ID: a globally unique ID that SQS returns when a message is delivered to the queue.
  • Message receipt handle: when a message is retrieved from the queue, the response includes a receipt handle, which must be provided when deleting the message.
  • Queue URL:
    • when creating a new queue, you must provide a queue name that is unique within the scope of all your queues.
    • SQS assigns each queue a Queue URL, which includes the queue name and other components that SQS determines.
    • whenever you want to perform an action on a queue, you must provide its queue URL.
  • Message Attributes:
    • Are optional and separate from, but sent along with, the message body. Each message can have up to 10 attributes.
    • The receiver of a message can use these attributes to decide how to handle the message without processing the message body.
  • Long Polling:
    • The ReceiveMessage call checks for the existence of a message in the queue and returns immediately, with or without a message.
    • If an SQS client repeatedly checks for new messages, constant calls to ReceiveMessage burn CPU cycles. This is where long polling helps.
    • With long polling, you send a WaitTimeSeconds argument to ReceiveMessage of up to 20 seconds.
    • If no message is in the queue, the call waits up to WaitTimeSeconds.
    • If a message appears before WaitTimeSeconds elapses, the call returns along with the message (see the sketch below).
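    • A minimal boto3 sketch of a long-polling consumer (the queue URL is hypothetical):
          import boto3

          sqs = boto3.client('sqs')
          queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/work-queue'

          # Wait up to 20 seconds for messages instead of busy-looping.
          resp = sqs.receive_message(
              QueueUrl=queue_url,
              MaxNumberOfMessages=10,
              WaitTimeSeconds=20,
          )
          for msg in resp.get('Messages', []):
              # ...process the message, then delete it via its receipt handle...
              sqs.delete_message(QueueUrl=queue_url,
                                 ReceiptHandle=msg['ReceiptHandle'])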
  • SQS Access Control: assigns policies to queues that grant specific interactions to other accounts without those accounts having to assume IAM roles from your account.
  • SQS doesn't return success to a SendMessage API call until the message is durably stored in Amazon SQS.
  • Statistics
    • Longest time available for SQS long polling timeout: 20 sec
    • SQS message retention period: 4 days (default); 14 days (max)
    • SQS visibility timeout time: 30 sec (default); 12 hrs (max)

Simple Workflow Service(SWF)

  • Makes it easy to coordinate work (inter-task dependencies, scheduling, concurrency) across distributed components.
  • Implement workers to perform tasks. These workers can run on cloud infrastructure or on-premise.
  • Workflows coordinate and manage the execution of activities that can be run asynchronously across multiple computing devices and that can feature both sequential and parallel processing.
  • Workflow domain
    • Provides a way to scope SWF resources within your AWS account.
    • It is possible to have more than one workflow in a domain.
    • Workflows in different domains cannot interact with each other.
  • Actors
    • 3 types of actors: i) workflow starters, ii) deciders, iii) activity workers
    • Workflow starter: initiates workflow execution.
    • Decider: coordinates (sequential, parallel, sync, async) tasks in a workflow.
    • Activity worker: a single computer process that performs activity tasks in your workflow.
  • Tasks
    • SWF provides activity workers and deciders with tasks.
    • 3 types of tasks: i) activity tasks, ii) Lambda tasks, iii) decision tasks
    • Activity task: tells an activity worker to perform its function, like checking inventory or checking a credit card.
    • Lambda task: similar to an activity task, but executes as a Lambda function.
    • Decision task: tells a decider that the state of the workflow execution has changed, so that the decider can determine the next activity that needs to be performed.
  • Task Lists
    • A way to organize the various tasks associated with a workflow; think of task lists as similar to dynamic queues.
    • When a task is scheduled in SWF, you can specify a queue (task list) to put it in; similarly, when you poll SWF for a task, you determine which queue (task list) to get the task from.
  • Long Polling
    • Deciders and activity workers communicate with SWF using long polling: they notify SWF of their availability to accept a task and then specify a task list to get tasks from. Long polling works well for high-volume task processing (see the sketch below).
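    • A minimal boto3 sketch of an activity worker's polling loop body (the domain, task list, and identity are hypothetical):
          import boto3

          swf = boto3.client('swf')
          # The call blocks (long polls) until a task is available or
          # the polling window closes.
          task = swf.poll_for_activity_task(
              domain='orders',
              taskList={'name': 'fulfillment-tasks'},
              identity='worker-1',
          )
          if task.get('taskToken'):
              # ...perform the activity, then report the result...
              swf.respond_activity_task_completed(
                  taskToken=task['taskToken'],
                  result='done',
              )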
  • Object Identifier
    • Workflow type is identified by its domain, name and version
    • Activity type is identified by its domain, name and version
    • A decision task or activity task is identified by a unique task token.
    • A single execution of a workflow is identified by domain, workflow ID and run ID.
  • Workflow Execution Closure
    • After a workflow starts, it is open. An open workflow execution can be closed as completed, canceled, failed, or timed out.

Simple Notification Service(SNS)

  • Enables you to set up, operate, and send notifications.
  • Follows the pub-sub messaging paradigm.
  • Consists of 2 types of clients: i) publishers, ii) subscribers.
  • An SNS topic is a logical access channel that contains a list of subscribers and the methods used to communicate with them. When a message is sent to a topic, it is automatically forwarded to each subscriber.
  • An SNS message cannot be deleted once it has been published to a topic.
  • An SNS topic name becomes available for reuse 30-60 seconds after the previous topic with the same name has been deleted.
  • Publishers
    • event sources such as AWS services: Compute, Storage, Database, etc.
  • The following formats or transports are available for subscribers
    • HTTP/HTTPS
    • Email, email-json
    • SQS
    • SMS
  • Message filtering
    • empowers a subscriber to create a filter policy, so that it only gets the notifications it is interested in, as opposed to receiving every single message posted to the topic (see the sketch below).
    • Amazon SNS provides message filtering operators for numeric matching, prefix matching, and blacklisting.
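    • A minimal boto3 sketch of subscribing a queue with a filter policy (the topic/queue ARNs and the "event_type" attribute are hypothetical):
          import boto3

          sns = boto3.client('sns')
          # The subscriber only receives messages whose "event_type"
          # message attribute matches the filter policy.
          sns.subscribe(
              TopicArn='arn:aws:sns:us-east-1:123456789012:orders',
              Protocol='sqs',
              Endpoint='arn:aws:sqs:us-east-1:123456789012:shipped-orders',
              Attributes={'FilterPolicy': '{"event_type": ["order_shipped"]}'},
          )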

9. Amazon Route 53

  • DNS uses port number 53 to serve requests, hence the name.

Domain Name System(DNS)

  • DNS is a globally distributed service that translates human-readable names into the IP addresses that computers use to connect to each other.
  • Uses a hierarchical name structure; different levels in the hierarchy are separated with a dot (.).
  • Route 53 is an authoritative DNS system. An authoritative DNS system provides an update mechanism that developers use to manage their public DNS names. It then answers DNS queries, translating domain names into IP addresses so that computers can communicate with each other.
  • Top-Level Domains(TLDs)
    • Most general part of the domain, farthest right portion.
    • Common TLDs are .com,.net,.org,.gov,.edu
    • Each domain name gets registered in a central database known as the WHOIS database.
  • Domain Name
    • Human friendly name that are associated with an internet resource
  • Hosts
    • Within a domain, the domain owner can define individual hosts, which refer to separate computers or services accessible through the domain.
    • For example, most often domain owners make their web servers accessible through base domain (example.com) and also through host definition www (like www.example.com)
  • Subdomains
    • A large domain in DNS can be partitioned or extended into multiple subdomains.
    • TLDs can have many subdomains under them.
    • For example, example.com is a subdomain under .com
    • Difference between host name and subdomain is that a host defines a computer or resource, while subdomain extends the parent domain.
  • Fully Qualified Domain Name (FQDN)
    • In FQDN api.aws.amazon.com.
      • host: api
      • TLD: com
      • SLD: amazon
      • sub domain: aws.amazon.com
      • root: .
  • Name Servers
    • Computer designated to translate domain names into IP addresses.
    • As the number of domain translations is too large for any one server, each server may redirect requests to other name servers or delegate responsibility for the subset of subdomains for which it is responsible.
  • Zone files
    • Simple text file that contains mapping between domain names and IP addresses.
    • Resides in name servers.
    • Describes a DNS zone, which is a subset of the entire DNS.
  • TLD level Domain Name Registrars
    • A domain name registrar is an organization or commercial entity that manages the reservation of Internet domain names.
    • Registrars ensure that domain names are not duplicated.
    • A domain name registrar must be accredited by a generic TLD (gTLD) registry and/or a country code TLD (ccTLD) registry.
  • Steps involved in Domain Name System(DNS) resolution
    • When a domain name is typed into a browser, the computer first checks its hosts file to see if it has that domain name stored locally.
    • If not, it checks its DNS cache to see if the site was visited before.
    • If not, it contacts a DNS server to resolve the domain name.
    • DNS is a hierarchical system, with root servers at the top. A root server won't know where the domain is hosted; it will direct the request to the name server that handles the TLD. For example, if the request is for www.example.com, the root server checks its zone files for a listing that matches the domain name but won't find one. It will instead find a record for the .com TLD and direct the request to the name server responsible for .com addresses.
    • After the root server returns the IP address of the TLD-level name server responsible for the TLD of the request, the requester sends a new request for www.example.com to that address.
    • Again, the name server searches its zone files for an entry for www.example.com. It doesn't find one; instead, it finds the name server responsible for example.com and sends that IP address to the requester.
    • Next, the requester sends a new request to that name server. This time the name server finds an entry with the IP address of the host (www) and returns the final address to the requester.
  • Resolving Name Servers
    • The last section mentions a requester. The requester is actually a resolving name server: a server configured to ask other servers questions.
    • A user usually has a few resolving name servers configured on their computer. Resolving name servers are typically provided by the Internet Service Provider (ISP).
  • Record Types
    • Start of Authority (SOA) record
      • mandatory in all zone files and it identifies the base DNS information about the domain.
      • each zone file contains a single SOA record
    • A: maps a host to an IPv4 address
    • AAAA: maps a host to an IPv6 address
    • Canonical Name (CNAME): a resource record in DNS that defines an alias for a host; a name that points to another name.
    • Mail Exchange (MX): defines the mail server used for the domain
    • Name Server (NS): used by TLD servers to direct traffic to the DNS server
    • Pointer (PTR): the reverse of an A record; maps an IP address to a DNS name.
    • Sender Policy Framework (SPF): used by mail servers to combat spam; denotes which IP addresses are authorized to send email from your domain name.
    • Text (TXT): holds text information.
    • Service (SRV): defines the location (host and port) of servers for specified services.

Amazon Route 53

  • Highly available and scalable DNS web service
  • Performs 3 functions
    • Domain registration
    • DNS service
    • Health checking
  • Whenever someone enters your domain name in a browser or sends you an email, a DNS request is forwarded to the nearest AWS Route 53 DNS server, and Route 53 responds with the IP address.
  • When you register a new domain name with AWS Route 53, it is automatically configured as the DNS service for the domain, and a hosted zone is created for the domain. You add resource records to the hosted zone, which define how you want Route 53 to respond to DNS queries for your domain.
  • If you already have a domain name with another domain registrar, you can transfer DNS service to Amazon Route 53 with or without transferring registration of the domain.
  • If you are using CloudFront, S3, or ELB, you can configure records in the hosted zone so that Route 53 routes traffic to those resources.
  • Port 53 is used by DNS to serve requests.
  • User Datagram Protocol(UDP) is primarily used by DNS to serve requests.
  • Transmission Control Protocol(TCP) is used by DNS when response data size exceeds 512 bytes.
  • Hosted Zone
    • Collection of resource record sets hosted by Amazon Route 53.
    • Types
      • Private: holds information about how you want to route traffic for a domain and its subdomains within one or more VPCs. It only responds to queries coming from within the associated VPCs and is not used for hosting a website that needs to be publicly accessed.
      • Public: holds information about how you want to route traffic on the internet for a domain and its subdomains.
  • Routing Policy
    • determines how Amazon Route 53 responds to queries.
    • Types
      • Simple: the default policy, used when you have a single resource that performs a given function for your domain.
      • Weighted: use this when there are multiple resources and you want Route 53 to route traffic to those resources in proportions that you specify. For example, you can use it for load balancing between different Regions or to test new versions of your website (10% to the test environment, 90% to prod); see the sketch after this list.
      • Latency-based: routes traffic based on the lowest network latency. Use this when you have resources that perform the same function in multiple AZs or Regions and you want traffic routed to the resource that provides the best latency.
      • Failover:
        • use this to configure active-passive failover, where one resource takes all traffic when it is available and the other resource takes all traffic when the first resource is not available.
        • Failover resource records are not possible for private hosted zones.
        • Used for DR.
      • Geolocation: traffic is directed based on the geographic location of the user.
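    • A minimal boto3 sketch of the weighted 90/10 example above (the zone ID, names, and IPs are hypothetical):
          import boto3

          route53 = boto3.client('route53')
          route53.change_resource_record_sets(
              HostedZoneId='Z1EXAMPLE',
              ChangeBatch={'Changes': [
                  # 90% of traffic to prod, 10% to the test version.
                  {'Action': 'UPSERT', 'ResourceRecordSet': {
                      'Name': 'www.example.com', 'Type': 'A',
                      'SetIdentifier': 'prod', 'Weight': 90, 'TTL': 60,
                      'ResourceRecords': [{'Value': '203.0.113.10'}]}},
                  {'Action': 'UPSERT', 'ResourceRecordSet': {
                      'Name': 'www.example.com', 'Type': 'A',
                      'SetIdentifier': 'test', 'Weight': 10, 'TTL': 60,
                      'ResourceRecords': [{'Value': '203.0.113.20'}]}},
              ]},
          )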
  • Health Checks
    • Amazon Route 53 health checks monitor the health of resources like web servers and email servers. CloudWatch can be configured so that a notification is sent when resources become unavailable.
    • If an application is deployed in multiple AZs and multiple Regions, with Route 53 health checks attached to every endpoint, Route 53 sends back a list of healthy endpoints only.
    • Route 53 health checks are not triggered by DNS queries, they are run periodically by AWS and results published to all DNS servers.
  • Resilience using Amazon Route 53
    • In every AWS region, an Elastic Load Balancing load balancer is set up with cross-zone load balancing and connection draining. This distributes the load evenly across all instances in all Availability Zones, and it ensures requests in flight are fully served before an Amazon EC2 instance is disconnected from an Elastic Load Balancing load balancer for any reason.
    • Each Elastic Load Balancing load balancer delegates requests to Amazon EC2 instances running in multiple Availability Zones in an auto-scaling group. This protects the application from Availability Zone outages, ensures that a minimal amount of instances is always running, and responds to changes in load by properly scaling each group’s Amazon EC2 instances.
    • Each Elastic Load Balancing load balancer has health checks defined to ensure that it delegates requests only to healthy instances.
    • Each Elastic Load Balancing load balancer also has an Amazon Route 53 health check associated with it to ensure that requests are routed only to load balancers that have healthy Amazon EC2 instances.
    • The application’s production environment (for example, prod.domain.com) has Amazon Route 53 alias records that point to Elastic Load Balancing load balancers. The production environment also uses a latency-based routing policy that is associated with Elastic Load Balancing health checks. This ensures that requests are routed to a healthy load balancer, thereby providing minimal latency to a client.
    • The application’s failover environment (for example, fail.domain.com) has an Amazon Route 53 alias record that points to an Amazon CloudFront distribution of an Amazon S3 bucket hosting a static version of the application.
    • The application’s subdomain (for example, www.domain.com) has an Amazon Route 53 alias record that points to prod.domain.com (as primary target) and fail.domain .com (as secondary target) using a failover routing policy. This ensures www.domain.com routes to the production load balancers if at least one of them is healthy or the “fail whale” if all of them appear to be unhealthy.
    • The application’s hosted zone (for example, domain.com) has an Amazon Route 53 alias record that redirects requests to www.domain.com using an Amazon S3 bucket of the same name.
    • Application content (both static and dynamic) can be served using Amazon CloudFront. This ensures that the content is delivered to clients from Amazon CloudFront edge locations spread all over the world to provide minimal latency. Serving dynamic content from a Content Delivery Network (CDN), where it is cached for short periods of time (that is, several seconds), takes the load off of the application and further improves its latency and responsiveness.
    • The application is deployed in multiple AWS regions, protecting it from a regional outage.

10.Amazon ElastiCache

  • Memcached is a simple in-memory key-value store that can be used to store arbitrary types of data.
  • Redis is an in-memory data structure store that can be used as a cache, a database, or even a message broker.
  • AWS ElastiCache supports both Memcached and Redis.
  • With AWS ElastiCache you can start using the service today with very few or no modifications to your existing applications that use Memcached or Redis; you only need to change the endpoints in your configuration files (see the sketch at the end of this section).
  • With the Redis engine, ElastiCache makes it easy to set up read replicas and fail over from the primary to a replica in the event of a problem.
  • Memcached
    • With ElastiCache, you can elastically grow and shrink a cluster of Memcached nodes to meet your demands.
    • You can partition your cluster into shards and support parallelized operations for very high performance throughput.
    • Memcached deals with objects as blobs that can be retrieved using a unique key. What you put into the object is up to you.
  • Redis
    • Beyond the object support in Memcached, Redis supports a rich set of data types like strings, lists, and sets.
    • Unlike Memcached, Redis supports the ability to persist in-memory data onto disk.
    • Redis clusters also support up to five read replicas to offload read requests. In case of failure of the primary node, a read replica can be promoted to become the new master using a Multi-AZ replication group.
  • Nodes and clusters
    • Each deployment of ElastiCache consists of one or more nodes in a cluster.
    • A single Memcached cluster can contain up to 20 nodes.
    • For Memcached clusters partitioned across multiple nodes, ElastiCache supports Auto Discovery with the provided client library.
    • Redis clusters are always made up of a single node; however, multiple clusters can be grouped into a Redis replication group.
  • Replication and Multi-AZ
    • Cache clusters running Redis support replication groups.
    • A replication group consists of up to six clusters, with five of them designated as read replicas.
    • Replication between clusters is performed asynchronously, and there is a small delay before data is available on all cluster nodes.
    • Memcached clusters, however, are standalone in-memory services without any redundant data protection.
  • Backup and recovery
    • ElastiCache clusters running Redis allow you to persist data from memory to disk and create a snapshot.
    • Best practice is to set up a replication group and perform snapshots against one of the read replicas instead of the primary node.
    • Memcached is purely in-memory and doesn't have native backup capabilities.
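  • A minimal sketch of the endpoint-only change mentioned above, using the redis-py client (the cluster endpoint is hypothetical):
        import redis  # pip install redis

        # Only the host changes versus self-managed Redis.
        r = redis.Redis(
            host='my-cluster.abc123.0001.use1.cache.amazonaws.com',
            port=6379,
        )
        r.setex('session:42', 300, 'cached-payload')  # 300-second TTL
        print(r.get('session:42'))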

11.Other Key Services

  • Amazon CloudFront
    • Global Content Delivery Network(CDN) service.
    • CDN is a globally distributed network of caching servers that speed up downloading of web pages and other content.
    • CDNs use Domain Name System (DNS) geolocation to determine the geographic location of each content request, and then serve it from the edge caching server closest to that location instead of the origin web server.
    • CloudFront supports all content that can be served over HTTP or HTTPS.
    • CloudFront supports media streaming, using both HTTP and RTMP (Real Time Messaging Protocol).
    • Dynamic content is also cached along with static content
    • Doesn't cache responses to PUT, POST, DELETE, or PATCH requests; only GET responses are cached.
    • In the case of geo restrictions, blacklisted countries see a 403 error.
    • 3 core concepts
      • distributions
      • origins
      • cache control
    • Distributions
      • You have to create a distribution to use CloudFront.
      • A distribution is identified by a DNS domain name such as d1111dbc.cloudfront.net.
      • To serve files from CloudFront, simply use the distribution domain name in place of your website's domain name; the rest of the file path stays unchanged. If you add a CNAME for www.abc.com to your distribution, you also need to create a CNAME record with your DNS service to route queries from www.abc.com to d11111abcdef8.cloudfront.net.
    • Origins
      • when creating a distribution, you need to specify the DNS domain name of the origin (an S3 bucket or HTTP server) from which CloudFront will get the definitive version of objects (web files)
    • Cache Control
      • Helps control how long objects stay in the CloudFront cache before expiring (by default, the cache expires after 24 hours).
      • To do that, either use Cache-Control headers set by your origin server, or set the max, min, and default Time-To-Live (TTL) for objects in the CloudFront distribution.
      • You can also remove objects from CloudFront edge locations by calling the invalidation API. This removes objects regardless of the expiration period.
      • The invalidation feature should be used in unexpected circumstances, such as to correct an error or to make an unanticipated update to a website.
      • Instead of using invalidation, versioning is the best practice. With versioning, users always see the latest content from CloudFront, and old versions expire automatically.
    • Cache Behaviors
      • A CloudFront distribution can be used to serve dynamic content in addition to static content, and to use more than one origin server.
      • You can control which requests are served by which origin, and how requests are cached, using a feature called cache behaviors.
      • For example, one cache behavior applies to all PHP files using the path pattern *.php, while another behavior applies to all JPEG images using the path pattern *.jpg.
      • Cache behaviors are applied in order, if a request does not match the first pattern it drops down to the next path pattern.
    • Private Content
      • CloudFront provides several mechanisms that allow you to serve private content:
        • Signed URLs: URLs that are valid only between certain times and, optionally, only from certain IP addresses.
        • Signed Cookies: Require authentication using public and private key pairs.
        • Origin Access Identities (OAI): Restrict access to an S3 bucket to a special CloudFront user associated with your distribution.
    • CloudFront use cases
      • Distributing software or other large files
      • Serving streaming media
    • When not to use CloudFront
      • When all or most requests come from a single location
      • When all or most requests come through a corporate VPN
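As referenced above, a minimal boto3 sketch of invalidating a stale object at the edge (the distribution ID and path are hypothetical):

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Force a stale object out of every edge cache before its TTL expires.
# CallerReference must be unique per request; a timestamp works.
cloudfront.create_invalidation(
    DistributionId="E1234567890ABC",
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/logo.png"]},
        "CallerReference": str(time.time()),
    },
)
```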
  • AWS Storage Gateway
    • A service connecting an on-premises software appliance with cloud-based storage to provide seamless and secure integration between the on-premises IT environment and the AWS storage infrastructure.
    • The AWS Storage Gateway software appliance is available for download as a Virtual Machine (VM) image that can be installed on a host in the on-premises data center and then registered with an AWS account through the AWS Management Console. The storage associated with the appliance is exposed as an iSCSI device that can be mounted by your on-premises applications.
    • 3 possible configurations for AWS Storage Gateway
      • Gateway-cached volume
      • Gateway-stored volume
      • Gateway virtual tape libraries (VTL)
    • Gateway-cached volume
      • All data stored in a gateway-cached volume is moved to Amazon S3, while recently read data is retained in local storage to provide low-latency access.
      • Data stored in S3 cannot be accessed directly; it has to be accessed through the AWS Storage Gateway.
      • While each volume is limited to a max size of 32 TB, a single gateway can support up to 32 volumes for a max storage of 1 PB.
    • Gateway-stored volume
      • Stores all data on on-premises storage and asynchronously backs that data up to Amazon S3.
      • Provides low-latency access to all data, while also providing off-site backups through S3.
      • Data is backed up in the form of EBS snapshots.
      • Each volume is limited to 16 TB, and a single gateway can support up to 32 volumes for a max storage of 512 TB.
    • Gateway virtual tape libraries (VTL)
      • The VTL interface lets you leverage your existing tape-based backup application infrastructure to store data on virtual tape cartridges that you create on your gateway-VTL.
      • Virtual tapes are analogous to physical tape cartridges, except the data is stored in the AWS cloud.
  • AWS Directory Service
    • AWS Directory Service is a managed service offering that provides directories containing information about your organization, including users, groups, computers, and other resources.
    • 3 directory types
      • AWS Directory Service for Microsoft AD
      • Simple AD
      • AD Connector
    • AWS Directory Service for Microsoft AD
      • A managed Microsoft Active Directory hosted on the AWS cloud.
      • Provides the functionality of Microsoft AD plus integration with AWS applications.
      • Easy to set up trust relationships with your existing AD domains and extend those directories to AWS cloud services.
      • Use this for more than 5,000 users, or when you need a trust relationship between the AWS-hosted directory and your on-premises directories.
    • Simple AD
      • Powered by Samba 4, compatible with Microsoft AD.
      • You cannot set up trust relationships between Simple AD and other AD domains.
      • Least expensive option; use it for fewer than 5,000 users.
    • AD Connector
      • A proxy service for connecting an on-premises Microsoft AD to the AWS cloud without the cost and complexity of hosting federation infrastructure.
      • Best choice when you want to use an existing on-premises directory with AWS cloud services.
  • AWS Key Management Service(KMS) and CloudHSM
    • Services that help manage your own symmetric or asymmetric cryptographic keys
    • AWS Key Management Service(KMS)
      • makes it easy to create and control encryption keys used to encrypt your data
      • Enables you to maintain control over who can use your keys and gain access to your encrypted data.
      • Customer Managed Keys
        • KMS uses a type of key called a Customer Managed Key (CMK) to encrypt and decrypt data.
        • Can be used inside KMS to encrypt and decrypt up to 4 KB of data.
        • Can also be used to encrypt generated data keys, which are then used to encrypt or decrypt large amounts of data outside of KMS (see the data-key sketch after this list).
      • Data Keys
        • Used to encrypt large data objects within your own application outside AWS KMS.
      • Envelope Encryption
        • KMS uses envelope encryption to protect data: a data key encrypts the data, and the CMK encrypts the data key itself.
      • Encryption Context
        • All KMS cryptographic operations accept an optional key/value map of additional contextual information called an encryption context.
    • AWS CloudHSM
      • Helps meet corporate, contractual and regulatory compliance requirements for data security by using dedicated HSM appliances within the AWS cloud.
      • An HSM is a hardware appliance that provides secure key storage and cryptographic operations within a tamper-resistant hardware module.
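The data-key sketch referenced above: a minimal boto3 walk-through of envelope encryption (the key alias and encryption context are hypothetical):

```python
import boto3

kms = boto3.client("kms")
context = {"app": "billing"}  # optional encryption context, checked again at decrypt time

# 1. Generate a data key under a CMK: KMS returns the key in plaintext
#    plus an encrypted copy protected by the CMK.
resp = kms.generate_data_key(
    KeyId="alias/my-app-key", KeySpec="AES_256", EncryptionContext=context
)
plaintext_key = resp["Plaintext"]       # encrypt your data locally with this, then discard it
encrypted_key = resp["CiphertextBlob"]  # store this alongside the encrypted data

# 2. Later, ask KMS to decrypt the stored copy to recover the data key.
resp = kms.decrypt(CiphertextBlob=encrypted_key, EncryptionContext=context)
assert resp["Plaintext"] == plaintext_key
```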
  • AWS CloudTrail
    • Provides visibility into user activity by recording API calls made on your account.
    • Records important information about each API call, including the name of the API, the identity of the caller, the time of the call, the request parameters, and the response elements returned by the AWS service.
    • Helps to ensure compliance with internal policies and regulatory standards.
    • CloudTrail typically delivers log files within 15 minutes of an API call.
    • A trail is a configuration that enables logging of AWS API activity and related events in your account. It can be created through the CloudTrail console, the AWS CLI, or the CloudTrail API.
    • 2 types of trails
      • A trail that applies to all regions
        • CloudTrail creates the trail in each region, records log files in each region, and delivers the log files to a single S3 bucket.
        • This is the default option when you create a trail using the CloudTrail console.
      • A trail that applies to 1 region
        • A bucket receives events only from that single region.
        • The bucket can be in any region that you specify.
    • Instead of configuring a trail for only 1 region, best practice is to enable trails for all regions (sketch below).
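A boto3 sketch of that best practice, creating an all-regions trail (trail and bucket names are hypothetical, and the bucket needs a policy allowing CloudTrail to write to it):

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Create a trail that applies to all regions, then start logging.
cloudtrail.create_trail(
    Name="org-wide-trail",
    S3BucketName="my-cloudtrail-logs",
    IsMultiRegionTrail=True,
)
cloudtrail.start_logging(Name="org-wide-trail")
```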
  • Amazon Kinesis
    • Platform to handle massive streaming data on AWS and to build custom streaming data applications.
    • Kinesis streams store data for 24 hours by default; this can be increased to 7 days.
    • Producers > Data records > Shards > Consumers
    • Shards
      • A uniquely identified group of data records in a stream
      • Each shard has a fixed unit of capacity
    • Partition key
      • Used to group the data in a stream into shards
        • The partition key associated with each data record determines which shard a given data record belongs to (see the producer sketch at the end of this section)
      • Partition keys are specified by the applications that put data into the stream
    • Sequence Number
      • Each data record has a unique sequence number
        • Assigned by Streams after you write to the stream
    • Data Blob
      • The actual data a producer adds to a stream.
      • Max size of a blob is 1 MB
    • 3 services
      • Amazon Kinesis Firehose
      • Amazon Kinesis Streams
      • Amazon Kinesis Analytics
    • Amazon Kinesis Firehose
      • Receives streaming data and stores it in S3, Redshift, or Elasticsearch.
      • No need to write code; just create a delivery stream and configure the destination for your data.
      • Clients write data to the stream using the AWS API, and the data is automatically sent to the proper destination.
      • In the case of S3, data is sent directly to S3.
      • In the case of Redshift, data is first written to S3 and then loaded into Redshift from there.
      • In the case of Elasticsearch, data is concurrently backed up to S3.
    • Amazon Kinesis Streams
      • You can create Kinesis Streams applications that process data as it moves through the stream.
      • Can scale to support a nearly limitless data stream by distributing incoming data across a number of shards. If any shard becomes too busy, it can be split into more shards to distribute the load further.
      • While Kinesis is ideally suited for ingesting and processing streams of data, it is less appropriate for batch jobs such as Extract, Transform, Load (ETL) processes. For those types of workloads, consider AWS Data Pipeline.
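The producer sketch referenced above: putting a record on a stream with boto3 (the stream name and payload are hypothetical):

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# The partition key is hashed to pick the shard, so records that share
# a key always land on the same shard (and stay ordered within it).
kinesis.put_record(
    StreamName="clickstream",
    Data=json.dumps({"user": "u-123", "event": "page_view"}).encode(),
    PartitionKey="u-123",
)
```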
  • Amazon Elastic MapReduce
    • When you launch an EMR cluster, you can specify several options:
      • Instance type of the nodes in the cluster
      • Number of nodes in the cluster
      • Version of Hadoop to run
      • Additional tools like Hive, Pig, Spark
    • There are 2 types of storage that can be used with EMR
      • Hadoop Distributed File System(HDFS)
        • Can use EC2 instance storage or EBS for HDFS
      • EMR File System(EMRFS)
        • An implementation of HDFS that allows clusters to store data on Amazon S3.
    • The key factor in deciding which storage to use is whether the cluster is persistent or transient.
      • A persistent cluster continues to run 24x7 after it is launched; HDFS storage is best suited for it.
      • Transient clusters are started when needed and stopped immediately when done; EMRFS is well suited for them.
  • AWS Data Pipeline
    • A service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.
    • A pipeline schedules and runs tasks according to the pipeline definition.
    • Tasks may need additional resources to run, such as an EMR cluster or an EC2 instance. In that case, AWS Data Pipeline automatically launches the required resources and tears them down when the task is complete.
    • AWS Data Pipeline is best suited for regular batch processes rather than continuous data streams, for which Kinesis is the better fit.
  • AWS Import/Export
    • Used to move huge datasets into the cloud, or to retrieve them back on-premises when needed.
    • 2 features
      • AWS Import/Export Snowball
      • AWS Import/Export Disk
    • AWS Import/Export Snowball
      • Uses Amazon-provided shippable storage appliances shipped through UPS.
      • Encryption is enforced.
      • You don't have to buy or maintain your own hardware devices.
      • You can import or export terabytes or even petabytes of data.
    • AWS Import/Export Disk
      • Transfer data directly onto and off of storage devices you own.
      • You can import to S3, Glacier, and EBS.
      • You can export from S3 only.
      • Has an upper limit of 16 TB.
  • AWS OpsWorks
    • A configuration management service that helps you configure and operate applications using Chef.
    • A stack is a group of resources, such as EC2 instances and RDS instances, that together form a solution in AWS. These resources need to be created and managed collectively.
    • OpsWorks provide a simple and flexible way to create and manage stacks and applications.
    • Define the elements of a stack by adding one or more layers. A layer represents a set of resources that serve a particular purpose, such as load balancing, web applications, or hosting a database server. Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying applications, and running scripts.
    • One of the key OpsWorks features is a set of lifecycle events that automatically run a specified set of recipes at the appropriate time on each instance.
    • Applications and related files are stored in a repository, such as an S3 bucket or a Git repo. Each application is represented by an app, which specifies the application type and contains the information needed to deploy the application from the repository to your instances. When you deploy an app, AWS OpsWorks triggers a Deploy event, which runs the Deploy recipes on the stack's instances.
    • OpsWorks sends all resource metrics to CloudWatch, making it easy to view graphs and set alarms that help you troubleshoot and take automated action based on the state of your resources.
    • OpsWorks provides many metrics such as CPU idle, memory total, average load for 1 minute and more.
  • AWS CloudFormation
    • Allows organizations to deploy, modify, and update resources in a controlled and predictable way, in effect applying version control to AWS infrastructure the same way as to software.
    • In CloudFormation, you work with templates and stacks.
    • A template is a text file (complying with the JSON standard) that defines the blueprint for building AWS resources. Using templates, you can set up resources consistently and repeatedly.
    • In templates, you don't have to worry about provisioning order or dependencies.
    • Related resources are managed as a single unit called a stack. All resources in a stack are defined in the CloudFormation template.
    • If stack creation fails, CloudFormation rolls back the changes by deleting the resources that it created.
    • You will be charged for resources provisioned even if there is an error.
    • Parameters can be used to customize aspects of a template at runtime, when the stack is built; for example, the RDS database size or EC2 instance type.
    • To update a stack, create a change set by submitting a modified version of the original stack template, different input parameter values, or both. CloudFormation compares the modified template with the original and generates a change set listing the proposed changes. After reviewing them, you execute the change set (see the sketch after this list).
    • If you want to delete a stack but still retain some resources in it, you can use a deletion policy to retain those resources. If a resource has no deletion policy, CloudFormation deletes it by default.
    • CloudFormation itself has no additional cost, but you are charged for the underlying resources it builds.
    • Intrinsic functions
      • Fn::GetAtt
      • Fn::Select
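A boto3 sketch of the change-set workflow described above (stack name, change-set name, and parameter are hypothetical):

```python
import boto3

cfn = boto3.client("cloudformation")

# Propose a change against a running stack without applying it yet.
cfn.create_change_set(
    StackName="web-tier",
    ChangeSetName="resize-instances",
    UsePreviousTemplate=True,
    Parameters=[{"ParameterKey": "InstanceType", "ParameterValue": "m5.large"}],
)
cfn.get_waiter("change_set_create_complete").wait(
    StackName="web-tier", ChangeSetName="resize-instances"
)

# Review the proposed changes, then apply them.
changes = cfn.describe_change_set(StackName="web-tier", ChangeSetName="resize-instances")
print(changes["Changes"])
cfn.execute_change_set(StackName="web-tier", ChangeSetName="resize-instances")
```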
  • AWS Elastic Beanstalk
    • The fastest and simplest way to get an application up and running on AWS.
    • You simply upload application code, and the service automatically handles all the details, including resource provisioning, load balancing, monitoring, and auto scaling.
    • An Elastic Beanstalk application is the logical collection of Elastic Beanstalk components, which includes environments, versions, and environment configurations. It is conceptually similar to a folder.
    • An application version refers to a specific, labeled iteration of deployable code of a web application. It points to an S3 object that contains the deployable code.
    • An environment is an application version that is deployed onto AWS resources. Each environment runs only a single application version at a time; however, the same version or different versions can run in as many environments at the same time as needed.
    • An environment configuration identifies a collection of parameters and settings that define how an environment and its associated resources behave.
    • Elastic Beanstalk is fault tolerant within a single region, but not between regions.
    • Elastic Beanstalk deployments are publicly accessible.
  • AWS Trusted Advisor
    • Inspects your AWS environment and gives recommendations to save money, improve availability and performance, or close security gaps.
    • Provides best-practice checks in the following categories:
      • Cost optimization
      • Security
      • Fault tolerance
      • Performance Improvement
    • All AWS customers can access 4 Trusted Advisor checks at no cost:
      • Service Limits
      • Security Groups
      • IAM Use
      • MFA on Root Account
    • Customers with Business or Enterprise AWS support plan can view all AWS Trusted Advisor checks.
  • AWS Config
    • A service that provides resource inventory, configuration history, and configuration change notifications to enable security and governance.
    • You can discover existing and deleted AWS resources, check compliance against rules, and see the configuration details of a resource at any point in time.
    • It first discovers the supported AWS resources that exist in your account and generates a configuration item for each resource. A configuration item represents a point-in-time view of the various attributes of a supported AWS resource that exists in your account.
    • The configuration recorder stores the configuration of supported resources in your account as configuration items. You need to start the configuration recorder to begin recording configuration items.
    • An AWS Config rule represents a desired configuration setting for an AWS resource or for an entire AWS account. If a resource violates a rule, AWS Config flags the resource and sends a notification through SNS.

12. Security on AWS

  • Core applications are deployed in an N+1 configuration, so that in the event of a data center failure, there is sufficient capacity to enable traffic to be load-balanced to the remaining sites.
  • Man-in-the-Middle attack:
    • All AWS APIs are available via SSL-protected endpoints that provide server authentication.
    • EC2 AMIs automatically generate SSH host certificates on first boot and log them to the instance console.
  • IP Spoofing: The AWS-controlled, host-based firewall infrastructure doesn't permit an instance to send traffic with a source IP or Media Access Control (MAC) address other than its own.
  • Port Scanning: Unauthorized port scans by Amazon EC2 customers are a violation of the AWS Acceptable Use Policy.
  • Even 2 virtual instances owned by the same customer and located on the same physical host cannot listen to each other's traffic.
  • It is not possible for a virtual instance running in promiscuous mode to receive or sniff traffic that is intended for a different virtual instance.
  • If your credentials have been lost or forgotten, you cannot recover them or re-download them. However, you can create new credentials and then disable or delete the old set. In fact, AWS recommends that you change (rotate) your access keys and certificates on a regular basis. To help you do this without potential impact to your application's availability, AWS supports multiple concurrent access keys and certificates.
  • AWS MFA supports both hardware and virtual MFA devices; hardware devices are more secure. A 6-digit single-use code has to be provided along with the user name/password to gain access.
  • MFA uses the Time-based One-Time Password (TOTP) protocol.
  • Access Keys
    • Access keys are created by IAM as a pair: the Access Key ID (AKI) and the Secret Access Key (SAK).
    • All API requests must be signed with the SAK; that is, they must include a digital signature that AWS can use to verify the identity of the requestor (sketch below).
    • Not only does the signing process help protect message integrity by preventing tampering with the request while it is in transit, it also helps protect against potential replay attacks: a request must reach AWS within 15 minutes of the timestamp in the request.
    • Signature Version 4 is the latest version of the digital signature.
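A sketch of Signature Version 4 request signing using botocore's built-in signer (the endpoint shown is the standard EC2 API endpoint; normally the SDK signs requests for you):

```python
import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

credentials = boto3.Session().get_credentials()

# Build a raw request and sign it with Signature Version 4.
request = AWSRequest(
    method="GET",
    url="https://ec2.us-east-1.amazonaws.com/?Action=DescribeRegions&Version=2016-11-15",
)
SigV4Auth(credentials, "ec2", "us-east-1").add_auth(request)

# The Authorization header now carries the signature; AWS rejects the
# request if it arrives more than 15 minutes after the signed timestamp.
print(request.headers["Authorization"])
```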
  • IAM Role
    • IAM roles provide temporary credentials, which not only get automatically loaded to the target instance, but are also automatically rotated multiple times a day.
    • Amazon EC2 uses an Instance Profile as a container for an IAM role. When you create an IAM role using the AWS Management Console, the console creates an instance profile automatically and gives it the same name as the role to which it corresponds. If you use the AWS CLI, API, or an AWS SDK to create a role, you create the role and instance profile as separate actions, and you might give them different names. To launch an instance with an IAM role, you specify the name of its instance profile. When you launch an instance using the Amazon EC2 console, you can select a role to associate with the instance; however, the list that’s displayed is actually a list of instance profile names.
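A sketch of how those temporary role credentials surface on an instance, read from the instance metadata service (shown IMDSv1-style for brevity; SDKs like boto3 perform this lookup automatically):

```python
import requests

# The instance metadata service is only reachable from within the
# EC2 instance itself.
base = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
role_name = requests.get(base).text            # name of the attached role
creds = requests.get(base + role_name).json()  # temporary, auto-rotated credentials
print(creds["AccessKeyId"], creds["Expiration"])
```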
  • Key pairs
    • On a Linux instance, access is granted by showing possession of the SSH private key.
    • On a Windows instance, access is granted by using the SSH private key to decrypt the administrator password.
    • The public key is embedded in the EC2 instance; you use the private key to sign in securely without a password.
    • For CloudFront, key pairs are used for pre-signed URLs. CloudFront key pairs can be created only by the root account, not by IAM users.
  • X.509 certificates are used to sign SOAP-based requests.
  • CloudTrail
    • Records the following details for each API call:
      • name of API
      • identity of caller
      • time of API call
      • request parameters
      • response elements
    • CloudTrail supports log file integrity validation, which means you can prove to a 3rd party that log files have not been changed.
    • This feature is built using SHA-256 for hashing and SHA-256 with RSA for digital signing.
  • EC2 security
    • Security within Amazon EC2 is provided on multiple levels: Host OS, Guest OS, firewall, and signed API calls.
    • Different instances running on the same physical machine are isolated from each other via the Xen hypervisor.
    • AWS firewall resides within the hypervisor layer, between the physical network interface and the instance’s virtual interface.
    • AWS proprietary disk virtualization layer automatically resets every block of storage used by the customer, so that one customer’s data is never unintentionally exposed to another customer. In addition, memory allocated to guests is scrubbed (set to zero) by the hypervisor when it is unallocated to a guest. The memory is not returned to the pool of free memory available for new allocations until the memory scrubbing is completed
  • EBS Security
    • Amazon EBS volume access is restricted to the AWS account that created the volume, and to users under that AWS account created with AWS IAM.
    • AWS provides the ability to encrypt Amazon EBS volumes and their snapshots with Advanced Encryption Standard (AES)-256.
    • EBS replication is stored within the same Availability Zone, not across multiple zones, so the best practice is to take regular snapshots.
    • AWS does not automatically perform backups of data maintained on virtual disks attached to running EC2 instances.
    • Sharing Amazon EBS volume snapshots does not provide other AWS accounts with the permission to alter or delete the original snapshot, as that right is explicitly reserved for the AWS account that created the volume.
  • ELB security
    • Supports end-to-end traffic encryption using TLS (previously SSL) on those networks that use secure HTTP (HTTPS) connections. When TLS is used, the TLS server certificate used to terminate client connections can be managed centrally on the load balancer, instead of on every individual instance.
    • If you want to use the SSL protocol but don't want to terminate the connection on the load balancer, you can use the TCP protocol for the connection from the client to your load balancer, use the SSL protocol for the connection from the load balancer to your back-end application, and install certificates on all the back-end instances.
    • To help ensure the use of newer and stronger cipher suites when establishing a secure connection, you can configure the load balancer to have the final say in the cipher suite selection during the client-server negotiation. When the Server Order Preference option is selected, the load balancer will select a cipher suite based on the server’s prioritization of cipher suites instead of the client’s.
    • ELB allow use of Perfect Forward Secrecy, which uses session keys that are ephemeral and not stored anywhere. This prevents the decoding of captured data, even if the secret long-term key itself is compromised.
    • ELB access logs contain information about each HTTP and TCP request processed by your load balancer. This includes the IP address and port of the requesting client, the back-end IP address of the instance that processed the request, the size of the request and response, and the actual request line from the client. All requests sent to the load balancer are logged, including requests that never make it to back-end instances.
  • VPC security
    • You must create security groups specifically for your Amazon VPC; any Amazon EC2 security groups you have created will not work inside your Amazon VPC.
    • VPC security groups have additional capabilities that Amazon EC2 security groups do not have, such as being able to change the security group after the instance is launched and being able to specify any protocol with a standard protocol number.
  • CloudFront security
    • If you want control over who can download content from Amazon CloudFront, you can enable the service’s private content feature. This feature has two components. The first controls how content is delivered from the Amazon CloudFront edge location to viewers on the Internet. The second controls how the Amazon CloudFront edge locations access objects in Amazon S3. Amazon CloudFront also supports geo restriction, which restricts access to your content based on the geographic location of your viewers.
    • To control access to the original copies of your objects in Amazon S3, Amazon CloudFront allows you to create one or more Origin Access Identities and associate these with your distributions. When an Origin Access Identity is associated with an Amazon CloudFront distribution, the distribution will use that identity to retrieve objects from Amazon S3. You can then use Amazon S3’s ACL feature, which limits access to that Origin Access Identity so the original copy of the object is not publicly readable.
    • To control who can download objects from Amazon CloudFront edge locations, the service uses a signed-URL verification system.
    • Amazon CloudFront provides the option to transfer content over an encrypted connection (HTTPS). By default, Amazon CloudFront will accept requests over both HTTP and HTTPS protocols. However, you can also configure Amazon CloudFront to require HTTPS for all requests or have Amazon CloudFront redirect HTTP requests to HTTPS. You can even configure Amazon CloudFront distributions to allow HTTP for some objects but require HTTPS for other objects.
  • S3 Security
    • With IAM policies, you can only grant users within your own AWS account permission to access your Amazon S3 resources.
    • With bucket policies, you can grant users within your AWS account or other AWS accounts access to your Amazon S3 resources
    • AWS customers who use Amazon S3 to host static web pages or store objects used by other web pages can load content securely by configuring an Amazon S3 bucket to explicitly enable Cross-Origin Resource Sharing (CORS). With a CORS policy enabled, assets such as web fonts and images stored in an Amazon S3 bucket can be safely referenced by external web pages, style sheets, and HTML5 applications (see the CORS sketch after this list).
    • Amazon S3 provides multiple options for protecting data at rest. For customers who prefer to manage their own encryption, they can use a client encryption library like the Amazon S3 Encryption Client to encrypt data before uploading to Amazon S3. Alternatively, you can use Amazon S3 Server Side Encryption (SSE) if you prefer to have Amazon S3 manage the encryption process for you.
    • Note that metadata, which you can include with your object, is not encrypted. AWS recommends that customers not place sensitive information in Amazon S3 metadata.
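The CORS sketch referenced above, attaching a policy to a bucket with boto3 (the bucket name and origin are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Let pages served from one origin fetch fonts/images from this bucket.
s3.put_bucket_cors(
    Bucket="my-assets-bucket",
    CORSConfiguration={
        "CORSRules": [{
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }]
    },
)
```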
  • DynamoDB Security
    • All data items are stored on Solid State Drives (SSDs) and automatically replicated across multiple Availability Zones in a region to provide built-in high availability and data durability.
    • You can also control access at the database level: you can create database-level permissions that allow or deny access to items (rows) and attributes (columns) based on the needs of your application. These database-level permissions are called fine-grained access controls, and you create them using an IAM policy that specifies under what circumstances a user or application can access an Amazon DynamoDB table. The IAM policy can restrict access to individual items in a table, to the attributes in those items, or both.
  • RDS Security
    • When you first create a DB Instance within Amazon RDS, you will create a master user account, which is used only within the context of Amazon RDS to control access to your DB Instance(s).
    • The master user account is a native database user account that allows you to log on to your DB Instance with all database privileges.
    • Amazon RDS supports Transparent Data Encryption (TDE) for SQL Server (SQL Server Enterprise Edition) and Oracle (part of the Oracle Advanced Security option available in Oracle Enterprise Edition).

13. AWS Risk and Compliance

  • Customers don't communicate their environment and configuration to AWS, but AWS does communicate with customers regarding its security and control environment, using 3 mechanisms:
    • AWS obtains industry certifications and independent 3rd-party attestations.
    • AWS openly publishes information about its security and control practices in whitepapers and website content.
    • AWS provides certificates, reports, and other documentation directly to customers under NDA.
  • AWS is responsible for "security of the cloud", customer is responsible for "security in the cloud".
  • The shared responsibility model is not limited to security considerations; it also extends to IT controls. AWS manages these controls where they relate to the physical infrastructure, and the customer manages them for the guest operating system and upward.
  • It is still the customer's responsibility to maintain adequate governance over the entire IT control environment, regardless of how their IT is deployed (on-premises, cloud, or hybrid).
  • AWS publishes a Service Organization Controls (SOC 1) Type II report, a widely recognized auditing standard.
  • ISO 27001 certification: security standard
  • Payment Card Industry(PCI) Data Security Standard(DSS): controls important to companies that handle credit card information.
  • Federal Information Security Management Act(FISMA): controls required by US government agencies
  • 3 core areas of risk and compliance program
    • risk management
    • control environment
    • information security
  • Risk management
    • The AWS security team regularly scans all public-facing endpoint IP addresses for vulnerabilities. These scans don't include customer instances.
    • Customers can request permission to conduct their own vulnerability scans on their own environments.
  • Control environment
    • Includes people, process and technology necessary to establish and maintain an environment that supports operating effectiveness of AWS control framework.

14. Architecture Best Practices

  • Best practices
    • Design for failure and nothing will fail.
    • Implement elasticity
    • Leverage different storage options
    • Build security in every layer
    • Think parallel
    • Loose coupling sets you free
    • Don't fear constraints
  • Redundancy can be implemented in 2 modes
    • Passive or standby:
      • In this mode, when a resource fails, functionality is recovered on a secondary resource using a process called failover.
      • Failover typically requires some time, during which the resource remains unavailable.
      • The secondary resource can either be launched automatically only when needed (to reduce cost), or it can already be running idle.
      • Standby redundancy is usually used for stateful components such as relational databases.
    • Active mode
      • Requests are distributed to multiple redundant compute resources, and when one fails the rest simply absorb a larger share of the workload.
      • Compared to standby mode, it can achieve better utilization and affects a smaller population on failure.
  • Implement Elasticity
    • Vertical scaling takes place through an increase in the specification of an individual resource.
    • A stateless application can scale horizontally.
    • Consider storing a unique session identifier in an HTTP cookie and storing more detailed user session information server-side. Amazon DynamoDB is a great choice for that.
    • By definition, databases are stateful.
  • Build security in every layer
    • AWS provides a wealth of features that help you build defense in depth.
    • AWS Web Application Firewall (AWS WAF) helps protect your web applications from SQL injection and other vulnerabilities in application code.
    • Use IAM roles to grant permissions to applications running on EC2 instances through the use of temporary security tokens. These credentials are automatically distributed and rotated.
    • For mobile applications, Amazon Cognito allows client devices to get controlled access to AWS resources via temporary tokens.
    • You can capture security features (like firewall rules, NACLs, and subnets) in a script and define a "Golden Environment". CloudFormation scripts can capture this and be reused.
  • Loose coupling
    • Amazon API Gateway provides a way to expose well-defined interfaces.

15. Miscellaneous

  • A Solutions Architect is designing an online shopping application running in a VPC on EC2 instances behind an ELB Application Load Balancer. The instances run in an Auto Scaling group across multiple Availability Zones. The application tier must read and write data to a customer-managed database cluster. There should be no access to the database from the Internet, but the cluster must be able to obtain software patches from the Internet. Which VPC design meets these requirements?
    • Public subnets for the application tier and NAT Gateway, and private subnets for the database cluster
  • In Amazon Aurora, Read Replicas are separate database instances that are asynchronously updated when the source DB instance changes.
  • Distributing load to multiple nodes
    • Push Model: Use a load balancing solution like ELB to route incoming requests across multiple EC2 instances
    • Pull Model: Tasks that need to be performed or data that needs to be processed could be stored as messages in SQS or in a streaming data solution like Kinesis.
  • For HTTP/S traffic, session affinity can be achieved through the "sticky sessions" feature of ELB.
  • Configuration drift: Changes and software patches applied over time can result in untested and heterogeneous configurations across different environments. Use the immutable infrastructure pattern: a server, once launched, is never updated throughout its lifecycle. Instead, when an update is needed, the server is replaced with a new one that has the latest configuration.
  • Approaches to achieve automation: a) Bootstrapping b) Golden Images c) Hybrid
  • Bootstrapping: Use EC2 user data scripts or AWS OpsWorks lifecycle events to automatically set up EC2 instances. You can use simple scripts or configuration management tools like Chef or Puppet.
  • Golden Image: Some AWS resources, like EC2 instances, RDS DB instances, and EBS volumes, can be launched from a golden image: a snapshot of a particular state of that resource. Compared to bootstrapping, a golden image results in faster start times and removes dependencies on configuration services or 3rd-party repositories.
  • Items that do not change often, or that introduce external dependencies, will typically be part of the golden image, while items that change often or differ between environments can be set up dynamically through bootstrapping actions.
  • Amazon EC2 Auto Recovery: You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers it if it becomes impaired. A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. However, this feature is only available for applicable instance configurations
  • How to implement service discovery
    • For an Amazon EC2 hosted service a simple way to achieve service discovery is through the Elastic Load Balancing service.
    • Another option would be to use a service registration and discovery method to allow retrieval of the endpoint IP addresses and port number of any given service. Use tools like Netflix Eureka, Consul.
  • If your application primarily indexes and queries data with no need for joins or complex transactions (especially if you expect a write throughput beyond the constraints of a single instance) consider a NoSQL database instead.
  • If your schema cannot be denormalized and your application requires joins or complex transactions, consider a relational database instead.
  • On AWS, for search capabilities you have the choice between Amazon CloudSearch and Amazon Elasticsearch Service (Amazon ES). On the one hand, Amazon CloudSearch is a managed service that requires little configuration and will scale automatically. On the other hand, Amazon ES offers an open source API and gives you more control over the configuration details. Amazon ES has also evolved to become a lot more than just a search solution. It is often used as an analytics engine for use cases such as log analytics, real-time application monitoring, and click stream analytics
  • Redis engine for Amazon ElastiCache supports replication with automatic failover, but the Redis engine’s replication is asynchronous. During a failover, it is highly likely that some recent transactions would be lost. However, Amazon RDS, with its Multi AZ feature, is designed to provide synchronous replication to keep data on the standby node up-to-date with the primary.
  • Quorum-based replication combines synchronous and asynchronous replication to overcome the challenges of large-scale distributed database systems. Replication to multiple nodes can be managed by defining a minimum number of nodes that must participate in a successful write operation
  • Shuffle Sharding: One fault-isolating improvement you can make to traditional horizontal scaling is called sharding. Similar to the technique traditionally used with data storage systems, instead of spreading traffic from all customers across every node, you can group the instances into shards. For example, if you have eight instances for your service, you might create four shards of two instances each (two instances for some redundancy within each shard) and distribute each customer to a specific shard. In this way, you are able to reduce the impact on customers in direct proportion to the number of shards you have.
  • Optimize for Cost
    • Right sizing
      • AWS offers a broad range of resource types and configurations to suit a plethora of use cases. For example, services like Amazon EC2, Amazon RDS, Amazon Redshift, and Amazon Elasticsearch Service (Amazon ES) give you a lot of choice of instance types. In some cases, you should select the cheapest type that suits your workload’s requirements. In other cases, using fewer instances of a larger instance type might result in lower total cost or better performance.
      • you can reduce cost by selecting the right storage solution for your needs
  • Elasticity
    • Plan to implement Auto Scaling for as many Amazon EC2 workloads as possible, so that you horizontally scale up when needed and scale down and automatically reduce your spend when you don’t need all that capacity anymore.
    • consider which compute workloads you could implement on AWS Lambda so that you never pay for idle or redundant resources
    • Replace Amazon EC2 workloads with AWS managed services that either don't require you to make any capacity decisions (e.g., ELB, Amazon CloudFront, Amazon SQS, Amazon Kinesis Firehose, AWS Lambda, Amazon SES, Amazon CloudSearch) or enable you to easily modify capacity as and when needed (e.g., Amazon DynamoDB, Amazon RDS, Amazon Elasticsearch Service).
  • Take advantage of purchase options
    • Reserved Capacity
      • This is ideal for applications with predictable minimum capacity requirements. You can take advantage of tools like the AWS Trusted Advisor or Amazon EC2 usage reports to identify the compute resources that you use most of the time that you should consider reserving.
      • Reserved capacity options exist for other services as well (e.g., Amazon Redshift, Amazon RDS, Amazon DynamoDB, and Amazon CloudFront).
    • Spot Instances
      • Spot Instances are ideal for workloads that have flexible start and end times. Your Spot Instance is launched when your bid exceeds the current Spot market price, and will continue to run until you choose to terminate it or until the Spot market price exceeds your bid.
      • You are charged the Spot market price (not your bid price) for as long as the Spot Instance runs.
    • Spot Blocks for Defined-Duration Workloads: You can also bid for fixed duration Spot Instances. These have different hourly pricing but allow you to specify a duration requirement. If your bid is accepted your instance will continue to run until you choose to terminate it, or until the specified duration has ended; your instance will not be terminated due to changes in the Spot price
  • Caching
    • Application Data Caching: Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.
    • Edge Caching: Copies of static content (e.g., images, css files, streaming of pre-recorded video) and dynamic content (e.g., html response, live video) can be cached at Amazon CloudFront, which is a content delivery network (CDN) consisting of multiple edge locations around the world. Edge caching allows content to be served by infrastructure that is closer to viewers, lowering latency and giving you the high, sustained data transfer rates needed to deliver large popular objects to end users at scale.
  • Security
    • Utilize AWS features for Defense in Depth
      • AWS provides a wealth of features that can help architects build defense in depth. Starting at the network level you can build a VPC topology that isolates parts of the infrastructure through the use of subnets, security groups, and routing controls
      • Services like AWS WAF, a web application firewall, can help protect your web applications from SQL injection and other vulnerabilities in your application code
    • Offload Security Responsibilities to AWS: For example, when you use services such as Amazon RDS, Amazon ElastiCache, Amazon CloudSearch, etc., security patches become the responsibility of AWS. This not only reduces operational overhead for your team, but it could also reduce your exposure to vulnerabilities.
    • Reduce Privileged Access
    • Security as Code: Traditional security frameworks, regulations, and organizational policies define security requirements related to things such as firewall rules, network access controls, internal/external subnets, and operating system hardening. You can implement these in an AWS environment as well, but you now have the opportunity to capture them all in a script that defines a “Golden Environment.” This means you can create an AWS CloudFormation script that captures your security policy and reliably deploys it. Security best practices can now be reused among multiple projects and become part of your continuous integration pipeline.
    • Real-time Auditing
  • AWS Lambda
    • AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume; there is no charge when your code is not running.
    • You cannot access the infrastructure that AWS Lambda runs on
    • Each AWS Lambda function runs in its own isolated environment, with its own resources and file system view.
    • AWS Lambda stores code in Amazon S3 and encrypts it at rest. AWS Lambda performs additional integrity checks while your code is in use.
    • The code you run on AWS Lambda is uploaded as a “Lambda function”.
    • Each function has associated configuration information, such as its name, description, entry point, and resource requirements.
    • The code must be written in a "stateless" style, i.e., it should assume there is no affinity to the underlying compute infrastructure (see the sketch at the end of this section).
    • Local file system access, child processes, and similar artifacts may not extend beyond the lifetime of the request, and any persistent state should be stored in Amazon S3, Amazon DynamoDB, or another Internet-available storage service.
    • Lambda functions can include libraries, even native ones.
    • To improve performance, AWS Lambda may choose to retain an instance of your function and reuse it to serve a subsequent request, rather than creating a new copy.
    • Each Lambda function receives 500MB of non-persistent disk space in its own /tmp directory.
    • AWS Lambda allows you to use normal language and operating system features, such as creating additional threads and processes. Resources allocated to the Lambda function, including memory, execution time, disk, and network use, must be shared among all the threads/processes it uses.
    • AWS Lambda support environment variables
    • Functions can access
      • AWS services or non-AWS services
      • AWS services running in VPCs (e.g. RedShift, Elasticache, RDS instances)
      • Non-AWS services running on EC2 instances in an AWS VPC
    • Lambda function code restrictions:
      • Inbound network connections are blocked by AWS Lambda
      • Outbound connections- only TCP/IP and UDP/IP sockets are supported,
      • ptrace (debugging) system calls are blocked.
      • TCP port 25 traffic is also blocked as an anti-spam measure.
    • Can easily list, delete, update, and monitor your Lambda functions using the dashboard in the AWS Lambda console
    • Can package any code (frameworks, SDKs, libraries, and more) as Lambda Layer and manage and share them easily across multiple functions.
    • AWS Lambda automatically monitors Lambda functions on your behalf, reporting real-time metrics through Amazon CloudWatch, including total requests, account-level and function-level concurrency usage, latency, error rates, and throttled requests. You can view statistics for each of your Lambda functions via the Amazon CloudWatch console or through the AWS Lambda console.
    • In the AWS Lambda resource model, you choose the amount of memory you want for your function and are allocated proportional CPU power and other resources. You can set the memory in 64 MB increments from 128 MB to 3 GB. Functions larger than 1,536 MB are allocated multiple CPU threads, and multi-threaded or multi-process code is needed to take advantage of them.
    • Each AWS Lambda function has a single, current version of the code. Clients of your Lambda function can call a specific version or get the latest implementation.
    • Lambda can pull records from an Amazon Kinesis stream or an Amazon SQS queue and execute a Lambda function for each fetched message.
    • You can invoke a Lambda function using a custom event through AWS Lambda’s invoke API. Only the function’s owner or another AWS account that the owner has granted permission can invoke the function.
    • On failure, Lambda functions being invoked synchronously will respond with an exception.
    • For Amazon S3 bucket notifications and custom events, AWS Lambda will attempt execution of your function 3 times in the event of an error condition in your code or if you exceed a service or resource limit.
    • For ordered event sources that AWS Lambda polls on your behalf, such as Amazon DynamoDB Streams and Amazon Kinesis streams, Lambda will continue attempting execution in the event of a developer code error until the data expires.
    • You can also set Amazon CloudWatch alarms based on error or execution throttling rates.
    • On exceeding the retry policy for asynchronous invocations, you can configure a “dead letter queue” (DLQ) into which the event will be placed
    • To enable your Lambda function to access resources inside your private VPC, you must provide additional VPC-specific configuration information that includes VPC subnet IDs and security group IDs
    • Lambda functions provide access only to a single VPC. If multiple subnets are specified, they must all be in the same VPC
    • Lambda functions are serverless and independent: 1 event = 1 function.
    • Functions can trigger other functions, so 1 event can trigger multiple functions.
    • For non stream-based event sources each published event is a unit of work, run in parallel up to your account limit (one Lambda function per event)
    • For stream-based event sources the number of shards indicates the unit of concurrency (one function per shard)
    • Each Lambda function has a unique Amazon Resource Name (ARN) which cannot be changed after publishing
    • serverless application : Lambda-based applications (also referred to as serverless applications) are composed of functions triggered by events. A typical serverless application consists of one or more functions triggered by events such as object uploads to Amazon S3, Amazon SNS notifications, or API actions. These functions can stand alone or leverage other resources such as DynamoDB tables or Amazon S3 buckets. The most basic serverless application is simply a function.
    • You can deploy and manage your serverless applications using the AWS Serverless Application Model (AWS SAM). AWS SAM is a specification that prescribes the rules for expressing serverless applications on AWS. This specification aligns with the syntax used by AWS CloudFormation today and is supported natively within AWS CloudFormation as a set of resource types (referred to as “serverless resources”)
    • AWS Serverless Application Repository has a collection of serverless applications published by developers, companies, and partners in the AWS community.
    • You can automate your serverless application’s release process using AWS CodePipeline and AWS CodeDeploy.
    • Use AWS Step Functions to coordinate a series of AWS Lambda functions in a specific order.
    • Price
      • Number of requests: the first 1 million per month are free, then $0.20 per 1 million.
      • Duration: calculated from the time your code begins execution until it returns or terminates; the price also depends on the amount of memory allocated to the function.
    • Lambda@Edge allows you to run code across AWS locations globally without provisioning or managing servers, responding to end users at the lowest network latency. You just upload your Node.js code to AWS Lambda and configure your function to be triggered in response to Amazon CloudFront requests
    • Use cases
      • run your code in response to events, such as changes to data in an Amazon S3 bucket or an Amazon DynamoDB table
      • run your code in response to HTTP requests using Amazon API Gateway
      • invoke your code using API calls made using AWS SDKs.
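The sketch referenced above: a minimal stateless handler plus a synchronous call through the Invoke API (the function name is hypothetical):

```python
import json
import boto3

# handler.py -- a stateless Lambda handler: nothing written locally
# survives between invocations, so persistent state belongs in S3,
# DynamoDB, or another external store.
def handler(event, context):
    return {"ok": True, "received": event}

# Caller side: invoke the function synchronously via the Invoke API.
lambda_client = boto3.client("lambda")
resp = lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="RequestResponse",  # synchronous invocation
    Payload=json.dumps({"user": "u-123"}).encode(),
)
print(json.loads(resp["Payload"].read()))
```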
  • AWS Web Application Firewall(WAF)
    • Web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
    • AWS WAF gives you control over which traffic to allow or block to your web applications by defining customizable web security rules. You can use AWS WAF to create custom rules that block common attack patterns, such as SQL injection or cross-site scripting, and rules that are designed for your specific application.
    • You can filter web requests based on IP addresses, HTTP headers, HTTP body, or URI strings, which allows you to block common attack patterns, such as SQL injection or cross-site scripting.
    • You can deploy AWS WAF on either Amazon CloudFront as part of your CDN solution or the Application Load Balancer (ALB) that fronts your web servers or origin servers running on EC2
  • Amazon Cognito
    • Amazon Cognito provides authentication, authorization, and user management for your web and mobile apps. Your users can sign in directly with a user name and password, or through a third party such as Facebook, Amazon, or Google.
    • The two main components of Amazon Cognito are User pools and Federated Identity.
    • A user pool is a user directory in Amazon Cognito. With a user pool, your users can sign in to your web or mobile app through Amazon Cognito. Your users can also sign in through social identity providers like Facebook or Amazon, and through SAML identity providers. Whether your users sign in directly or through a third party, all members of the user pool have a directory profile that you can access through an SDK
    • With Federated Identities, your users can obtain temporary AWS credentials to access AWS services, such as Amazon S3 and DynamoDB. Identity pools support anonymous guest users, as well as a range of identity providers that you can use to authenticate users for identity pools.
    • Difference between User Pool and Federated Identity?
      • User pools give access to an application; Federated Identities give access to AWS services.
  • AWS Direct Connect
    • AWS Direct Connect makes it easy to establish a dedicated network connection from your premises to AWS. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.
    • AWS Direct Connect bypasses the public Internet and establishes a secure, dedicated connection from your infrastructure into AWS.
    • Direct Connect can be partitioned into multiple Virtual Interfaces (VIFs).
    • Difference with VPN
      • VPN connectivity utilizes the public Internet, which can have unpredictable performance and despite being encrypted, can present security concerns
  • Amazon API Gateway
    • Is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. It handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management
    • Together with AWS Lambda, API Gateway forms the app-facing part of the AWS serverless infrastructure. For an app to call publicly available AWS services, you can use Lambda to interact with the required services and expose the Lambda functions through API methods in API Gateway. AWS Lambda runs the code on a highly available computing infrastructure.
  • Amazon Elastic Transcoder
    • Amazon Elastic Transcoder lets you convert media files that you have stored in Amazon Simple Storage Service (Amazon S3) into media files in the formats required by consumer playback devices. For example, you can convert large, high-quality digital media files into formats that users can play back on mobile devices, tablets, web browsers, and connected televisions
  • AWS Certificate Manager
    • AWS Certificate Manager is a service that lets you easily provision, manage, and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services and your internal connected resources. SSL/TLS certificates are used to secure network communications and establish the identity of websites over the Internet as well as resources on private networks.
    • AWS Certificate Manager removes the time-consuming manual process of purchasing, uploading, and renewing SSL/TLS certificates. With AWS Certificate Manager, you can quickly request a certificate, deploy it on ACM-integrated AWS resources, such as Elastic Load Balancers, Amazon CloudFront distributions, and APIs on API Gateway, and let AWS Certificate Manager handle certificate renewals.
    • Public and private certificates provisioned through AWS Certificate Manager for use with ACM-integrated services are free. You pay only for the AWS resources you create to run your application. For private certificates, you pay monthly for the operation of the private CA and for the private certificates you issue.
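A minimal boto3 sketch of requesting a DNS-validated public certificate through ACM (the domain names are hypothetical):

```python
import boto3

acm = boto3.client("acm")

# Request a public certificate with DNS validation; once validated and
# in use on an ACM-integrated service, ACM renews it automatically.
acm.request_certificate(
    DomainName="www.example.com",
    ValidationMethod="DNS",
    SubjectAlternativeNames=["example.com"],
)
```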
  • AWS Glue
    • AWS Glue is a fully managed, serverless ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores.
    • You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL. AWS Glue generates the code to execute your data transformations and data loading processes.
    • AWS Glue generates code that is customizable, reusable, and portable. Once your ETL job is ready, you can schedule it to run on AWS Glue's fully managed, scale-out Apache Spark environment. AWS Glue provides a flexible scheduler with dependency resolution, job monitoring, and alerting.
    • Difference with RedShift: Redshift is a datawarehouse service, while Glue is an ETL service
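A rough sketch of that flow with boto3 (the crawler and job names are hypothetical and assumed to be defined already in Glue):

```python
import boto3

glue = boto3.client("glue")

# Run a crawler to discover the data and populate the Glue Data Catalog.
glue.start_crawler(Name="sales-data-crawler")

# Later, kick off an ETL job that reads the cataloged tables.
run = glue.start_job_run(JobName="sales-etl-job")
print(run["JobRunId"])
```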
  • AWS Athena
    • Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
    • Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.
    • Athena is out-of-the-box integrated with AWS Glue Data Catalog
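Querying S3 data through Athena with boto3 could look roughly like this (the database, table, and results bucket are placeholders):

```python
import time
import boto3

athena = boto3.client("athena")

# Start the query; Athena writes the result files to the S3 output location.
query = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
execution_id = query["QueryExecutionId"]

# Poll until the query finishes, then fetch the result rows.
while True:
    state = athena.get_query_execution(QueryExecutionId=execution_id)
    status = state["QueryExecution"]["Status"]["State"]
    if status in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if status == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=execution_id)
    print(results["ResultSet"]["Rows"])
```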
  • AWS Polly: Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Amazon Polly is a Text-to-Speech service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.
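A minimal text-to-speech call with boto3 (the voice and file name are arbitrary choices):

```python
import boto3

polly = boto3.client("polly")

# Synthesize speech and save the returned MP3 audio stream to a local file.
response = polly.synthesize_speech(
    Text="Hello from Amazon Polly!",
    OutputFormat="mp3",
    VoiceId="Joanna",
)
with open("hello.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```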
  • AWS CodeDeploy:
    • AWS CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, or serverless Lambda functions.
    • You can deploy a nearly unlimited variety of application content, such as code, serverless AWS Lambda functions, web and configuration files, executables, packages, scripts, multimedia files, and so on.
    • AWS CodeDeploy can deploy application content that runs on a server and is stored in Amazon S3 buckets, GitHub repositories, or Bitbucket repositories.
  • AWS CodePipeline: AWS CodePipeline is a continuous delivery service you can use to model, visualize, and automate the steps required to release your software. You can quickly model and configure the different stages of a software release process. AWS CodePipeline automates the steps required to release your software changes continuously.
  • AWS CodeBuild: AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue.
  • Amazon CloudSearch
    • Amazon CloudSearch is a managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application.
    • Backed by Apache Solr
  • Amazon ElasticSearch
    • AWS-hosted Elasticsearch that takes care of set-up and management of the back-end servers and provides an endpoint that you can start developing against.
    • Elasticsearch is an open-source product built on Apache Lucene.
    • Use Amazon Elasticsearch Service instead of Amazon CloudSearch if you need to use the solution outside AWS
  • AWS VPN CloudHub
    • If you have multiple VPN connections, you can provide secure communication between sites using the AWS VPN CloudHub. This enables your remote sites to communicate with each other, and not just with the VPC. The VPN CloudHub operates on a simple hub-and-spoke model that you can use with or without a VPC. This design is suitable for customers with multiple branch offices and existing internet connections who'd like to implement a convenient, potentially low-cost hub-and-spoke model for primary or backup connectivity between these remote offices.
  • Placement Groups
    • Logical grouping of instances in a single availability zone (AZ)
    • Cannot span multiple AZs
    • Name has to be unique across AWS account.
    • Only supported on instances that support enhanced networking.
    • Existing instances cannot be moved into a placement group.
    • When a placement group is created, all instances have to be provisioned at the same time
    • Placement groups cannot be merged
  • Elastic File System (EFS)
    • Simple, petabyte-scale file storage for use with EC2 instances
    • Elastic: automatically grows and shrinks as you add and remove files.
    • Stored redundantly across multiple AZs.
    • 1-1000 EC2 instances from multiple AZs can access it concurrently
    • Use cases: big data and analytics, media processing workflows, content management, web serving
    • By default, you can create up to 10 file systems per account per region.
    • Can be accessed from on-premises through Direct Connect.
  • IP addresses in CIDR block
    • To calculate the total number of IP addresses in a given CIDR block, follow the two easy steps below. Say you have a CIDR block /27:
      • Subtract the mask number from 32: (32 - 27) = 5
      • Raise 2 to the power of the result from Step #1: 2^5 = (2 * 2 * 2 * 2 * 2) = 32
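The same arithmetic in Python (as a separate AWS detail, AWS reserves the first four and the last IP address in every subnet, so usable addresses are five fewer than the total):

```python
def cidr_address_count(prefix_length: int) -> int:
    """Total IPv4 addresses in a CIDR block: 2 ** (32 - prefix)."""
    return 2 ** (32 - prefix_length)

print(cidr_address_count(27))      # 32 total addresses in a /27
print(cidr_address_count(27) - 5)  # 27 usable in an AWS subnet
```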
  • A pre-built AMI can be used only in the region in which it was built. The AMI is not accessible from another region, so to establish a disaster recovery instance in, say, us-west-2 you must first copy it to that region. You can copy an Amazon Machine Image (AMI) within or across AWS regions using the AWS Management Console, the AWS command line tools or SDKs, or the Amazon EC2 API, all of which support the CopyImage action. You can copy both Amazon EBS-backed AMIs and instance store-backed AMIs, including encrypted AMIs and AMIs with encrypted snapshots.
  • Which storage services encrypts data at rest by default?
    • Storage Gateway
    • Glacier
  • An AWS account has an ID of 0499802888. Which URL would you provide to an IAM user to access the AWS Console?
    • https://0499802888.signin.aws.amazon.com/console (IAM sign-in URLs have the form https://<account-ID-or-alias>.signin.aws.amazon.com/console)
  • The company you are working for has instructed you to create a cost-effective cloud solution for their online movie ticketing service. Your team has designed a solution using a fleet of Spot EC2 instances to host the new ticketing web application. You requested a Spot Instance at a maximum price of $0.06/hr, which was fulfilled immediately. After 45 minutes, the Spot price increased to $0.08/hr and your instance was terminated by AWS. What was the total EC2 compute cost of running your Spot Instances?
    • $0.00
    • If your Spot instance is terminated or stopped by Amazon EC2 in the first instance hour, you will not be charged for that usage. However, if you terminate the instance yourself, you will be charged to the nearest second.
    • If the Spot instance is terminated or stopped by Amazon EC2 in any subsequent hour, you will be charged for your usage to the nearest second. If you are running on Windows and you terminate the instance yourself, you will be charged for an entire hour.
  • Difference between KMS and CloudHSM
    • KMS uses only symmetric keys; CloudHSM allows both symmetric and asymmetric keys
    • KMS is multi-tenant, while CloudHSM is a dedicated, single-tenant service
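To make the KMS symmetric-key workflow concrete, an encrypt/decrypt round trip with boto3 looks roughly like this (the key alias is a placeholder for a key you have already created):

```python
import boto3

kms = boto3.client("kms")

# Encrypt a small payload. Direct KMS encryption is capped at 4 KB;
# larger data is normally encrypted client-side with a generated data key.
ciphertext = kms.encrypt(
    KeyId="alias/my-app-key",
    Plaintext=b"database password",
)["CiphertextBlob"]

# Decrypt: for symmetric keys, KMS infers the key from the ciphertext.
plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
print(plaintext)
```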
  • Egress-only Internet Gateway
    • An egress-only Internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows outbound communication over IPv6 from instances in your VPC to the Internet, and prevents the Internet from initiating an IPv6 connection with your instances.
    • Take note that an egress-only Internet gateway is for use with IPv6 traffic only. To enable outbound-only Internet communication over IPv4, use a NAT gateway instead.
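Creating one is a single call (the VPC id below is a placeholder); a ::/0 route pointing at the gateway then has to be added to the subnet's route table:

```python
import boto3

ec2 = boto3.client("ec2")

# Create an egress-only internet gateway attached to the given VPC;
# it allows only outbound IPv6 traffic (plus the return traffic).
response = ec2.create_egress_only_internet_gateway(VpcId="vpc-0abc123def4567890")
print(response["EgressOnlyInternetGateway"]["EgressOnlyInternetGatewayId"])
```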
  • Disaster Recovery(DR) scenarios
    • Backup & Restore
    • Pilot Light: Minimal version of the application is always running on the cloud. With AWS you can maintain a pilot light by configuring and running the most critical core elements of your system in AWS. When the time comes for recovery, you can rapidly provision a full-scale production environment around the critical core.
    • Warm Standby: The term warm standby is used to describe a DR scenario in which a scaled-down version of a fully functional environment is always running in the cloud. A warm standby solution extends the pilot light elements and preparation. It further decreases the recovery time because some services are always running. By identifying your business-critical systems, you can fully duplicate these systems on AWS and have them always on.
    • Multi-site: A multi-site solution runs in AWS as well as on your existing on-site infrastructure, in an active-active configuration. You can use a DNS service that supports weighted routing, such as Amazon Route 53, to route production traffic to different sites that deliver the same application or service. A proportion of traffic will go to your infrastructure in AWS, and the remainder will go to your on-site infrastructure.
  • S3 One Zone-IA
    • Key points: 1) for infrequently accessed data 2) stores object data in only one Availability Zone at a lower price than Standard-IA 3) minimum 30-day retention period 4) minimum 128 KB object size
  • DynamoDB Accelerator (DAX)
    • Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache that can reduce Amazon DynamoDB response times from milliseconds to microseconds, even at millions of requests per second.
    • App > DAX > DynamoDB
  • Snowball Edge
    • a data migration and edge computing device that comes in the following options
      • Snowball Edge Storage Optimized: provides 100 TB of capacity and 24 vCPUs and is well suited for local storage and large scale data transfer
      • Snowball Edge Compute Optimized: provides 52 vCPUs and an optional GPU for use cases such as advanced machine learning and full motion video analysis in disconnected environments. Comes with 42 TB of additional storage space
      • Snowball Edge Compute Optimized with GPU: identical to the compute optimized option, save for an installed GPU
  • Step Functions
    • AWS Step Functions is a fully managed service that makes it easy to coordinate the components of distributed applications and microservices using visual workflows.
    • Step Functions automatically triggers and tracks each step, and retries when there are errors, so your application executes in order and as expected. Step Functions logs the state of each step, so when things do go wrong, you can diagnose and debug problems quickly.
    • You can change and add steps without even writing code, so you can easily evolve your application and innovate faster
    • How do Step Functions work?
      • you define state machines that describe your workflow as a series of steps, their relationships, and their inputs and outputs. State machines contain a number of states, each of which represents an individual step in a workflow diagram.
      • States can perform work, make choices, pass parameters, initiate parallel execution, manage timeouts, or terminate your workflow with a success or failure.
      • The visual console automatically graphs each state in the order of execution, making it easy to design multi-step applications.
    • Tasks
      • Activity tasks let you assign a specific step in your workflow to code running somewhere else (known as an activity worker). An activity worker can be any application that can make an HTTP connection, hosted anywhere. For example, activity workers can run on an Amazon EC2 instance, on a mobile device, or on an on-premises server.
      • Service tasks let you connect a step in your workflow to a supported AWS service. Step Functions pushes requests to other services so they can perform actions for your workflow, waits for the service task to complete, and then continues to the next step.
    • AWS Step Functions state machines are defined in JSON using the declarative Amazon States Language.
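A minimal sketch of such a definition, registered with boto3 (all ARNs are placeholders):

```python
import json
import boto3

# A two-state Amazon States Language definition: call a Lambda, then succeed.
definition = {
    "Comment": "Minimal example state machine",
    "StartAt": "SayHello",
    "States": {
        "SayHello": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:hello",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

sfn = boto3.client("stepfunctions")
response = sfn.create_state_machine(
    name="hello-workflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
)
print(response["stateMachineArn"])
```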
    • Comparison with SWF
      • You should consider using AWS Step Functions for all your new applications, since it provides a more productive and agile approach to coordinating application components using visual workflows. If you require external signals to intervene in your processes, or you would like to launch child processes that return a result to a parent, then you should consider Amazon Simple Workflow Service (Amazon SWF).
    • Step Functions counts a state transition each time a step of your workflow is executed. You are charged for the total number of state transitions across all your state machines, including retries.
  • API Gateway
    • Stage: a named reference to a deployment of an API (e.g. dev, test, prod); settings such as throttling and caching are configured per stage
    • All calls are done via HTTPS, HTTP access is not allowed
    • Default APIs use AWS SSL certificate
    • Custom domain APIs can use your SSL certificate
    • All API Gateway APIs are publicly accessible
    • Cannot communicate with services inside a VPC unless a public endpoint is exposed. So an API cannot communicate directly with an EC2 instance in a VPC; for that we need to attach an EIP to the instance and connect to that.
    • Throttling limits
      • Burst limit: max number of concurrent requests API Gateway will serve at any given point in time
      • Rate limit: max number of requests per second API Gateway will serve
    • If a throttling limit is crossed, requests get a "429 Too Many Requests" response
    • How do the Rate and Burst Throttle work together?
      • The Burst setting and Rate setting work together to control how many requests can be processed by your API.
      • Let's assume you set the throttle to Rate = 100 (requests per second) and the Burst = 50 (requests). With those settings if 100 concurrent requests are sent at the exact same millisecond only 50 would be processed due to the burst setting, the remaining 50 requests would get a 429 Too Many Requests response. Assuming the first 50 requests completed in 100ms each, your client could then retry the remaining 50 requests.
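One way to picture this is a token bucket whose capacity is the burst and whose refill speed is the rate. The toy model below is not AWS code, just an illustration of the semantics using the numbers from the example above:

```python
import time

class TokenBucket:
    """Burst = bucket capacity, rate = tokens refilled per second."""

    def __init__(self, rate: float, burst: int):
        self.rate, self.capacity = rate, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request is served
        return False      # request would get a 429 Too Many Requests

bucket = TokenBucket(rate=100, burst=50)
results = [bucket.allow() for _ in range(100)]  # 100 near-simultaneous requests
print(results.count(True))                      # ~50 pass; the rest are throttled
```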
    • What does this mean for API Gateways that invoke Lambda?
      • AWS Lambda Functions have a default maximum concurrency level of 1000 (you can request to have this increased if you need to), but the default burst levels on AWS API Gateway is way higher than this, so if you are using API Gateway with Lambda you will want to make sure that you have set a value for the Burst throttle setting that makes sense for your Lambda Concurrency level.
    • Throttling limits are set at the stage level, but they can be overridden at the individual method level
    • Caching API requests
      • Configured per stage
      • From 0.5 GB to 237 GB storage
      • Charged separately for storage
    • Billing charges
      • No of API requests
      • Data transferred out
      • Data cached
    • API Gateway access policies
      • By default access is denied, unless a policy exists allowing access
      • Policies can allow different access to different APIs, e.g. allow access to the DEV stage while restricting access to PROD
      • Policies consist of resource statements and action statements
    • While managing access to API calls, only AWS Signature Version 4 is supported.
    • Backend support
      • Amazon API Gateway can execute AWS Lambda functions in your account, start AWS Step Functions state machines, or call HTTP endpoints hosted on AWS Elastic Beanstalk, Amazon EC2, and also non-AWS hosted HTTP based operations that are accessible via the public Internet
    • Usage plan
      • Usage plans help you declare plans for third-party developers that restrict access only to certain APIs, define throttling and request quota limits, and associate them with API keys. You can also extract utilization data on a per-API-key basis to analyze API usage and generate billing documents. For example, you can create basic, professional, and enterprise plans – you can configure the basic usage plan to only allow 1,000 requests per day and a maximum of 5 requests per second (RPS)
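The basic plan from that example, created with boto3 (the API id, stage name, and burst value are placeholders):

```python
import boto3

apigw = boto3.client("apigateway")

# A "basic" plan: 1,000 requests per day, throttled to 5 requests per second.
plan = apigw.create_usage_plan(
    name="basic",
    throttle={"rateLimit": 5.0, "burstLimit": 5},
    quota={"limit": 1000, "period": "DAY"},
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],
)
print(plan["id"])
```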
  • Elastic Container Service (ECS)
    • A highly scalable, high performance container orchestration service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.
    • no additional charge for Amazon ECS
    • Docker is the only container platform supported by Amazon ECS at this time
    • ECS supports management of Windows containers.
    • With AWS Fargate, you no longer have to select Amazon EC2 instance types, provision and scale clusters, or patch and update each server. You do not have to worry about task placement strategies, such as binpacking or host spread and tasks are automatically balanced across availability zones. Fargate manages the availability of containers for you. You just define your application’s requirements, select Fargate as your launch type in the console or CLI, and Fargate takes care of all the scaling and infrastructure management required to run your containers.
    • Amazon Elastic Container Registry(ECR) is integrated with Amazon ECS allowing you to easily store, run, and manage container images for applications running on Amazon ECS.
    • The Amazon ECS CLI supports Docker Compose, an open-source tool for defining and running multi-container applications.
    • ECS CLI is open-source.
    • Amazon ECS allows you to define tasks through a declarative JSON template called a Task Definition. Within a Task Definition you can specify one or more containers that are required for your task, including the Docker repository and image, memory and CPU requirements, shared data volumes, and how the containers are linked to each other. You can launch as many tasks as you want from a single Task Definition file that you can register with the service. Task Definition files also allow you to have version control over your application specification.
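A bare-bones Task Definition registered via boto3 (the image and CPU/memory sizing are arbitrary illustrations):

```python
import boto3

ecs = boto3.client("ecs")

# Register a task definition with a single nginx container.
response = ecs.register_task_definition(
    family="web",
    containerDefinitions=[
        {
            "name": "web",
            "image": "nginx:latest",
            "cpu": 128,
            "memory": 256,
            "essential": True,
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
        }
    ],
)
print(response["taskDefinition"]["taskDefinitionArn"])
```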
    • Scheduling: includes multiple scheduling strategies that place containers across your clusters based on your resource needs (for example, CPU or RAM) and availability requirements.
      • Task scheduling: to run processes that perform work and then stop, such as batch processing jobs
      • Service scheduling: allows you to run stateless services and applications
      • Daemon scheduling: automatically runs the same task on each selected instance in your ECS cluster.
  • Linux AMI Virtualization Types
    • Linux Amazon Machine Images use one of two types of virtualization: paravirtual (PV) or hardware virtual machine (HVM). The main differences between PV and HVM AMIs are the way in which they boot and whether they can take advantage of special hardware extensions (CPU, network, and storage) for better performance.
    • For the best performance, we recommend that you use current generation instance types and HVM AMIs when you launch your instances.
  • Bastion Host
    • A bastion host is a server whose purpose is to provide access to a private network from an external network, such as the Internet. Because of its exposure to potential attack, a bastion host must minimize the chances of penetration.
    • The bastion host runs on an Amazon EC2 instance that is typically in a public subnet of your Amazon VPC.
    • Bastion hosts are instances that sit within your public subnet and are typically accessed using SSH or RDP.
    • Once remote connectivity has been established with the bastion host, it then acts as a ‘jump’ server, allowing you to use SSH or RDP to login to other instances (within private subnets) deeper within your network.
  • Security Token Service(STS)
    • The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users).
    • work almost identically to the long-term access key credentials that your IAM users can use, with the following differences:
      • Temporary security credentials are short-term, as the name implies. They can be configured to last for anywhere from a few minutes to several hours. After the credentials expire, AWS no longer recognizes them or allows any kind of access from API requests made with them.
      • Temporary security credentials are not stored with the user but are generated dynamically and provided to the user when requested. When (or even before) the temporary security credentials expire, the user can request new credentials, as long as the user requesting them still has permissions to do so.
    • These differences lead to the following advantages for using temporary credentials:
      • You do not have to distribute or embed long-term AWS security credentials with an application.
      • You can provide access to your AWS resources to users without having to define an AWS identity for them. Temporary credentials are the basis for roles and identity federation.
      • The temporary security credentials have a limited lifetime, so you do not have to rotate them or explicitly revoke them when they're no longer needed. After temporary security credentials expire, they cannot be reused. You can specify how long the credentials are valid, up to a maximum limit.
    • By default, AWS STS is a global service with a single endpoint at https://sts.amazonaws.com.
    • An IAM user can request these temporary security credentials for their own use or hand them out to federated users or applications. When requesting temporary security credentials for federated users, you must provide a user name and an IAM policy defining the permissions you want to associate with these temporary security credentials. The federated user cannot get more permissions than the parent IAM user who requested the temporary credentials.
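A typical temporary-credential flow with boto3, assuming a role ARN the caller is allowed to assume (all identifiers below are placeholders):

```python
import boto3

sts = boto3.client("sts")

# Request temporary credentials by assuming a role (valid for one hour here).
assumed = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ReadOnlyAnalyst",
    RoleSessionName="analyst-session",
    DurationSeconds=3600,
)
creds = assumed["Credentials"]

# The temporary credentials are used exactly like long-term access keys.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```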