@pierrefevrier
Last active December 30, 2019 14:26
Notes Architecting on AWS

Module 1: Core AWS Knowledge (p40)

AWS Global Infrastructure

  • AZ (Availability Zone)
    • 1..n data centers
    • Each data center belongs to exactly one AZ
    • 54 AZs worldwide
    • GDPR = 3 AZs minimum
  • 1 region = 2..n AZs
    • 18 regions worldwide
    • AWS never moves your data out of the region you put it in.
  • Edge locations
    • support AWS services like Amazon Route 53 and Amazon CloudFront.
    • 96 worldwide
  • Regional Edge cache
    • Used by default with Amazon CloudFront; they serve content that is not accessed frequently enough to remain in an edge location.

Unmanaged vs Managed Services

  • Scaling, fault tolerance and availability are built in to the service

AWS Shared Responsibility Model

  • The customer is responsible for the data

Module 2: AWS Core Services (p57)

Amazon VPC (Virtual Private Cloud)

  • Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you have defined. This virtual network closely resembles a traditional network that you would operate in your own data center, with the benefits of using the scalable infrastructure of AWS. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways.
  • Amazon VPC provides two features that you can use to increase security for your VPC:
    • Security Groups: Act as a firewall for associated Amazon EC2 instances, controlling both inbound and outbound traffic at the instance level.
    • Network Access Control Lists (ACLs): Act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level.
  • IP ranges
  • Routing
  • Network gateway
  • Security settings
  • 1 VPC = 1 region
  • 1 VPC = 1..n AZs within the same region

EC2 (Elastic Compute Cloud)

  • T2: Burstable performance
  • M3, M4, M5: General purpose
  • C3, C4, C5: Compute optimized
  • H1, I3, D2: Storage and I/O optimized
  • X1, R3, R4: Memory optimized
  • P2, P3, G3, F1: GPU or FPGA enabled
  • 4 ways to pay:
    • On-Demand
      • Short-term, spiky, or unpredictable workloads
      • Application development or testing
    • Spot Instances
      • The price varies over time
      • Applications with flexible start and end times
      • Hibernates EBS (Hibernate is just like closing and opening your laptop lid, with your application starting up right where it left off)
      • After a Spot Instance is hibernated by the Spot service, it can only be resumed by the Spot service.
      • The Spot service resumes the instance when capacity becomes available with a Spot price that is less than your specified maximum price.
    • Reserved Instances
      • 1- to 3-year reservation (up to 75% discount)
      • Predictable usage workload
      • The three modes above (On-Demand, Spot, Reserved) are billed per second, with a 1-minute minimum, for Amazon Linux and Ubuntu, and per hour for other operating systems
    • Dedicated Hosts
      • Dedicated physical server
      • Billed per hour of use
      • BYOL (Bring Your Own License)
      • Compliance and regulatory restrictions
  • Maximum storage of an instance store-backed root device: 10 GB, deleted when the instance terminates

Amazon Storage

S3 (Simple Storage Service)

  • 99.99% availability
  • 99.999999999% durability (11 nines)
  • Object-level storage (= if you want to change part of a file, you have to make the change and then re-upload the entire modified file.)
  • 5 TB max per object
  • Access via API, SDK, or the AWS console
  • Each bucket name is unique across all of S3
  • You can choose the region where AWS stores the bucket
  • https://s3-ap-northeast-1.amazonaws.com/[bucket name]/[file name]
  • Billed for: GBs stored per month, transfer OUT of the region
  • Not billed for: transfer IN to S3, transfer OUT from S3 to CloudFront or within the same region.
  • Options
    • General purpose: S3 standard
      • Higher availability requirements: use cross-region replication
    • Infrequently accessed data (30-day storage minimum): S3 standard-IA, S3 One Zone-IA
  • Data consistency model
    • Read after Write consistency
    • Read after Update not consistent
    • Read after Delete not consistent
    • List after Delete not consistent
    • In short: read-after-write consistency for PUTs of new objects; eventual consistency for overwrite PUTs and DELETEs
  • Objects can be versioned
  • S3 lifecycle policies (delete or move objects based on age)

EBS (Elastic Block Store)

  • = SAN
  • Storage system for EC2 instances = a network-attached hard drive
  • Billed per second, with a 1-minute minimum
  • Block-level storage (see the following diagram to understand the concept: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/92!/4/4@0.00:14.4)
  • Because they are directly attached to the instances, they can provide extremely low latency between where the data is stored and where it might be used on the instance. For this reason, they can be used to run a database with an Amazon EC2 instance.
  • Each volume is automatically replicated within its AZ.
  • Can be backed up automatically to S3
  • SSD
    • General Purpose
      • System boot volumes
    • Provisioned IOPS
      • Relational DBs
      • NoSQL DBs
  • HDD
    • Throughput-optimized
      • consistent, fast throughput at a low price
      • Cannot be a boot volume
    • Cold
      • Scenarios where the lowest storage cost is important
      • Cannot be a boot volume
  • You can mount multiple volumes on the same instance, but each volume can be attached to only one instance at a time.

EFS (Elastic File System)

  • = Replaces a NAS
  • Data can be encrypted
  • File storage in the AWS cloud
  • Shared storage
  • NFS 4.0 and 4.1 (Network File System)
  • We recommend that you access the file system from a mount target within the same AZ: illustration
  • Mount Target
    • Subnet ID
    • Security groups
    • One or more per file system
    • Create in a VPC subnet
    • 1 per AZ
    • Must be in the same VPC

Glacier

  • Data archiving service designed for security, durability and extremely low cost
  • 3 options for retrieving data
    • Expedited retrievals are typically made available within 1 – 5 minutes.
    • Standard retrievals typically complete within 3 – 5 hours.
    • Bulk retrievals typically complete within 5 – 12 hours.

Amazon Database Review

RDS (Relational Database Service)

  • Managed service
  • Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, SQL Server
  • Master/ read replica / standby instances
  • Can be backed up to S3

DynamoDB

  • Fully managed NoSQL database service
  • Single-digit millisecond latency at any scale
  • If you're running simple GET/PUT requests on your data, consider using DynamoDB instead of a relational database.
  • Document and key-value store models supported
  • You specify your throughput capacity requirements (read/write) and DynamoDB allocates the resources you need.

IAM

  • Manage access and authentication of your users to your AWS resources.
  • Free
  • Create users, groups and roles, and apply policies to them to control their access to AWS resources.
  • Manage what resources can be accessed and how they can be accessed (e.g., terminating EC2 instances).
  • Types of Security Credentials:
    • Email address and password: associated with your AWS account (root)
    • IAM user name and password: used for accessing the AWS Management Console
    • Access keys: typically used with CLI and programmatic requests like API and SDKs (= an access key ID and a secret access key, comparable to a public/private key pair)
    • MFA (Multi-Factor Authentication): extra layer of security
  • IAM Permissions
    • Permissions determine which resource and which operations are allowed to be used
    • There are no default permissions
    • All permissions are implicitly denied by default
    • Any explicit deny takes precedence over an allow
    • Stored as JSON: keeping them under version control is a good idea
  • IAM Policies
    • An IAM policy is a formal statement of one or more permissions
    • You attach a policy to any IAM entity: user, group, or role
    • Policies authorize the actions that may, or may not, be performed by the entity
    • Policies specify what actions are allowed, which resources to allow the actions on, and what the effect will be when the user requests access to the resources.
    • A single policy can be attached to multiple entities
    • A single entity can have multiple policies attached to it
    • Best practice: when attaching the same policy to multiple IAM users, put the users in a group and attach the policy to the group instead
    • 1 policy=
  • IAM Users
    • Best practice: create a separate IAM user account with administrative privileges instead of using the root account user
    • IAM users are not necessarily people (a user can be an application, for example)
  • IAM Groups
    • No default groups
    • Groups cannot be nested
    • 1 user = 0..n groups
    • Permissions are defined using IAM policies
  • IAM Roles
    • Used to delegate access to AWS
    • Provides temporary access
    • Eliminates the need for static AWS credentials
    • Permissions are:
      • Defined using IAM policies
      • Attached to the role, not to an IAM user or group
    • You create a role in the AWS account that contains the resources that you want to allow access to.
    • When you create the role, you specify two policies.
      • The trust policy specifies who is allowed to assume the role (the trusted entity, or principal).
      • The access (or permissions) policy defines what actions and resources the principal is allowed access to.
    • The simplest way to use roles is to grant your IAM users permissions to switch to roles that you create within your own or another AWS account.
    • They can switch roles easily using the IAM console to use permissions that you don't ordinarily want them to have, and then exit the role to surrender those permissions. This can help prevent accidental access to or modification of sensitive resources.
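
The permission rules above (implicit deny by default, explicit deny beats allow) can be illustrated with a small sketch. This is a deliberately simplified model and not the real IAM evaluation engine: the `Statement` class, the `is_allowed` function, and the wildcard matching via `fnmatchcase` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Statement:
    """One permission statement (hypothetical, simplified shape)."""
    effect: str    # "Allow" or "Deny"
    action: str    # e.g. "ec2:TerminateInstances" or "ec2:*"
    resource: str  # e.g. "*" or a resource name pattern


def is_allowed(statements, action, resource):
    """Explicit deny wins, then explicit allow, otherwise implicit deny."""
    matching = [
        s for s in statements
        if fnmatchcase(action, s.action) and fnmatchcase(resource, s.resource)
    ]
    if any(s.effect == "Deny" for s in matching):
        return False  # any explicit deny takes precedence over an allow
    # With no matching allow, the request falls back to the implicit deny.
    return any(s.effect == "Allow" for s in matching)


policies = [
    Statement("Allow", "ec2:*", "*"),
    Statement("Deny", "ec2:TerminateInstances", "*"),
]

print(is_allowed(policies, "ec2:DescribeInstances", "*"))   # allowed by ec2:*
print(is_allowed(policies, "ec2:TerminateInstances", "*"))  # explicit deny
print(is_allowed(policies, "s3:GetObject", "*"))            # implicit deny
```

Real policies carry more fields (principals, conditions, lists of actions), so this only captures the precedence order described in the notes.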

Module 3: Designing Your Environment (p154)

How do you choose a region?

  1. Data sovereignty and compliance
  2. Proximity of users to data
  3. Service and feature availability
  4. Cost-effectiveness

How many AZs should you use?

  • Recommendation: start with 2 AZs

Should you just fit everything into one VPC?

  • For most use cases, there are 2 primary patterns for organizing your infrastructure:
    • Multi-VPC
      • Best suited for:
        • Single team or single organizations
        • Limited teams make maintaining standards and managing access far easier
    • Multi-Account
      • Best suited for:
        • Large organizations and organizations with multiple IT teams
        • Because managing access and standards can be more challenging in complex organizations

AWS Organizations

  • Hierarchical grouping of your accounts
  • Organization permissions overrule account permissions
  • Organization is an entity that you create to consolidate your AWS accounts.
  • Organizational unit (OU) is a container for accounts within a root.
  • Account is a standard AWS account that contains your AWS resources.
  • More information on the other AWS Organizations concepts: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/172!/4/4@0.00:0.00

Other Important Considerations

  • The majority of AWS services do not actually sit within a VPC.
  • Network traffic between AWS Regions traverses the AWS global network backbone by default.
  • Sometimes traffic between regions uses the public internet.
  • S3 and DynamoDB offer VPC endpoints (powered by PrivateLink) to connect without traversing the public internet.

How should you divide your VPCs into subnets?

  • CIDR (Classless Inter-Domain Routing) notation
  • VPCs can use CIDR ranges between /16 and /28
  • For every one step a CIDR range increases, the total number of IPs is cut in half
    • /16 65K IPs
    • /17 32K IPs
    • ...
    • /28 16 IPs
  • In every subnet, the first four and last one IP addresses are reserved for AWS use.
  • Public subnets
    • Include a routing table entry to an Internet gateway to support inbound/outbound access to the public internet.
  • Private subnets
    • Do not have a routing table entry to an Internet gateway and are not directly accessible from the public internet.
    • Typically use a "jump box" (NAT/proxy/bastion host) to support restricted, outbound-only public internet access.
      • Bastion: A bastion host is a special-purpose computer on a network specifically designed and configured to withstand attacks. The computer generally hosts a single application (such as a proxy server) and all other services are removed or limited to reduce the threat to the computer. It is hardened in this manner primarily due to its location and purpose, which is either on the outside of a firewall or in a DMZ, and usually involves access from untrusted networks or computers.
  • Rather than define your subnets based on applications or functional tier (web/app/data/etc.), you should organize your subnets based on internet accessibility.
  • Recommendation: start with one public and one private subnet per AZ.
  • Recommendation: allocate substantially more IPs for private subnets than for public subnets.
  • Example:
    • VPC: /21
    • Public subnet of each AZ: /24
    • Private subnet of each AZ: /23
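
The CIDR arithmetic and the /21 example above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the `10.0.0.0/21` range and the subnet names are hypothetical, and the count of 5 reserved addresses per subnet comes from the note above (first four plus last one).

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four + last address of every subnet


def usable_ips(network):
    """Addresses left for your resources once the reserved ones are removed."""
    return network.num_addresses - AWS_RESERVED_PER_SUBNET


# Each extra bit of prefix length halves the address count.
sizes = {prefix: 2 ** (32 - prefix) for prefix in (16, 17, 28)}
print(sizes)

# Hypothetical VPC range; any /21 behaves the same way.
vpc = ipaddress.ip_network("10.0.0.0/21")

# A /21 holds four /23 blocks: two become private subnets (one per AZ),
# a third is split into two /24 public subnets, and the fourth stays spare.
blocks = list(vpc.subnets(new_prefix=23))
private_a, private_b = blocks[0], blocks[1]
public_a, public_b = blocks[2].subnets(new_prefix=24)

for name, net in [("private-a", private_a), ("private-b", private_b),
                  ("public-a", public_a), ("public-b", public_b)]:
    print(f"{name}: {net} ({usable_ips(net)} usable IPs)")
```

Keeping one /23 unallocated is a design choice in this sketch: it leaves room for a third AZ or an extra subnet later without re-addressing the VPC.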

Route tables: directing traffic between VPC resources

  • Determine where network traffic is routed
  • All route tables include a local route entry
    • Local route covers the entire VPC
    • The local route entry cannot be deleted
  • Only one route table per subnet
  • Multiple subnets can be associated with the same route table
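
To make the routing behaviour concrete, here is a rough sketch of how a route table resolves a destination: the most specific matching route (longest prefix) wins, and the local route covers the whole VPC. The table contents, the `igw-example` target, and the `pick_route` helper are hypothetical names used only for illustration.

```python
import ipaddress


def pick_route(route_table, destination_ip):
    """Return the target of the most specific (longest-prefix) matching route."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the traffic cannot leave the subnet this way
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Public subnet: local route plus a default route to an internet gateway.
public_route_table = {
    "10.0.0.0/21": "local",
    "0.0.0.0/0": "igw-example",  # hypothetical gateway identifier
}
# Private subnet: only the local route, so no direct internet path.
private_route_table = {"10.0.0.0/21": "local"}

print(pick_route(public_route_table, "10.0.1.25"))       # stays inside the VPC
print(pick_route(public_route_table, "93.184.216.34"))   # goes to the gateway
print(pick_route(private_route_table, "93.184.216.34"))  # no matching route
```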

Securing VPC Traffic with Security Groups

  • Virtual firewalls that control inbound and outbound traffic for one or more instances.
  • Deny all incoming traffic by default and use allow rules that can filter based on TCP, UDP and ICMP protocols (+ports).
    • You can specify allow rules, but not deny rules.
    • By default, no inbound traffic is allowed until you add inbound rules to the security group.
    • By default, all outbound traffic is allowed until you add outbound rules to the group. Then, you specify the outbound traffic that is allowed.
  • Are stateful, which means that if your inbound request is allowed, the outbound response is allowed automatically.
  • Can define a source/target as either a CIDR block or another security group to create layers of security.
  • Use security groups to control traffic into, out of, and between resources (example: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/191!/4/4@0.00:0.00).
  • Security group chaining diagram example: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/193!/4/2@100:0.00
  • Instances associated with a security group can't talk to each other unless you add rules allowing it.
  • After you launch an instance, you can change which security groups the instance is associated with.

Internet Gateway

  • Allow communication between instances in your VPC and the internet.
  • An internet gateway serves two purposes:
    • to provide a target in your subnet route tables for internet-routable traffic
    • to perform network address translation (NAT) for instances that have been assigned public IPv4 addresses.

What about outbound traffic from private instances?

  • NAT (Network Address Translation) services:
    • Enable instances in the private subnet to initiate outbound traffic to the internet or other AWS services.
    • Prevent private instances from receiving inbound traffic from the internet.
    • 2 primary options:
      • EC2 instance set up as a NAT in a public subnet
      • NAT Gateway (1 in each AZ)

How should you log your VPC traffic?

  • Amazon VPC flow logs
    • Captures traffic flow details in your VPC
    • Can be enabled for VPCs, subnets, and ENIs
    • Logs published to CloudWatch logs

Can you connect multiple VPCs to each other?

  • VPC peering connection: a one-to-one relationship between 2 VPCs.

How do you integrate on-premises components into your environment?

  • VPN connections: AWS hardware VPN, you are provided with 2 VPN endpoints to provide basic, automatic failover.
  • AWS Direct connect: provides you with a private network connection between AWS and your data center.

What is a default VPC?

  • Each region in your account has a default VPC.
  • Default CIDR is 172.31.0.0/16
  • Includes a default subnet, IGW, main route table connecting the default subnet to the IGW, default security group, and default NACL

What is a default subnet?

  • Created within each AZ for each default VPC.
  • Public subnet with a CIDR block of /20 (4,096 IPs).
  • Remove IGW to convert to private subnet.

VPC considerations and best practices

  • Choose CIDR blocks wisely. Plan ahead
  • Use large subnets instead of a higher number of small subnets
  • Keep subnets simple and divide by Internet accessibility (public/private)
  • Use multi-AZ deployments in VPC for high availability
  • Use security groups to control traffic between resources

Module 4: Making Your Environment Highly Available (p228)

What is High Availability?

  • 90% = 36.5 days downtime per year (2.4 h per day)
  • 99% = 3.65 days downtime per year (14 min per day)
  • 99.9% = 8.76 h downtime per year (86 s per day)
  • 99.99% = 52.6 min downtime per year (8.6 s per day)
  • 99.999% = 5.26 min downtime per year (0.86 s per day)
  • High Availability Factors:
    • Fault tolerance: the built-in redundancy of an application's components.
    • Scalability: The ability of an application to accommodate growth without changing design.
    • Recoverability: The process, policies, and procedures related to restoring service after a catastrophic event.
  • Elastic IP Addresses Provide Greater Fault Tolerance
    • Static IP addresses designed for dynamic cloud computing.
    • Can be attached to EC2 instances
    • Enable you to mask the failure of an instance of software by allowing your users and clients to use the same IP address with replacement resources.
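
The downtime figures listed for each availability level follow directly from the percentage. A short sketch of the arithmetic, assuming a 365-day year (the function and constant names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap years ignored


def downtime_seconds(availability_percent, period_days):
    """Allowed downtime, in seconds, over a period of `period_days` days."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


for level in (90, 99, 99.9, 99.99, 99.999):
    per_year_hours = downtime_seconds(level, DAYS_PER_YEAR) / 3600
    per_day_seconds = downtime_seconds(level, 1)
    print(f"{level}%: {per_year_hours:.2f} h per year, "
          f"{per_day_seconds:.2f} s per day")
```

Each additional "nine" divides the allowed downtime by ten, which is why the step from 99.9% to 99.99% turns hours per year into minutes.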

ELB (Elastic Load Balancing)

  • A managed load balancing service that distributes incoming application traffic across multiple EC2 instances.
  • Distributes incoming application traffic across multiple targets, such as EC2 instances, containers and IP addresses.
  • Recognizes and responds to unhealthy instances.
  • Can be public or internal-facing.
  • Uses HTTP, HTTPS, TCP and SSL protocols.
  • Each load balancer is given a public DNS name
    • Internet-facing load balancers have DNS names which publicly resolve to the public IP addresses of the load balancer's nodes.
    • Internal load balancers have DNS names which publicly resolve to the private IP addresses of the load balancer's nodes.
  • ALB (Application Load Balancer)
    • Application layer (layer 7)
  • NLB (Network Load Balancer)
    • Connection layer (layer 4)
    • Distributes traffic within the same AZ.
  • CLB (Classic Load Balancer)
    • Previous generation
  • HA
  • Health checks
  • TLS termination: integrated certificate management and SSL decryption
  • Connection draining: when enabled, the load balancer stops sending new requests to back-end instances that are de-registering or have become unhealthy.

Route 53

  • Authoritative DNS service (=respond to queries on port 53)
  • 100% SLA
  • What kinds of routing does Route 53 support?
    • Simple routing: single server environments
    • Weighted round robin: assign weights to resource record sets to specify the frequency
    • Latency-based routing: helps to improve your global applications
    • Health check and DNS failover: fail over to a backup site if your primary site becomes unreachable
    • Geolocation routing: specify geographic locations by continent, by country, or by state in the USA
    • Geoproximity routing with traffic biasing: route traffic based on the physical distance between your users and your resources.
    • Multivalue answers: ability to return multiple health-checkable IP addresses in response to DNS queries as a way to use DNS to improve availability and load balancing.
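
For the weighted routing entry above, each record is returned with a probability equal to its weight divided by the sum of all weights. The sketch below only simulates that selection locally; the IP addresses are documentation-range placeholders and `pick_record` is a hypothetical helper, not a Route 53 API.

```python
import random
from collections import Counter


def pick_record(weighted_records, rng=random):
    """Pick one record with probability weight / sum(weights)."""
    values = [value for value, _ in weighted_records]
    weights = [weight for _, weight in weighted_records]
    return rng.choices(values, weights=weights, k=1)[0]


# Hypothetical record set: roughly 75% of answers should be the first IP.
records = [("192.0.2.10", 3), ("192.0.2.20", 1)]

rng = random.Random(0)  # seeded so the simulation is repeatable
answers = Counter(pick_record(records, rng) for _ in range(10_000))
for ip, count in answers.most_common():
    print(f"{ip}: {count / 10_000:.1%} of answers")
```

A record with weight 0 is never selected as long as another record has a positive weight, which is a common way to drain traffic from a resource.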

Module 5: Event-Driven Scaling (p295)

CloudWatch

  • Monitors your instances, and collects and processes raw data into readable, near real-time metrics.
  • Sends notifications and triggers Auto Scaling actions based on metrics you specify.
  • Turns metrics into statistics, to be used by CloudWatch alarms.
  • Can monitor EC2 instances, DynamoDB tables, RDS instances, as well as custom metrics generated by your applications and services and any log file your application generates.
  • Alarms and actions
    • Stop, terminate, reboot, or recover an EC2 instance.
    • Scale an ASG in or out
    • Send message to SNS (Simple Notification Service)
  • You can change the log retention setting so that any log events older than this setting are automatically deleted.

Auto Scaling

  • Launch configuration (=What)
    • Name
    • AMI
    • Instance type
    • User data
    • Security groups
    • IAM role
    • Etc.
  • Auto Scaling group (=Where)
    • Name
    • Launch configuration name
    • Min and Max
    • AZ or subnet
    • Load Balancer
    • Desired capacity
    • Etc.
  • Auto scaling policy and Scheduled actions (=When)
  • Example: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/320!/4/4@0.00:9.08
  • ELB connection draining: a period during which the ELB stops sending new requests to an instance that has been identified for termination, before de-registering it. Once the period has elapsed, the ELB forcefully closes all open connections and the targeted instance is terminated.
  • Auto scaling considerations
    • Avoid auto scaling thrashing
      • Avoid aggressive instance termination
      • Scale out early, scale in slowly
    • Use lifecycle hooks
      • Perform custom actions as Auto Scaling launches or terminates instances.

EC2 Auto Recovery

  • Replace impaired EC2 instances automatically
    • Conditions (detected by CloudWatch) that cause an instance to be impaired:
      • Loss of network connectivity
      • Loss of system power
      • Software/hardware issues on host
    • Replacement instances:
      • Maintain same instance ID/metadata, IP addresses
      • Cannot use in-memory data from impaired instance (the data is lost)
    • Currently supported instance types:
      • C3, C4, C5, M3, M4, M5, R3, R4, T2, and X1
      • Instances must be in a VPC and use shared tenancy
      • Instance storage cannot be used, you must use EBS-backed storage exclusively
      • Use instead of Auto Scaling when your job needs to maintain identical instance metadata or storage volume.

Scaling data stores

  • Scale your storage up with a few clicks or via the API
    • Easy conversion from standard to Provisioned IOPS storage.
  • Offload read traffic to read replicas.
  • For increased performance, put a cache in front of RDS, such as:
    • Amazon ElastiCache for Memcached or Redis
    • Your preferred cache solution, self-managed in EC2.
  • Scaling RDS writes with database sharding: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/337!/4/4@0.00:0.00
  • Horizontal scaling with read replicas: RDS
    • Horizontally scale for read-heavy workloads
    • Replication is asynchronous
    • Currently available for Aurora, MySQL, MariaDB and PostgreSQL (9.3.5+)
  • Scaling RDS: Push-button scaling
    • Scale nodes vertically up or down, often with no downtime
    • RDS for SQL Server does not currently support increasing storage or IOPS of an existing SQL Server DB instance.
  • Auto scaling for DynamoDB
    • Specify the desired target utilization and provide upper and lower bounds for read and write capacity
    • DynamoDB monitors throughput consumption using CloudWatch alarms
      • Then will adjust provisioned capacity up or down as needed

AWS Lambda and Event-Driven Scaling

  • Fully managed compute service that runs stateless code (Node.js, Java, Python, C# (.NET) Core, and Go) in response to an event or on time-based interval.
  • Run code without managing infrastructure like EC2 and Auto Scaling groups.
  • Scaling events can trigger AWS Lambda functions.

Module 6: Automating Your Infrastructure (p356)

CloudFormation

  • JSON or YAML template
  • Stack: a collection of resources created by CloudFormation
  • Cross stack references: share outputs from one stack with another stack (with ImportValue intrinsic function).
  • Useful for separating AWS infrastructure into logical components grouped by stack (network, app, ...): a way to loosely couple stacks together as an alternative to nested stacks.
  • CloudFormation Designer
    • Drag and drop resources onto a design area to automatically generate a CloudFormation template.
    • Open and edit existing CloudFormation templates.

How should resources be grouped together into templates?

Anatomy of a CloudFormation template

  • Description: describe the template
  • Metadata: provide additional details about the template
    • AWS::CloudFormation::Init: Configuration tasks for the cfn-init helper script.
    • AWS::CloudFormation::Interface: Grouping and ordering of input parameters when they are displayed in the CloudFormation console.
    • AWS::CloudFormation::Designer: Describes how your resources are laid out in AWS CloudFormation Designer.
  • Resources: Resources that will be included/created in the stack, such as an Amazon EC2 instance or an Amazon S3 bucket.
    • DependsOn is how you specify that CloudFormation should wait to launch a resource until a specific, different resource has already finished being created.
    • Wait condition: You pass a URL to applications or scripts that are running on your Amazon EC2 instances so that they can send signals to that URL.
    • Creation policy
      • Default count=1
      • Default timeout period=5min (PT5M -> ISO 8601)
      • When the timeout period expires (or a failure signal is received), the resource creation fails and AWS CloudFormation rolls the stack back.
  • Parameters: Values you can pass in to your template at runtime.
    • Can specify allowed and default values for each parameter
  • Mappings: Mappings allow you to customize a resource's properties based on certain conditions.
    • For example, because an AMI ID is unique to a region, and the person who received your template may not necessarily know which AMI to use, you can provide the look-up list using the Mappings parameter.
  • Conditions: The optional Conditions section includes statements that define when a resource is created or when a property is defined. For example, you can compare whether a value is equal to another value. Based on the result of that condition, you can conditionally create resources.
  • Outputs: can specify the string output of any logical identifier available in the template. It's a convenient way to capture important information about your resources or input parameters.
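
To see how the template sections described above fit together, the sketch below builds a minimal JSON template as a Python dictionary. The logical names (`WebServer`, `LogBucket`, `RegionToAmi`, `EnvType`, `InstanceTypeParam`) and the AMI IDs are placeholders invented for illustration; real AMI IDs differ per region and account.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Single EC2 instance (illustrative sketch)",
    # Parameters: values passed in when the stack is created.
    "Parameters": {
        "InstanceTypeParam": {
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro", "m5.large"],
        },
        "EnvType": {
            "Type": "String",
            "Default": "test",
            "AllowedValues": ["test", "prod"],
        },
    },
    # Mappings: look-up table, here from region to a placeholder AMI ID.
    "Mappings": {
        "RegionToAmi": {
            "eu-west-1": {"AmiId": "ami-00000000000000001"},
            "us-east-1": {"AmiId": "ami-00000000000000002"},
        }
    },
    # Conditions: decide whether some resources are created at all.
    "Conditions": {"IsProd": {"Fn::Equals": [{"Ref": "EnvType"}, "prod"]}},
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceTypeParam"},
                "ImageId": {
                    "Fn::FindInMap": ["RegionToAmi", {"Ref": "AWS::Region"}, "AmiId"]
                },
            },
        },
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            "Condition": "IsProd",     # only created when EnvType is "prod"
            "DependsOn": "WebServer",  # created after the instance
        },
    },
    # Outputs: values exposed once the stack exists.
    "Outputs": {"InstanceId": {"Value": {"Ref": "WebServer"}}},
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Generating the template from code is just one option; the same structure can be written by hand in JSON or YAML.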

Blue-Green Deployment on AWS Elastic Beanstalk

  • Automated infrastructure management and code deployment for your application.
    • Load balancing
    • Health monitoring
    • Auto scaling
    • Application platform management
    • Code deployment

OpsWorks

  • AWS OpsWorks is a configuration management service that helps you configure and operate applications in a cloud enterprise by using Puppet or Chef.

EC2 Run Command

  • Allows you to execute commands across multiple instances.
  • Offers visibility into the results (all of the commands are centrally logged to AWS CloudTrail for easy auditing).

Module 7: Decoupling Your Infrastructure (p404)

SQS (Amazon Simple Queue Service)

SQS Queue Types

  • A message queue service used by distributed applications to exchange messages through a polling model, and can be used to decouple sending and receiving components—without requiring each component to be concurrently available.
  • Standard Queue
    • At-Least-Once Delivery
    • Best-Effort Ordering
  • FIFO Queue
    • Exactly-Once Processing
      • Duplicates are not introduced
    • Limited Throughput
      • Up to 300 send, receive, delete per second

SQS Benefits

  • Scalable
    • Potentially millions of messages
  • Reliable
    • All messages are stored redundantly on multiple servers and in multiple data centers.
  • Simultaneous read/write
  • Secure
    • API credentials are needed.
  • Amazon SQS offers a free tier: new and existing customers receive one million queuing requests for free each month.
  • Message Sample: https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/423!/4/4@0.00:0.00
  • Dead Letter Queues
    • Receives messages after a maximum number of processing attempts has been reached.

SNS (Simple Notification Service)

  • Allows applications to send time-critical messages to multiple subscribers through a “push” mechanism, eliminating the need to periodically check or “poll” for updates.
  • Enables you to set up, operate, and send notifications to subscribing services or other applications.
    • Messages published to topic
    • Topic subscribers receive message
  • SNS Subscriber Types:
    • Email (plain or JSON)
    • HTTP/HTTPS
    • SMS
    • SQS queues
    • Mobile push messaging
    • Lambda Function
  • Characteristics of SNS:
    • Single published message
    • Order is not guaranteed or relevant
    • No recall
      • When a message is delivered successfully, there is no recall feature.
    • HTTP/HTTPS retry
    • 256 KB max per message
  • Use case: Fan-out
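
The fan-out use case above means one published message is pushed to every subscriber, typically several queues that each feed a different worker. A minimal in-memory sketch follows; the `Topic` class and the queue names are hypothetical and do not use the SNS API.

```python
class Topic:
    """In-memory stand-in for a topic with push delivery (not SNS)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        """Push a single published message to every subscriber."""
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)


# Two hypothetical queues, each consumed by a different worker.
thumbnail_queue = []
indexing_queue = []

image_uploaded = Topic()
image_uploaded.subscribe(thumbnail_queue.append)
image_uploaded.subscribe(indexing_queue.append)

delivered = image_uploaded.publish({"key": "album1/photo1.jpg"})
print(delivered, thumbnail_queue, indexing_queue)
```

The publisher does not need to know how many consumers exist, which is the decoupling benefit: adding a third worker only requires one more subscription.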

Amazon MQ - Fully Managed, Low Cost

  • Manages the administration of ActiveMQ
    • Automatically provisioned for HA
    • Provides direct access to the ActiveMQ console
    • Support a variety of messaging technologies
      • JMS, NMS, AMQP, STOMP, MQTT and WebSocket
    • No need to rewrite your existing messaging code.
    • Message payloads (up to 32 MB)
    • JMS local and distributed (XA) transaction support
    • 99.99999% message durability
    • Queues and topics
    • Durable and non-durable subscriptions
    • push-based and poll-based messaging

Loose Coupling and DynamoDB

API Gateway

  • Allows you to create APIs that act as "front doors" for your applications to access data, business logic, or functionality from your back-end services.
  • Can handle workloads running on:
    • EC2
    • Lambda
    • Any web application
  • Features of API Gateway:
    • Host and use multiple versions and stages of your APIs
    • Create and distribute API keys to developers
    • Leverage signature version 4 to authorize access to APIs
    • Throttle and monitor requests to protect your back end
    • Deeply integrated with AWS Lambda
  • Benefits of API Gateway:
    • Managed cache to store API responses
    • Reduced latency and DDoS protection through CloudFront
    • SDK generation for iOS, Android and JavaScript
    • OpenAPI Specification (Swagger) support
    • Request/response data transformation

Resource Sizing with Lambda

  • 46 different levels of resource allocation, which range from:
    • 128 MB of memory and the lowest CPU power to
    • 3 GB of memory and the highest CPU power
  • Compute price scales with resource level
  • Functions can run between 100ms and 5 minutes in length

How to use Lambda

  1. Upload your code to Lambda (in .zip form)
    • You can also write code directly into an editor in the console or import it from an S3 bucket.
  2. Scheduled function? Specify how often it will run. Event-driven function? Identify the event source.
  3. Specify its necessary compute resources (from 128 MB to 3 GB of memory)
  4. Specify its timeout period
  5. Specify the VPC whose resources it needs to access (if applicable)
  6. Launch the function
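For reference, the code uploaded in the first step can be as small as the sketch below. It assumes the Python runtime, where Lambda calls a handler function with the triggering event and a context object; the `name` field in the event and the response shape are made up for illustration.

```python
import json


def lambda_handler(event, context):
    """Entry point invoked by Lambda for each event.

    `event` is the payload from the event source; `context` carries runtime
    information and is unused in this sketch.
    """
    name = event.get("name", "world")  # hypothetical field in the event
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The handler can be exercised locally by calling `lambda_handler({"name": "AWS"}, None)` before it is zipped and uploaded.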

Module 8: Designing Web-Scale Storage (p467)

How should web-accessible content be stored?

  1. Store static assets in S3

How do I get the most out of S3?

  • Pay attention to your object naming scheme if
    • You want consistent performance from a bucket
    • You want a bucket capable of routinely exceeding 100 PUT/LIST/DELETE or 300 GET requests per second.
  • Naming Buckets and Keys (.S3.amazonaws.com)
    • Use 3 to 63 characters
    • Use only lowercase letters, numbers, periods (.), and hyphens (-)
    • Don't start or end the bucket name with a hyphen, and don't follow or precede a period with a hyphen.
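The naming rules above can be checked with a small helper. This is a sketch that covers only the rules listed in these notes, assuming the 3-character minimum from the S3 documentation; `is_valid_bucket_name` is an illustrative name, not an AWS API, and S3 enforces further rules that are not reproduced here.

```python
import re

# Only lowercase letters, digits, periods and hyphens are allowed.
_ALLOWED_CHARS = re.compile(r"[a-z0-9.-]+")


def is_valid_bucket_name(name: str, min_len: int = 3, max_len: int = 63) -> bool:
    """Check a bucket name against the rules listed in the notes."""
    if not min_len <= len(name) <= max_len:
        return False
    if not _ALLOWED_CHARS.fullmatch(name):
        return False
    # No hyphen at either end of the name.
    if name.startswith("-") or name.endswith("-"):
        return False
    # A period must not be followed or preceded by a hyphen.
    if ".-" in name or "-." in name:
        return False
    return True
```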

What if your bucket is constantly under load?

  • S3 automatically partitions your buckets according to the prefixes of your files.
    • If you store thousands of objects in an S3 bucket, adding a random value to the key can provide higher performance.
      • Anti-pattern
        • /album1/photo1.jpg
        • /album1/photo2.jpg
        • /album2/photo1.jpg
        • /album2/photo2.jpg
      • Best practice (add a hex hash prefix to the key)
        • /2e4f/album2/photo1.jpg
        • /3a79/album1/photo2.jpg
        • /7b54/album1/photo1.jpg
        • /8761/album2/photo2.jpg
    • To more easily retrieve your files in useful ways, maintain a secondary index and hash all key names (via DynamoDB table for example): https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/476!/4/4@0.00:0.00
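A sketch of the hashed-prefix scheme with a secondary index. It assumes the prefix is the first four hex characters of an MD5 digest of the original key (the notes do not say which hash is used), and it uses a plain dict as a stand-in for the DynamoDB table that would map original names to stored keys.

```python
import hashlib


def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hex hash so keys spread across prefixes."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"/{digest[:prefix_len]}/{key.lstrip('/')}"


def build_index(keys):
    """Secondary index: original key -> hashed key (stand-in for a DynamoDB table)."""
    return {key: hashed_key(key) for key in keys}


index = build_index(["/album1/photo1.jpg", "/album1/photo2.jpg", "/album2/photo1.jpg"])
stored_key = index["/album1/photo1.jpg"]  # look up by the human-friendly name
```

Because the prefix is derived from the key itself, it is deterministic: the stored key can always be recomputed, and the index is only needed for listing or browsing by the original layout.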
  2. Serve frequently accessed assets from CloudFront
  • CloudFront reduces both latency AND cost: for requests after the first one, you no longer pay for the file to be transferred out of Amazon S3.

  • CloudFront is a CDN

    • CloudFront delivers your content through a worldwide network of data centers called edge locations.
    • Amazon CloudFront has added a new type of edge location called Regional Edge Cache that further improves performance for your viewers.
    • Regional Edge Caches are turned on by default for your CloudFront distributions; you do not need to make any changes to your distributions to take advantage of this feature. There are also no additional charges to use this feature.
  • How do you enable CloudFront?

    • Use a separate CNAME for static content.
    • Point entire URL to CloudFront.
  • Expiration Period

    • Set expiration period by setting the cache control headers on your files in your origin (If-Modified-Since).
    • By default, if no cache control header is set, each edge location checks for an updated version of your file whenever it receives a request more than 24 hours after the previous time it checked the origin for changes to that file.
    • Change object name
      • Header-v1.jpg becomes Header-v2.jpg
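The two ideas above can be sketched in a few lines: the default 24-hour check against the origin, and renaming an object to force viewers onto a new version. Both helper names are illustrative, and the `-vN` suffix convention simply mirrors the `Header-v2.jpg` example.

```python
from datetime import datetime, timedelta

# Default from the notes: re-check the origin after more than 24 hours.
DEFAULT_TTL = timedelta(hours=24)


def needs_origin_check(last_checked: datetime, now: datetime,
                       ttl: timedelta = DEFAULT_TTL) -> bool:
    """True when an edge location should ask the origin for a newer file."""
    return now - last_checked > ttl


def versioned_name(name: str, version: int) -> str:
    """Insert a -vN suffix before the extension, e.g. Header.jpg -> Header-v2.jpg."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        return f"{name}-v{version}"
    return f"{stem}-v{version}.{ext}"
```

Changing the object name sidesteps the expiration period entirely, since the new name has never been cached at any edge location.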
  3. Store non-relational data in a NoSQL database such as DynamoDB
  • NoSQL Databases
    • On EC2
      • Cassandra, HBase, Redis, MongoDB, Couchbase, and Riak
    • Managed
      • Amazon DynamoDB, Amazon Neptune, ElastiCache with Redis, Amazon EMR supports HBase
  • Shifting Functionality to NoSQL
    • Use cases:
      • Leaderboards and scoring
      • Rapid ingest of clickstream or log data
      • Temporary data needs (cart data)
      • Hot tables
      • Metadata or lookup tables
      • Session data
  • Key Characteristics of DynamoDB
    • Low latency
      • SSD-based storage nodes
      • Latency = single-digit milliseconds
    • Massive and seamless scalability
      • No table size or throughput limits
      • Live repartitioning for changes to storage and throughput
    • Predictable performance
      • Provisioned throughput model
    • Durable and available
      • Consistent, disk-only writes
      • On-demand backup
    • Businesses can also use Amazon EMR to access data in multiple stores (DynamoDB, Amazon RDS, and Amazon S3), do complex analysis over this combined dataset, and store the results of this work in Amazon S3. Amazon EMR also supports direct interaction with DynamoDB using Hive.
  • DynamoDB Data Model (https://evantage.gilmoreglobal.com/#/books/100-ARCHIT-55-EN-SG-E/cfi/498!/4/4@0.00:0.00)
    • Data partitioned by the partition (hash) key (example: User Id)
    • Sort (range) keys (example: Order Id)
    • DynamoDB provides flexible querying by allowing queries on non-primary key attributes using Global Secondary Indexes and Local Secondary Indexes.
    • A primary key can be a single-attribute partition key or a composite partition-sort key. A composite partition-sort key is indexed as a partition key element and sort (also known as range) key element.
    • Item size cannot exceed 400 KB
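To make the composite key concrete, here is an in-memory sketch of a table addressed by a partition key and a sort key. It only models the addressing scheme; `CompositeKeyTable` is a hypothetical class, not the DynamoDB API, and it ignores indexes, item size limits, and throughput.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy model: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put_item(self, partition_key, sort_key, attributes):
        # The (partition key, sort key) pair uniquely identifies an item.
        self._partitions[partition_key][sort_key] = dict(attributes)

    def get_item(self, partition_key, sort_key):
        return self._partitions.get(partition_key, {}).get(sort_key)

    def query(self, partition_key):
        """All items in one partition, returned in sort-key order."""
        partition = self._partitions.get(partition_key, {})
        return [(sk, partition[sk]) for sk in sorted(partition)]


orders = CompositeKeyTable()
orders.put_item("user-1", "order-002", {"total": 30})
orders.put_item("user-1", "order-001", {"total": 12})
orders.put_item("user-2", "order-003", {"total": 99})
user1_orders = orders.query("user-1")
```

Here the user ID plays the role of the partition key and the order ID the sort key, as in the example above.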
  • DynamoDB Consistency
    • DynamoDB stores 3 geographically distributed replicas of each table.
    • You can specify, at the time of the read request, whether a read should be eventually or strongly consistent.
  • Global Tables
    • A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, identified as replica tables.
    • A replica table (or replica, for short) is a single DynamoDB table that functions as a part of a global table.
    • Each replica stores the same set of data items.
    • Any given global table can only have one replica table per region, and every replica has the same table name and the same primary key schema.
    • In a global table, a newly written item is usually propagated to all replica tables within seconds.
    • An application can read and write data to any replica table.
    • If your application only uses eventually consistent reads, and only issues reads against one AWS region, then it will work without any modification.
    • However, if your application requires strongly consistent reads, then it must perform all of its strongly consistent reads and writes in the same region.
    • DynamoDB does not support strongly consistent reads across AWS regions
    • Conflicts can arise if applications update the same item in different regions at about the same time. To ensure eventual consistency, DynamoDB global tables use a “last writer wins” reconciliation between concurrent updates, where DynamoDB makes a best effort to determine the last writer.
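A sketch of "last writer wins" reconciliation between concurrent updates to the same item. How DynamoDB actually determines the last writer is not described in these notes, so the comparison below (timestamp first, region name as an arbitrary tie-breaker) is purely an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Write:
    region: str       # region where the update was accepted
    timestamp: float  # assumed write time in seconds
    value: dict       # new item contents


def last_writer_wins(writes):
    """Pick the surviving write among concurrent updates to one item."""
    # Assumption: latest timestamp wins; ties are broken by region name
    # so that every replica reaches the same decision.
    return max(writes, key=lambda w: (w.timestamp, w.region))


eu = Write("eu-west-1", 100.0, {"status": "shipped"})
us = Write("us-east-1", 100.5, {"status": "cancelled"})
winner = last_writer_wins([eu, us])
```

The losing update is silently discarded, which is why applications that cannot tolerate this should route conflicting writes to a single region.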
  • DynamoDB Best Practices
    • Keep item size small.
    • Store metadata in DynamoDB and large BLOBs in S3.
    • Use one table per day, week, month, etc. to store time series data.
    • Use conditional or Optimistic Concurrency Control updates.
    • Avoid hot keys and hot partitions.
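Optimistic concurrency control can be sketched with a version number that every writer must present. The in-memory `VersionedStore` and the `ConditionalCheckFailed` exception are hypothetical stand-ins; with DynamoDB the same check is typically expressed as a condition on the write request.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the item changed since the caller last read it."""


class VersionedStore:
    """Toy key-value store with conditional writes on a version number."""

    def __init__(self):
        self._items = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value), or (0, None) if the key is absent."""
        return self._items.get(key, (0, None))

    def put_if_version(self, key, value, expected_version):
        """Write only if the stored version matches what the caller read."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise ConditionalCheckFailed(
                f"expected version {expected_version}, found {current_version}"
            )
        self._items[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
v1 = store.put_if_version("cart", {"items": 1}, expected_version=0)
v2 = store.put_if_version("cart", {"items": 2}, expected_version=v1)
```

A writer holding a stale version gets an error instead of overwriting someone else's change, and can re-read and retry.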
  4. Store relational data in RDS.
  • RDS Security

    • Control RDS DB instance access via security groups.
      • You can control Amazon RDS DB instance access via DB security groups, which are similar to Amazon EC2 security groups but not interchangeable.
      • Database security groups default to a “deny all” access mode, and customers must specifically authorize network ingress.
      • There are two ways to do this: authorize a network IP range, or authorize an existing Amazon EC2 security group. DB security groups only allow access to the database server port (all others are blocked).
    • Encrypted instances are currently available for all database engines supported in RDS.
      • This encryption is available for Amazon RDS with MySQL, PostgreSQL, Oracle, and SQL Server DB instances.
    • Use IAM to control which RDS operations each individual user has permission to call.
    • Encrypt connections between your application and your DB instance using SSL/TLS.
    • Transparent Data Encryption (TDE) is supported for SQL Server and Oracle.
    • You can receive notifications of important events that can occur on your RDS instance.
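The "deny all by default, then authorize a network IP range" behaviour of DB security groups can be modelled in a few lines. `DbSecurityGroup` is a hypothetical class for illustration, not an AWS API; authorizing an EC2 security group instead of a CIDR range is left out.

```python
import ipaddress


class DbSecurityGroup:
    """Toy model: deny everything unless a CIDR range is authorized."""

    def __init__(self, db_port: int):
        self.db_port = db_port
        self._allowed_networks = []  # empty list means "deny all"

    def authorize_cidr(self, cidr: str) -> None:
        self._allowed_networks.append(ipaddress.ip_network(cidr))

    def allows(self, source_ip: str, port: int) -> bool:
        # Only the database server port is ever reachable.
        if port != self.db_port:
            return False
        address = ipaddress.ip_address(source_ip)
        return any(address in network for network in self._allowed_networks)


group = DbSecurityGroup(db_port=3306)
before = group.allows("10.0.1.5", 3306)   # nothing authorized yet
group.authorize_cidr("10.0.0.0/16")
after = group.allows("10.0.1.5", 3306)    # now inside an authorized range
```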
  • Amazon Aurora Overview

    • Integrated with S3 for continuous backup.
      • 6 copies replicated across 3 AZs
    • Moved logging and storage layer into multi-tenant, scalable service layer.
    • SSD data plane, up to 64 TB max DB volume.
    • Resilient Design
      • 99.99% available
      • Instant crash recovery
      • Read replicas designed for instant promotion
        • Up to 15 replicas.
        • ~10ms replica lag (compared to seconds or minutes with MySQL)
  • Full Example: https://gist.github.com/pierrefevrier/56a28f06d6595f38f54387bd45a5622c/edit

Module 9: Is Your Infrastructure Well-Architected? (p537)

Pillars of the Well-Architected Framework

  • Operational Excellence
    • Deliver business value
  • Security
    • Protect and monitor systems
  • Reliability
    • Recover from failure and mitigate disruption
  • Performance Efficiency
    • Use resources sparingly.
  • Cost Optimization
    • Eliminate unneeded expense.