Example: scaling EC2 instances used for serving HTTP requests. Most applications don't have a perfectly constant load, so you must provision for peak load and pay for that capacity when it sits idle. The solution is to add an EC2 Auto Scaling group.
Must also make it highly available by spreading across availability zones.
Things I have to manage - application code and infrastructure. Infrastructure best practices are often the same across customers.
Requests -> Business Logic -> Backend
More abstract: Event -> Handler -> Backend
Here the Handler is the AWS Lambda function in the serverless ecosystem.
Building and Deploying Lambda Functions
- Write your code (zipped)
- Deploy to AWS Lambda
- Your code is run in response to events
Serverless means
- No Server Management
- Flexible Scaling
- Automated High Availability
- No Idle Capacity
Serverless amplifies cloud benefits
Physical servers in data centers -> Virtual servers in cloud -> Serverless
Cost savings with serverless
- Infrastructure
- Operations
- Reduced time to market
Monolith - responsible for code, databases
Serverless - assign various concerns to services and connect them together with Lambda functions
When you are designing your architecture, don't focus on the question "What's the data that I'm storing and what operations do I need to perform against it?" Instead, ask "What are the events that should trigger an action in my system?"
Parallelization decreases time to complete. pywren is an open source project that lets you run extremely high-throughput computing jobs using Lambda as the compute engine behind the scenes. You write a simple Python script that runs on your local machine and delegates work out to Lambda functions to massively scale out and parallelize the job.
Can create environments for developers and branches
Automate CI/CD for repeatable deployments
Use application frameworks like AWS Serverless Application Model (AWS SAM)
Benefits of Lambda
- Run code without provisioning or maintaining servers
- It initiates functions for you in response to events
- It scales automatically
- It provides built-in code monitoring and logging via Amazon CloudWatch
Features
- Bring your own code
- Integrates with and extends other AWS services
- Flexible resource and concurrency model
- Flexible permissions model
- Availability and fault tolerance are built in
- Pay for value
Event-driven architectures
- Initiate actions and communication between decoupled services
- Event - change in state, user request, or an update
- When an event occurs, the info is published for consumers
- Events are observable (ex: new message in a log file), rather than directed (ex: command to do something)
Producers - create the events; only aware of the event router, not the consumers
Router - ingests, filters, and pushes events to consumers (ex: via SNS)
Consumers - subscribe to receive notifications or monitor an event stream and act on events that pertain to them
Lambda Invocation Types
- Synchronous - Lambda runs and requester waits for response; when finished, returns a response; no built-in retries
- Asynchronous - Events are queued; requester doesn't wait. Can send records of async invocations to destinations (ex: SQS); retries twice.
- Polling - Lambda will watch services (like queues, SQS, Kinesis), retrieve matching events, and invoke functions; retries depend on event source
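The synchronous and asynchronous paths above can be sketched with boto3; the function name is a placeholder, and actually running the main block requires AWS credentials:

```python
import json

def build_invoke_args(function_name, payload, asynchronous=False):
    """Build kwargs for the Lambda Invoke API.

    InvocationType "RequestResponse" makes the caller wait for the
    result (synchronous, no built-in retries); "Event" queues the event
    and returns immediately (asynchronous, retried twice on error).
    """
    return {
        "FunctionName": function_name,
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(payload),
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python
    client = boto3.client("lambda")
    # Synchronous: blocks until the function returns a response.
    resp = client.invoke(**build_invoke_args("my-function", {"msg": "hi"}))
    print(resp["Payload"].read())
    # Asynchronous: Lambda queues the event; there is no response body to wait for.
    client.invoke(**build_invoke_args("my-function", {"msg": "hi"}, asynchronous=True))
```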
Execution environment lifecycle
- INIT phase
- Extension init
- Runtime init
- Function init
- INVOKE phase - invokes the handler; after completion, prepares to handle another invocation
- SHUTDOWN phase
- Runtime shutdown
- Extension shutdown
Cold & warm starts
Cold - start environment and download code; initialize runtime; initialize packages and dependencies
Warm - invoke code (if Lambda was recently invoked)
Additional latency in cold start. Shouldn't usually be an issue, but in some cases it might be. Can use provisioned concurrency to prepare concurrent execution environments before invocations.
Best practice: Write functions to take advantage of warm starts
- Store and reference dependencies locally
- Limit re-initialization of variables
- Add code to check for and reuse existing connections
- Use tmp space as transient cache
- Check that background processes have completed
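A sketch of how these warm-start practices look in code; the "connection" here is a stand-in for any expensive client (e.g. a database connection):

```python
import os
import time

# Module scope runs once per execution environment (the cold start).
# Warm invocations reuse everything stored here.
_CONNECTIONS = {}

def get_connection():
    """Check for and reuse an existing connection instead of re-creating it."""
    if "db" not in _CONNECTIONS:
        # Stand-in for an expensive setup step (e.g. opening a DB connection).
        _CONNECTIONS["db"] = {"created_at": time.time()}
    return _CONNECTIONS["db"]

def handler(event, context):
    conn = get_connection()  # reused on warm starts
    # /tmp persists between warm invocations and works as a transient cache.
    cache_path = os.path.join("/tmp", "lookup-cache.json")
    return {"connection_created_at": conn["created_at"]}
```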
IAM resource policy - grants permission to invoke the function
IAM execution role - controls what the function can do; the role must include a trust policy that allows Lambda to AssumeRole so it can act on the function's behalf with other services
Start with the handler method. Lambda runs until handler exits or returns a response. Handler takes an event object and a context object.
Event - required; differs based on the event source that created it
Context - optional; allows code to interact with the execution environment (ex: logging)
Design best practices
- Keep business logic separate from handler method
- Make functions modular
- Each function should be stateless
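A minimal sketch of these practices (the names and pricing logic are illustrative): the business logic is a pure, stateless function, and the handler only unpacks the event and shapes the response.

```python
def apply_discount(price, percent):
    """Pure, stateless business logic - unit-testable without any Lambda machinery."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)

def handler(event, context):
    """Thin handler: unpack the event, delegate, shape the response."""
    discounted = apply_discount(event["price"], event["percent"])
    return {"statusCode": 200, "price": discounted}
```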
Best practices for writing code
- Include logging statements
- Use return codes
- Provide environment variables
- Add secret and reference data
- Avoid recursive code
- Gather metrics with Amazon CloudWatch
- Reuse execution context
Building Lambda functions
- Lambda console editor
- Deployment packages
- Automate using tools
AWS SAM is an extension of AWS CloudFormation
Memory
- Can allocate up to 10 GB
- CPU and other resources scale linearly with memory
- Use AWS Lambda Power Tuning tool
Timeout
- Max timeout is 900 seconds (15 minutes)
Billing costs
- Duration is rounded up to the nearest 1 ms
- Price depends on memory allocation (not memory used)
Concurrency - number of AWS Lambda function invocations running at a single time
Concurrency types
- Unreserved concurrency - not allocated to any specific set of functions (min 100)
- Reserved concurrency - max number of concurrent instances (no other functions can use that concurrency)
- Provisioned concurrency - initializes a requested number of runtime environments; used when needing high performance and low latency
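Reserved and provisioned concurrency can be set through the Lambda API; a hedged sketch with boto3 (the function name and alias are placeholders, and the main block needs AWS credentials to run):

```python
def reserved_concurrency_args(function_name, limit):
    """Cap the function at `limit` concurrent executions; that slice of
    the account's concurrency pool becomes unavailable to other functions."""
    return {"FunctionName": function_name, "ReservedConcurrentExecutions": limit}

if __name__ == "__main__":
    import boto3
    client = boto3.client("lambda")
    client.put_function_concurrency(**reserved_concurrency_args("my-function", 50))
    # Provisioned concurrency keeps initialized execution environments
    # warm for a published version or alias (not $LATEST).
    client.put_provisioned_concurrency_config(
        FunctionName="my-function",
        Qualifier="prod",  # alias or version number
        ProvisionedConcurrentExecutions=10,
    )
```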
Regional quota - 1,000 concurrent executions across the Region (default; can be raised)
Burst - when there is a sudden need to increase instances. Burst quota varies by region.
Reduce risk using versions and aliases
A version can be referenced at the end of the ARN. Use $LATEST for the latest version. Publishing makes an immutable snapshot of $LATEST and creates a new version number (if versioning is enabled).
You can also use an alias to point to a function version.
You can point an alias to two function versions. This is useful when deploying a new version and having a small amount of traffic go to the new version before completing the deployment.
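With boto3, this weighted-alias setup looks roughly like the following (the function name, alias, and version numbers are placeholders):

```python
def weighted_alias_config(stable_version, canary_version, canary_weight):
    """Route canary_weight (0.0-1.0) of traffic to canary_version and
    the remainder to stable_version."""
    return {
        "FunctionVersion": stable_version,
        "RoutingConfig": {
            "AdditionalVersionWeights": {canary_version: canary_weight}
        },
    }

if __name__ == "__main__":
    import boto3
    client = boto3.client("lambda")
    # Point the "live" alias mostly at version 5, with 10% going to version 6.
    client.update_alias(
        FunctionName="my-function",
        Name="live",
        **weighted_alias_config("5", "6", 0.10),
    )
```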
CodeDeploy supports canary, linear, and all-at-once deployment patterns. It also supports alarms which, when triggered, roll back the deployment. Hooks are supported too, which can run checks before and after traffic shifting.
Monitoring is automatic and metrics sent to CloudWatch.
Can use X-Ray - Lambda sends trace data to it.
Event-driven architectures
- State and code are decoupled
- Integration via messaging
- Asynchronous connections
Use SQS to set up a message queue in front of Lambda. Lambda polls the queue and processes messages in batches. If processing fails, the messages become visible again on the queue. Lambda functions need to handle partial failures so that the entire batch doesn't become visible again - they should delete messages from the queue on success.
Key considerations
- You configure the queue but Lambda manages the polling processes
- Standard queues have much higher throughput than FIFO but order is not guaranteed
- You need to write idempotent functions to handle the potential for duplicate messages
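Partial batch failure can be handled by reporting only the failed message IDs back to Lambda. A sketch (the `process` logic is a placeholder; this response shape requires ReportBatchItemFailures to be enabled on the event source mapping):

```python
import json

def process(body):
    # Placeholder business logic; raises on bad input.
    record = json.loads(body)
    if "id" not in record:
        raise ValueError("missing id")
    return record["id"]

def handler(event, context):
    """Report only the failed messages so the successful ones are deleted.

    Without this, any raised error makes the whole batch visible again
    on the queue, re-delivering messages that already succeeded.
    """
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```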
Could just chain Lambda functions, but that creates problems:
- Error and retry processing in each function
- Each function must be aware of all the steps in the chain
- Must include rollback logic across functions
Orchestrate with Step Functions
- Keeps orchestration out of your code
- Automatically triggers and tracks each step
- Logs the state of each step
Tasks: Perform work in Step Functions using activities or Lambda functions, or by passing parameters to API actions of other services
Activities: Applications that you write and host on AWS, on premises, or on mobile devices
Activity workers: Execute application code and report success or failure
Messaging patterns:
- Sequential tasks
- Conditional choice
- Looping tasks
- Try/catch/finally
- Parallel
Key considerations
- Orchestrate different types of backend processes
- Use wait states while waiting for resources
- Use callback tasks
- Also includes an increasing number of direct service integrations
Polling: the client polls a status endpoint to get the status. Once the job is complete, it can call a getResults endpoint for the results. This adds latency and wastes requests.
Webhook: User-defined HTTP callback
- Trusted - you own both sides and create a secure integration
- Untrusted - webhook established through a registration process
Client configures the webhook and gets a request ID from API Gateway
Backing service continues asynchronously
Backing service sends updated status via the webhook
More complex than polling
- Client must host endpoint
- Client needs to permit external requests
- Need explicit agreement on retry policies
- With an untrusted client, the Lambda backend is responsible for retries
WebSocket APIs are an open standard used to create a persistent connection between the client and the backing service, permitting bidirectional communication.
AppSync - fully managed GraphQL service
Clients can subscribe and automatically get status updates. Ideal for streaming or when more than a single response is needed.
Directly integrate with Step Functions
Kinesis Data Streams, Firehose, and Analytics
Instead of streaming, can use messaging
SNS message filtering; compares message attributes and subscriber gets only filtered messages. So you don't need as many topics.
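A sketch of attaching a filter policy to an SNS subscription with boto3 (the topic and endpoint ARNs are placeholders, and the `event_type` attribute name is an assumption):

```python
import json

def filter_policy(allowed_types):
    """Subscriber only receives messages whose 'event_type' message
    attribute matches one of allowed_types; everything else is dropped
    by SNS before delivery."""
    return json.dumps({"event_type": allowed_types})

if __name__ == "__main__":
    import boto3
    sns = boto3.client("sns")
    sns.subscribe(
        TopicArn="arn:aws:sns:us-east-1:123456789012:orders",          # placeholder
        Protocol="sqs",
        Endpoint="arn:aws:sqs:us-east-1:123456789012:order-refunds",   # placeholder
        Attributes={"FilterPolicy": filter_policy(["refund", "chargeback"])},
    )
```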
Serverless Application Repository - event fork pipelines
Amazon EventBridge - a serverless event bus
Messaging
- Core entity is an individual message and message rates vary
- Messages are deleted once they've been consumed
- Configure retries and dead-letter queues for failures
Streaming
- You look at the stream of messages together and the stream is generally continuous
- Data remains on the stream for a period of time; consumers must maintain a pointer
- Message is retried until it succeeds or expires; you must build error handling into your function to bypass a record
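For a stream source, that "bypass a record" error handling lives inside the function itself; a sketch for a Kinesis batch, with real processing replaced by a JSON decode:

```python
import base64
import json

def handler(event, context):
    """Log and skip bad records instead of raising.

    For streams, an unhandled error makes Lambda retry the same batch,
    blocking the shard until the batch succeeds or the data expires.
    """
    processed, skipped = 0, 0
    for record in event["Records"]:
        try:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            # ... real processing of `payload` would go here ...
            processed += 1
        except (ValueError, KeyError) as err:
            print(f"skipping record {record['kinesis']['sequenceNumber']}: {err}")
            skipped += 1
    return {"processed": processed, "skipped": skipped}
```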
Failure management in your functions
- CloudWatch logs and alarms
- Retry and backoff mechanisms using AWS SDK
Function errors vs invocation errors
Synchronous vs asynchronous errors
On-failure destinations (SNS, SQS) vs dead-letter queues - destinations:
- Include additional data about the failure
- Are more flexible
Best practice - set visibility timeout on SQS queue to 6x the Lambda function timeout
Can use Step Functions' try/catch/finally, retry, and looping fields for error handling
SAGA pattern
X-Ray