Example: scaling EC2 instances used for serving HTTP requests. Most applications don't have a perfectly constant load, so you must provision for peak load and pay for that capacity when it sits idle. The solution is to add an EC2 Auto Scaling group.
Must also make it highly available by spreading across availability zones.
Things I have to manage - application code and infrastructure. Infrastructure best practices are often the same across customers.
Requests -> Business Logic -> Backend
More abstract: Event -> Handler -> Backend
Here the Handler is the AWS Lambda function in the serverless ecosystem.
Building and Deploying Lambda Functions
- Write your code (zipped)
- Deploy to AWS Lambda
- Your code is run in response to events
Serverless means
- No Server Management
- Flexible Scaling
- Automated High Availability
- No Idle Capacity
Serverless amplifies cloud benefits
Physical servers in data centers -> Virtual servers in cloud -> Serverless
Cost savings with serverless
- Infrastructure
- Operations
- Reduced time to market
Monolith - responsible for code, databases
Serverless - assign various concerns to services and connect them together with Lambda functions
When you are designing your architecture, don't focus on the question "What's the data that I'm storing and what operations do I need to perform against it?" Instead, ask "What are the events that should trigger an action in my system?"
Parallelization decreases time to complete. pywren is an open source project that lets you run extremely high-throughput computing jobs using Lambda as the compute engine behind the scenes. You write a simple Python script that runs on your local machine and delegates work out to Lambda functions to massively scale out and parallelize the job.
Can create environments for developers and branches
Automate CI/CD for repeatable deployments
Use application frameworks like AWS Serverless Application Model (AWS SAM)
Benefits of Lambda
- Run code without provisioning or maintaining servers
- It initiates functions for you in response to events
- It scales automatically
- It provides built-in code monitoring and logging via Amazon CloudWatch
Features
- Bring your own code
- Integrates with and extends other AWS services
- Flexible resource and concurrency model
- Flexible permissions model
- Availability and fault tolerance are built in
- Pay for value
Event-driven architectures
- Initiate actions and communication between decoupled services
- Event - change in state, user request, or an update
- When an event occurs, the info is published for consumers
- Events are observable (ex: new message in a log file), rather than directed (ex: command to do something)
Producers - create the events; only aware of the event router, not the consumers
Router - ingests, filters, and pushes events to consumers (ex: via SNS)
Consumers - subscribe to receive notifications or monitor an event stream and act on events that pertain to them
Lambda Invocation Types
- Synchronous - Lambda runs and requester waits for response; when finished, returns a response; no built-in retries
- Asynchronous - Events are queued; requester doesn't wait. Can send records of async invocations to destinations (ex: SQS); retries twice.
- Polling - Lambda will watch services (like queues, SQS, Kinesis), retrieve matching events, and invoke functions; retries depend on event source
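The synchronous and asynchronous paths above can be sketched with boto3; the function name is a placeholder, and actually running the main block requires AWS credentials:

```python
import json

def build_invoke_args(function_name, payload, asynchronous=False):
    """Build kwargs for the Lambda Invoke API.

    InvocationType "RequestResponse" makes the caller wait for the
    result (synchronous, no built-in retries); "Event" queues the event
    and returns immediately (asynchronous, retried twice on error).
    """
    return {
        "FunctionName": function_name,
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(payload),
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python
    client = boto3.client("lambda")
    # Synchronous: blocks until the function returns a response.
    resp = client.invoke(**build_invoke_args("my-function", {"msg": "hi"}))
    print(resp["Payload"].read())
    # Asynchronous: Lambda queues the event; there is no response body to wait for.
    client.invoke(**build_invoke_args("my-function", {"msg": "hi"}, asynchronous=True))
```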
Execution environment lifecycle
- INIT phase
- Extension init
- Runtime init
- Function init
- INVOKE phase - invokes the handler; after completion, prepares to handle another invocation
- SHUTDOWN phase
- Runtime shutdown
- Extension shutdown
Cold & warm starts
Cold - start environment and download code; initialize runtime; initialize packages and dependencies
Warm - invoke code (if Lambda was recently invoked)
Additional latency in cold start. Shouldn't usually be an issue, but in some cases it might be. Can use provisioned concurrency to prepare concurrent execution environments before invocations.
Best practice: Write functions to take advantage of warm starts
- Store and reference dependencies locally
- Limit re-initialization of variables
- Add code to check for and reuse existing connections
- Use tmp space as transient cache
- Check that background processes have completed
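A sketch of how these warm-start practices look in code; the "connection" here is a stand-in for any expensive client (e.g. a database connection):

```python
import os
import time

# Module scope runs once per execution environment (the cold start).
# Warm invocations reuse everything stored here.
_CONNECTIONS = {}

def get_connection():
    """Check for and reuse an existing connection instead of re-creating it."""
    if "db" not in _CONNECTIONS:
        # Stand-in for an expensive setup step (e.g. opening a DB connection).
        _CONNECTIONS["db"] = {"created_at": time.time()}
    return _CONNECTIONS["db"]

def handler(event, context):
    conn = get_connection()  # reused on warm starts
    # /tmp persists between warm invocations and works as a transient cache.
    cache_path = os.path.join("/tmp", "lookup-cache.json")
    return {"connection_created_at": conn["created_at"]}
```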
IAM resource policy - grants permission to invoke the function
IAM execution role - controls what the function can do; the role must include a trust policy that allows Lambda to AssumeRole so it can act on the function's behalf with other services
Start with the handler method. Lambda runs until handler exits or returns a response. Handler takes an event object and a context object.
Event - required; differs based on the event source that created it
Context - optional; allows code to interact with the execution environment (ex: logging)
Design best practices
- Keep business logic separate from handler method
- Make functions modular
- Each function should be stateless
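A minimal sketch of these practices (the names and pricing logic are illustrative): the business logic is a pure, stateless function, and the handler only unpacks the event and shapes the response.

```python
def apply_discount(price, percent):
    """Pure, stateless business logic - unit-testable without any Lambda machinery."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)

def handler(event, context):
    """Thin handler: unpack the event, delegate, shape the response."""
    discounted = apply_discount(event["price"], event["percent"])
    return {"statusCode": 200, "price": discounted}
```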
Best practices for writing code
- Include logging statements
- Use return codes
- Provide environment variables
- Add secret and reference data
- Avoid recursive code
- Gather metrics with Amazon CloudWatch
- Reuse execution context
Building Lambda functions
- Lambda console editor
- Deployment packages
- Automate using tools
AWS SAM is an extension of AWS CloudFormation
Memory
- Can allocate up to 10 GB
- CPU and other resources scale linearly with memory
- Use AWS Lambda Power Tuning tool
Timeout
- Max timeout is 900 seconds (15 minutes)
Billing costs
- Duration is rounded up to the nearest 1 ms
- Price depends on memory allocation (not memory used)
Concurrency - number of AWS Lambda function invocations running at a single time
Concurrency types
- Unreserved concurrency - not allocated to any specific set of functions (min 100)
- Reserved concurrency - max number of concurrent instances (no other functions can use that concurrency)
- Provisioned concurrency - initializes a requested number of runtime environments; used when needing high performance and low latency
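Reserved and provisioned concurrency can be set through the Lambda API; a hedged sketch with boto3 (the function name and alias are placeholders, and the main block needs AWS credentials to run):

```python
def reserved_concurrency_args(function_name, limit):
    """Cap the function at `limit` concurrent executions; that slice of
    the account's concurrency pool becomes unavailable to other functions."""
    return {"FunctionName": function_name, "ReservedConcurrentExecutions": limit}

if __name__ == "__main__":
    import boto3
    client = boto3.client("lambda")
    client.put_function_concurrency(**reserved_concurrency_args("my-function", 50))
    # Provisioned concurrency keeps initialized execution environments
    # warm for a published version or alias (not $LATEST).
    client.put_provisioned_concurrency_config(
        FunctionName="my-function",
        Qualifier="prod",  # alias or version number
        ProvisionedConcurrentExecutions=10,
    )
```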
Regional quota - 1,000 concurrent executions across the Region (default; can be raised)
Burst - when there is a sudden need to increase instances. Burst quota varies by region.
Reduce risk using versions and aliases
A version can be referenced at the end of the ARN. Use $LATEST for the latest version. Publishing makes an immutable snapshot of $LATEST and creates a new version number (if versioning is enabled).
You can also use an alias to point to a function version.
You can point an alias to two function versions. This is useful when deploying a new version and having a small amount of traffic go to the new version before completing the deployment.
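With boto3, this weighted-alias setup looks roughly like the following (the function name, alias, and version numbers are placeholders):

```python
def weighted_alias_config(stable_version, canary_version, canary_weight):
    """Route canary_weight (0.0-1.0) of traffic to canary_version and
    the remainder to stable_version."""
    return {
        "FunctionVersion": stable_version,
        "RoutingConfig": {
            "AdditionalVersionWeights": {canary_version: canary_weight}
        },
    }

if __name__ == "__main__":
    import boto3
    client = boto3.client("lambda")
    # Point the "live" alias mostly at version 5, with 10% going to version 6.
    client.update_alias(
        FunctionName="my-function",
        Name="live",
        **weighted_alias_config("5", "6", 0.10),
    )
```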
CodeDeploy supports canary, linear, and all-at-once deployment patterns. It also supports alarms which, when triggered, roll back the deployment. Hooks are supported too, which can run checks before and after traffic shifting.
Monitoring is automatic and metrics sent to CloudWatch.
Can use X-Ray - Lambda sends trace data to it.
Event-driven architectures
- State and code are decoupled
- Integration via messaging
- Asynchronous connections
Use SQS to set up a message queue in front of Lambda. Lambda polls the queue and processes messages in batches. If processing fails, the messages become visible again on the queue. Lambda functions need to handle partial failures so that the entire batch doesn't become visible again - they should delete messages from the queue on success.
Key considerations
- You configure the queue but Lambda manages the polling processes
- Standard queues have much higher throughput than FIFO but order is not guaranteed
- You need to write idempotent functions to handle the potential for duplicate messages
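Partial batch failure can be handled by reporting only the failed message IDs back to Lambda. A sketch (the `process` logic is a placeholder; this response shape requires ReportBatchItemFailures to be enabled on the event source mapping):

```python
import json

def process(body):
    # Placeholder business logic; raises on bad input.
    record = json.loads(body)
    if "id" not in record:
        raise ValueError("missing id")
    return record["id"]

def handler(event, context):
    """Report only the failed messages so the successful ones are deleted.

    Without this, any raised error makes the whole batch visible again
    on the queue, re-delivering messages that already succeeded.
    """
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```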
Could just chain Lambda functions, but that creates problems:
- Error and retry processing in each function
- Each function must be aware of all the steps in the chain
- Must include rollback logic across functions
Orchestrate with Step Functions
- Keeps orchestration out of your code
- Automatically triggers and tracks each step
- Logs the state of each step
Tasks: Perform work in Step Functions using activities or Lambda functions, or by passing parameters to API actions of other services
Activities: Applications that you write and host on AWS, on premises, or on mobile devices
Activity workers: Execute application code and report success or failure
Messaging patterns:
- Sequential tasks
- Conditional choice
- Looping tasks
- Try/catch/finally
- Parallel
Key considerations
- Orchestrate different types of backend processes
- Use wait states while waiting for resources
- Use callback tasks
- Also includes an increasing number of direct service integrations
Polling: the client polls a status endpoint to get the status. Once the job is complete, it can call a getResults endpoint for the results. This adds latency and wastes requests.
Webhook: User-defined HTTP callback
- Trusted - you own both sides and create a secure integration
- Untrusted - webhook established through a registration process
Client configures the webhook and gets a request ID from API Gateway
Backing service continues asynchronously
Backing service sends updated status via the webhook
More complex than polling
- Client must host endpoint
- Client needs to permit external requests
- Need explicit agreement on retry policies
- With an untrusted client, the Lambda backend is responsible for retries
WebSocket APIs are an open standard used to create a persistent connection between the client and the backing service, permitting bidirectional communication.
AppSync - fully managed GraphQL service
Clients can subscribe and automatically get status updates. Ideal for streaming or when more than a single response is needed.
Directly integrate with Step Functions
Kinesis Data Streams, Firehose, and Analytics
Instead of streaming, can use messaging
SNS message filtering; compares message attributes and subscriber gets only filtered messages. So you don't need as many topics.
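A sketch of attaching a filter policy to an SNS subscription with boto3 (the topic and endpoint ARNs are placeholders, and the `event_type` attribute name is an assumption):

```python
import json

def filter_policy(allowed_types):
    """Subscriber only receives messages whose 'event_type' message
    attribute matches one of allowed_types; everything else is dropped
    by SNS before delivery."""
    return json.dumps({"event_type": allowed_types})

if __name__ == "__main__":
    import boto3
    sns = boto3.client("sns")
    sns.subscribe(
        TopicArn="arn:aws:sns:us-east-1:123456789012:orders",          # placeholder
        Protocol="sqs",
        Endpoint="arn:aws:sqs:us-east-1:123456789012:order-refunds",   # placeholder
        Attributes={"FilterPolicy": filter_policy(["refund", "chargeback"])},
    )
```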
Serverless Application Repository - event fork pipelines
Amazon EventBridge - a serverless event bus
Messaging
- Core entity is an individual message and message rates vary
- Messages are deleted once they've been consumed
- Configure retries and dead-letter queues for failures
Streaming
- You look at the stream of messages together and the stream is generally continuous
- Data remains on the stream for a period of time; consumers must maintain a pointer
- Message is retried until it succeeds or expires; you must build error handling into your function to bypass a record
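For a stream source, that "bypass a record" error handling lives inside the function itself; a sketch for a Kinesis batch, with real processing replaced by a JSON decode:

```python
import base64
import json

def handler(event, context):
    """Log and skip bad records instead of raising.

    For streams, an unhandled error makes Lambda retry the same batch,
    blocking the shard until the batch succeeds or the data expires.
    """
    processed, skipped = 0, 0
    for record in event["Records"]:
        try:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            # ... real processing of `payload` would go here ...
            processed += 1
        except (ValueError, KeyError) as err:
            print(f"skipping record {record['kinesis']['sequenceNumber']}: {err}")
            skipped += 1
    return {"processed": processed, "skipped": skipped}
```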
Failure management in your functions
- CloudWatch logs and alarms
- Retry and backoff mechanisms using AWS SDK
Function errors vs invocation errors
Synchronous vs asynchronous errors
On-failure destinations (SNS, SQS) vs dead-letter queues - destinations:
- Include additional data about the failure
- Are more flexible
Best practice - set visibility timeout on SQS queue to 6x the Lambda function timeout
Can use Step Functions' try/catch/finally, retry, and looping fields for error handling
SAGA pattern
X-Ray