@naftulikay naftulikay/lambda-fucking-lambda.md Secret
Last active Mar 12, 2019


Lambda, Fucking Lambda

God damn it, let's get this right.

Features

Here are the features we need to support:

  • API Gateway HTTP Events: we shouldn't be doing any manual routing using regular expressions or other bullshit. Probably the easiest thing to do is to run an actual Rust web server and use reqwest to query it, but this has the overhead of a loopback round-trip; maybe let's use Unix sockets?
  • API Gateway Auth Events: straightforward.
  • CloudWatch Events: cronlike jobs. We should be able to execute n of these in parallel using some kind of macro compile-time shit. Each of these should be an async function so we can execute them all in parallel.
  • SNS Events: notification of various types: S3, SES, and other events.
  • SES Events: email filtering.

Strategy

Let's lay out strategy for various use-cases.

API Gateway HTTP Events

It might be possible for us to use Tide to route things internally without hitting the network. The only overhead would be the initialization step of binding, which can be done in a Once block. Tide seems to be the only framework that currently offers this.

Bind Tide

First, we set up and bind Tide:

let mut app = tide::App::new();

app.at("/hello/world").get(get_hello_world);
app.at("/hello/world").post(post_hello_world);
app.at("/hello/world").delete(delete_hello_world);

app.serve()

app.serve() never returns; it blocks forever.

(UNANSWERED) If we don't call serve, can we submit requests anyway?

Setup Lambda Handler

Next, we define our Lambda using the lambda! macro.

fn main() {
    lambda!(entrypoint);
}

fn entrypoint(_data: Vec<u8>, _context: Context) -> Result<Vec<u8>, HandlerError> {
    // serve the request and return the response payload
    unimplemented!()
}

The lambda! macro calls a chain of shit:

  1. lambda! itself, which calls
  2. start, which calls
  3. start_with_config, which calls
  4. start_with_runtime_client, which blocks forever.

A couple of threads can be used to run these two eternally-blocking methods side by side; we can then join on both of them to wait for everything to terminate.

Send Requests to Tide

Now, we can send these requests into Tide:

fn handler(data: Vec<u8>, _context: Context) -> Result<Vec<u8>, HandlerError> {
    let service = app.into_http_service();
    let future = service.respond(&mut (), Request::from(data));
    let result = await!(future);

    // do something with it
}

We can finally use HttpService::respond to send a request without going over the network. We only need to convert our API Gateway payload into an http::Request, and the resulting http::Response back into an API Gateway response.

(UNANSWERED) Can we just fuck around with futures arbitrarily in any context since Tide is running? Will Lambda get the same Tokio reactor and be happy?

I think that the answer to this is:

  1. We spin up the reactor ourselves.
  2. We pass that reactor to lambda!.
  3. Tide is an abstraction layer on Hyper.
  4. When Hyper is set up, it takes an executor; the default is tokio::spawn.
  5. Hyper calls tokio::spawn whenever it needs to handle a connection or something.
  6. Tokio tracks the default executor using a thread-local storage variable.

And with this, we have a fully functional asynchronous web server without network overhead and all the bells and whistles that this entails.

API Gateway Auth Events

Auth logic is going to be either very simple (authentication only) or very complex (authorization too). We might split the difference on this:

  • APIs can either be private or public, and any unauthenticated clients cannot access private APIs.
  • More complicated authorization is delegated to the functions themselves.
  • We can use Tide middleware on the actual functions to prevent unauthorized access.

CloudWatch Events

No answer for this. We have CloudWatch cron jobs that kick off hourly, daily, weekly, and monthly. It would be nice to get some kind of compile-time parsing of cron time definitions and then use this to "schedule" jobs. If something needs to run every three hours, we trigger it hourly and only execute when the hour is divisible by three.

Since there may be multiple cron jobs kicking off at the same time, each should be executed asynchronously with a join on all of their handles.

SNS Events

There can be many in one payload. We have no routing logic for these, so we might have to roll our own. Since there can be many, we should asynchronously execute all of them and join on all their handles as above.

SES Events

No good routing logic exists for this either, but the same pattern as above applies.
