@Rich-Harris
Last active November 3, 2022 09:02
Next-gen Node HTTP APIs

I saw this poll on Twitter earlier and was surprised at the result, which at the time of writing overwhelmingly favours option 1:

[Screenshot of the Twitter poll results, taken 2020-09-10]

I've always been of the opinion that the (req, res, next) => {} API is the worst of all possible worlds, so one of two things is happening:

  • I'm an idiot with bad opinions (very possibly!)
  • People like familiarity

Why I dislike option 1

It's bad for composition in two ways. Firstly, for all but the most trivial handlers, you have to awkwardly pass the res object around:

app.use((req, res, next) => {
  if (some_condition_is_met(req)) {
    return render_view(req, res);
  }
  
  next();
})

Secondly, combining middleware often involves monkey-patching the res object:

function log(req, res, next) {
  const { writeHead } = res;
  const start = Date.now();
  
  let details;
  
  res.writeHead = (status, message, headers) => {
    if (!headers && typeof message !== 'string') {
      headers = message;
      message = '';
    }
    
    details = { status, headers };
    
    writeHead.call(res, status, message, headers);
  };
  
  res.on('finish', () => {
    console.log(`${req.method} ${req.url} (${Date.now() - start}ms): ${details.status} ${JSON.stringify(details.headers)}`);
  });
  
  next();
}

app.use(log).use((req, res, next) => {...});

Monkey-patching objects belonging to the standard library is no more advisable here than it was in MooTools, but it's endemic in the ecosystem around Node servers.

In addition, because the built-in res object makes it difficult to do something as straightforward as responding with some JSON, and because send(res, data) is awkward, Express apps use a subclass of http.ServerResponse that adds a res.send(data) method, among other things. I'm really not a fan of this pattern. Libraries shouldn't assume that these extra methods exist (though some seemingly do), which makes it harder to combine logic from different places.

An alternative

Passing the res object around is reminiscent of a pattern that used to be extremely prevalent in Node apps:

function do_something(foo, cb) {
  if (!is_valid(foo)) {
    return cb(new Error('Invalid foo!'));
  }
  
  do_something_with_validated_foo(foo, cb);
}

do_something({...}, (err, result) => {
  if (err) throw err;
  console.log(result);
});

Nowadays the ergonomics around asynchronicity are much better thanks to async/await, which means we can use a more natural approach: the return keyword.

function do_something(foo) {
  if (!is_valid(foo)) {
    throw new Error('Invalid foo!');
  }
  
  return do_something_with_validated_foo(foo);
}

console.log(await do_something({...}));

Conceptually, a response to an HTTP request is basically the same thing as a value returned from a function. So why don't we model it as such?

app.use(req => {
  if (some_condition_is_met(req)) {
    return render_view(req);
  }
  
  // no returned object, implicit next()
});

Here, the returned value could be a new Response(...) where Response is some built-in object, or it could be something more straightforward:

{
  status: 200,
  headers: {
    'Content-Type': 'text/html',
    'Content-Length': 21
  },
  body: '<h1>Hello world!</h1>'
}

(body could also be a Buffer or a Stream or a Promise, perhaps.)
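To make the idea concrete, here is a minimal sketch of what the server side might do with handlers that return plain objects. The names handle and send are made up for illustration, and the 404 fallback is an assumption rather than part of the proposal. Only res.writeHead, res.end and stream piping are real Node APIs.

```javascript
// Hypothetical sketch: `handle` and `send` are illustrative names, not an existing API.

// Try each handler in order; the first one to return a value wins.
async function handle(handlers, req) {
  for (const handler of handlers) {
    const response = await handler(req);
    if (response !== undefined) return response; // otherwise: implicit next()
  }
  // Assumption: nothing responded, so fall back to a 404.
  return { status: 404, headers: {}, body: 'Not found' };
}

// Write a plain { status, headers, body } object to a Node ServerResponse.
async function send(res, response) {
  const body = await response.body; // body may be a promise
  res.writeHead(response.status, response.headers);
  if (body && typeof body.pipe === 'function') {
    body.pipe(res); // readable stream
  } else {
    res.end(body); // string or Buffer
  }
}

// Possible wiring with the built-in http module:
// require('http').createServer(async (req, res) => {
//   await send(res, await handle(handlers, req));
// }).listen(3000);
```

The handlers themselves never see res, so they stay easy to call from tests or from other handlers.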

This pattern lends itself to composition:

const respond = (body, headers, status = 200) => ({
  body,
  status,
  // Content-Length counts bytes, not characters (assumes a string or Buffer body)
  headers: Object.assign({ 'Content-Length': Buffer.byteLength(body) }, headers)
});

const json = obj => respond(JSON.stringify(obj), {
  'Content-Type': 'application/json'
});

app.use(() => json({
  answer: 42
}));

Middlewares can be composed the way Koa does it:

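const { Readable } = require('stream');

// Resolves once a readable stream has been fully consumed.
// Readable streams emit 'end' ('finish' belongs to writable streams).
const wrap = readable => new Promise((fulfil, reject) => {
  readable.on('end', () => fulfil());
  readable.on('error', reject);
});

async function log(req, next) {
  const start = Date.now();
  const res = await next();

  // Parentheses matter: without them, `await` applies to res.body alone
  await (res.body instanceof Readable
    ? wrap(res.body)
    : res.body);

  console.log(`${req.method} ${req.url} (${Date.now() - start}ms): ${res.status} ${JSON.stringify(res.headers)}`);

  return res;
}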

app.use(log).use((req, next) => ({...}));
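For reference, a rough sketch of how chaining (req, next) middleware like this could work under the hood. compose, timed and hello are hypothetical names, and this is a simplified take on the Koa approach rather than Koa's actual code.

```javascript
// Hypothetical `compose`: chains (req, next) middleware into a single handler.
function compose(middlewares) {
  return function handler(req) {
    // dispatch(i) runs middleware i, handing it a `next` that runs i + 1
    const dispatch = async i => {
      const middleware = middlewares[i];
      if (!middleware) return undefined; // end of the chain: nothing responded
      return middleware(req, () => dispatch(i + 1));
    };
    return dispatch(0);
  };
}

// Example: a timing middleware wrapped around a terminal handler.
const timed = async (req, next) => {
  const start = Date.now();
  const res = await next();
  // Return a new object instead of mutating the one we were given
  return {
    ...res,
    headers: { ...res.headers, 'X-Response-Time': `${Date.now() - start}ms` }
  };
};

const hello = () => ({ status: 200, headers: {}, body: 'hello' });

const handler = compose([timed, hello]);
```

Because each middleware receives the downstream response as a return value, it can decorate or replace it without touching anything shared.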

It's likely that I've overlooked some crucial constraints. But if we're looking at evolving the built-in Node HTTP APIs, I hope we can take this rare opportunity to do so without being beholden to the way we do things now.

@yeedle

yeedle commented Sep 10, 2020

How would you run post-response code with this pattern, for example if you want to immediately return 200 and log stuff to the database afterwards?

@mikemaccana

Check out the demo of arc.codes routing.

  • Ordered lists of things (like middleware processed in order) are better handled as arrays
  • Returning a response is done by returning a response
  • Returning a modified request passes it onto the next middleware item

https://arc.codes/reference/functions/http/node/async

@Rich-Harris
Author

Rich-Harris commented Sep 10, 2020

@yeedle I show an example of this in the gist — just substitute 'writing to a database' for console.log

Edit: ah, wait, you mean writing to the db in the same handler. I guess you can't, exactly — you'd have to do something like

Promise.resolve().then(() => {
  write_to_db(stuff);
});

return {...};

@wesleytodd

I think a better way would be:

.use((req) => {
  const res = new Res()
  res.on('finished', writeToDb)
  return res
})

@geelen

geelen commented Sep 10, 2020

I felt exactly the same, and in working on FAB I've ended up having to build a whole middleware API which... well I'm kinda the only person that's really dug into it so this is very much "Glen's first go at this". It's not really documented anywhere so I'll post a summary here first then come back to make some comparisons at the end.


Note: FABs have one added complexity over the actual middleware API, by the way, which is that a FAB is a compilation of a series of plugins, not a straight NodeJS file. So you kinda don't have a first-class programmatic API into the "middleware" outside of writing a plugin file. So keep that in mind...

The "RequestResponder"

This is basically middleware but I didn't want to invoke the (req,res,next) idea so I called it Responder instead. I'll use the FAB typedefs to explain:

Note: Request, Response and URL are all as defined in the Fetch API. An aside: cross-fetch has the best TypeScript types for mocking these objects out in the browser (i.e. best compatibility with the dom lib in tsconfig.json)

export type FabResponderArgs = {
  request: Request
  url: URL
  context: FabResponderMutableContext
  cookies: Cookies
  // specific to FABs, ignore for now
  settings: FabSettings
}

export type FabRequestResponder = (
  context: FabResponderArgs
) => Promise<Response | undefined | Request | Directive>

The most common of the return types is Response | undefined, so you get functions like this:

const hello_world = async ({ url }) => {
  if (url.pathname === '/') {
    return new Response(`Hello, world!\n`, {
      status: 200,
      headers: {
        'content-type': 'text/html',
      }
    })
  }
  // you can leave this line off if you have 'noImplicitReturns: false' in tsconfig.json
  return undefined
}

A couple of things:

  • All middleware is async, even if it does nothing asynchronous at all. I've seen people hyper-optimise the event loop and avoid calling .then() on pre-resolved promise chains but honestly, we're talking about HTTP latencies here; one extra tick for awaiting a promise isn't going to be detectably slow. Might be wrong there...
  • Returning undefined is how you say "I don't care about this request". That's effectively the same as calling next(), but now it can be an early return. The above example works just as well with an early if (!x) return undefined guard instead. I'll do that for the remaining examples...
  • url is a first-class parameter to the responder. Otherwise 90% of responders would have to start with const url = new URL(request.url)
  • The params are an object so you can destructure just the ones you're interested in.

Returning a Request

This might not be that applicable outside of FABs, but something pretty common is proxying a request somewhere. For implementation reasons (i.e. some hosting restrictions) we can't always proxy the full request through the server, so we came up with this API:

const api_proxy_middleware = async ({ url }) => {
  const match = url.pathname.match(/^\/api\/(.*)/)
  if (!match) return undefined

  // match[1] is everything after /api/
  const [_, upstream_route] = match
  return new Request(`https://api.example.com/${upstream_route}`)
}

Turns out, it's super handy! We actually broke compatibility with the spec and allow relative URLs here, so new Request('/over/here.instead') gets understood by whatever's hosting the FAB. It's up to the hosting runtime to either perform a fetch and forward the response or construct whatever the hosting platform needs to do to create a proxy request (looking at you, Lambda@Edge).

If you don't have the ability to return a Request you can still keep the code super clean, it's just a bit less flexible where the code runs:

    const [_, upstream_route] = match
-   return new Request(`https://api.example.com/${upstream_route}`)
+   return fetch(`https://api.example.com/${upstream_route}`)
}

FabResponderArgs

So this is an object of: { request, url, context, cookies } (plus settings which is FAB-specific). A couple of points:

  • request is immutable (in theory). I only pass clones of the original request object in, to stop middlewares using it as a lazy way of passing values around. Instead, there are two explicit APIs for middlewares to talk to each other: the replaceRequest Directive (which I'll come back to) and...
  • context is a {[key: string]: any} object which any middleware can read from and write to freely. It's the way middlewares are supposed to store information for future middlewares to use.
  • url, as mentioned, is a new URL(request.url), provided for convenience
  • cookies was originally going to be a part of context, but I want the FAB responders to feel higher-level and so having them pre-parsed (and immutable, once parsed) seemed to just make sense. I want to make this lazy in future, though, so if you never check the cookies you never even parse the headers
  • settings, as mentioned, is about being able to reuse FABs in different environments without recompiling or depending on process.env. It's described here and is probably not relevant to this discussion
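As an illustration of the points above, here is a sketch of how such an argument object could be assembled, including the lazy cookie parsing mentioned as a future goal. make_responder_args is a hypothetical helper and not FAB's real code; it assumes request behaves like a Fetch API Request (it only needs url, headers.get and clone), and the cookie parsing is deliberately naive.

```javascript
// Hypothetical helper: builds the argument object handed to each responder.
function make_responder_args(request, context) {
  let parsed; // cache, filled the first time `cookies` is read

  return {
    request: request.clone(), // responders only ever see a copy
    url: new URL(request.url),
    context, // shared, mutable scratch space
    get cookies() {
      if (!parsed) {
        const header = request.headers.get('cookie') || '';
        const entries = header
          .split(';')
          .map(pair => pair.trim())
          .filter(Boolean)
          .map(pair => {
            const i = pair.indexOf('=');
            if (i === -1) return [pair, ''];
            return [pair.slice(0, i), decodeURIComponent(pair.slice(i + 1))];
          });
        parsed = Object.freeze(Object.fromEntries(entries)); // immutable once parsed
      }
      return parsed;
    }
  };
}
```

A responder that destructures only { url } never triggers the getter, so the cookie header is never parsed for it.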

Directives

The final piece of the API is the Directive, which at the moment has two options, but is intended to grow as I learn more:

export type Directive = {
  replaceRequest?: Request
  interceptResponse?: ResponseInterceptor
}

replaceRequest

replaceRequest is the counterpart to making the request immutable. Instead of doing req.url = 'https://somewhere.else/', you can do:

const change_hostname = async ({ request, url }) => {
  return {
    // This is how you clone a Request, changing only the URL.
    replaceRequest: new Request(`https://somewhere.else${url.pathname}`, request),
  }
}

This then passes down the chain as normal, so anything that's operating on a purely HTTP basis can do so without any coupling to the later responders.

interceptResponse

export type ResponseInterceptor = (response: Response) => Promise<Response>

One of the things I loved about Rack's stack-based middleware composition was that a middleware could just as easily transform the response on the way out as the request on the way in. So I built it for FABs:

const admin_single_page_app = async ({ url }) => {
  const is_admin_page = url.pathname.startsWith('/admin')
  if (!is_admin_page) return undefined

  // Any 404s on the admin page should be rewritten to the SPA .html file 
  return {
    interceptResponse: async (response) => {
      // We have to return this time, so return the unmodified response if we don't care
      if (response.status !== 404) return response
      // If it _is_ a 404 though, replace the response entirely
      return new Response(ADMIN_SPA_INDEX_HTML, {
        status: 200,
        headers: {
          'content-type': 'text/html',
        }
      })
    }
  }
}

You don't have to completely replace the response, you can modify it (but you still need to return it), for example setting a header.

const add_custom_header = async ({ url }) => {
  return {
    interceptResponse: async (response) => {
      response.headers.set('X-CUSTOM-HEADER', 'my-value')
      return response
    }
  }
}
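To show how the two directives could fit together, here is a rough sketch of a runner. It is not FAB's implementation: run and is_directive are made-up names, plain objects stand in for Fetch Request/Response values, and both the 404 fallback and the Rack-style unwinding order (last registered interceptor runs first) are assumptions.

```javascript
// Hypothetical runner for responders that may return a directive.
const is_directive = value =>
  typeof value === 'object' && value !== null &&
  ('replaceRequest' in value || 'interceptResponse' in value);

async function run(responders, request) {
  const interceptors = [];
  let response;

  for (const responder of responders) {
    // Re-parse the URL each time, since an earlier directive may have replaced the request
    const result = await responder({ request, url: new URL(request.url) });
    if (result === undefined) continue; // "I don't care about this request"

    if (is_directive(result)) {
      if (result.replaceRequest) request = result.replaceRequest;
      if (result.interceptResponse) interceptors.push(result.interceptResponse);
      continue; // directives never end the chain
    }

    response = result;
    break;
  }

  // Assumption: if nobody responded, start from a bare 404
  if (response === undefined) response = { status: 404, headers: {} };

  // Unwind like a stack: the last interceptor registered runs first
  for (const intercept of interceptors.reverse()) {
    response = await intercept(response);
  }
  return response;
}
```

Responders never call each other, so the order in which interceptors unwind is decided in exactly one place.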

Routing

Something that I've removed from the examples here is FAB's routing layer, which is just built on top of path-to-regexp. There are a few examples here but in brief, it looks like this:

import { FABRuntime } from '@fab/core'

// The router gets injected when the plugin root function is first booted up.
export default function({ Router }: FABRuntime) {
  // If you use route params, a 'params' object is injected as one of the ResponderArgs
  Router.on('/api/:route(.*)', async ({ params }) => {
    return fetch(`https://api.example.com/${params.route}`)
  })
}

We also add a Router.interceptResponse alias as a shorthand when all you want to do is piggyback on every outgoing response, a la @fab/plugin-add-fab-id.


So, contrasting how FABs work with your suggestions, there are a couple of points worth making:

For me, I'm trying to provide full functionality with the highest-level API possible, without making things too limited. When combined with Router.on for simple routing, it's extremely concise! But it's really designed with the goals and audience of FABs in mind, not general-purpose server responses (although I think it could be used for that too!).

Also, using the Fetch standard is a blessing and a curse. It makes the server-side code look exactly like browser code or a service worker, which is great for consistency, but it makes it a bit awkward to do things like clone a Request with a changed URL. However, it's a legit standard, so building helper utilities on top might be the best way to get the ergonomics you want, rather than trying to define an arbitrary object.

I explicitly shied away from const res = await next() because I didn't want every plugin to have to worry about calling the next one, and because I wanted to do things like making replaceRequest explicit rather than each plugin having total control on what gets passed down. Very much a personal stylistic choice though.

Very interested to hear what you think! But yeah, when faced with the choice of adopting the (req, res, next) API or inventing my own, I definitely went the other way. I've been super happy with how this is all coming together, hopefully there's something in there of interest!

@OliverJAsh

OliverJAsh commented Sep 10, 2020

+100.

If you want to do something like this today using Express (describing responses as return values), here's something I made: https://github.com/OliverJAsh/express-result-types/blob/6e2b32771cd4aeebf9f3effc51f9e1ef6ead71c2/src/example.ts#L29

I would add that using return values can help to prevent many mistakes which are common today: https://twitter.com/OliverJAsh/status/1304141593096708096.

@yeedle

yeedle commented Sep 10, 2020

@wesleytodd res.on('finished') is definitely simpler, but it then depends on the response completing normally. Many times you want to do something after responding regardless of how the response ends up. Using a promise feels boilerplatey, but I guess that's the idiomatic way of having JavaScript do something eventually without blocking whatever it is doing now.

@wesleytodd

@yeedle This is true with today's implementation, which is why we have the on-finished package. The original conversation is about how we design the next generation of HTTP API, so read my example as if .on('finished', ...) folded all the logic from on-finished into the core API, error handling and all.

@Lucifier129

Lucifier129 commented Sep 11, 2020

@yeedle just using finally?

app.use(() =>{
    try {
      return  json({
        answer: 42
      })
    } finally {
      console.log('do something after returning response')
    }
});
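One thing worth checking with a small experiment is when each approach actually runs. In the sketch below (with_finally and with_deferred are illustrative names), the finally block runs before the caller receives the returned value, while the Promise.resolve().then(...) callback suggested earlier in the thread runs after it. Neither waits for the response to be written to the socket.

```javascript
const order = [];

function with_finally() {
  try {
    return 'response';
  } finally {
    order.push('finally'); // runs before the caller receives 'response'
  }
}

function with_deferred() {
  // Queued as a microtask: runs after the current synchronous code finishes
  Promise.resolve().then(() => order.push('deferred'));
  return 'response';
}

order.push(`caller got ${with_finally()}`);
order.push(`caller got ${with_deferred()}`);

// Log from a later microtask so the deferred callback has had its turn
Promise.resolve().then(() => console.log(order));
// → [ 'finally', 'caller got response', 'caller got response', 'deferred' ]
```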

@antony

antony commented Sep 14, 2020

This is essentially how Hapi works, which is just one of four million reasons you've already heard, that I like it.

@alevosia

Reminds me of the AWS Lambda serverless functions. It does seem more intuitive.