@pjeby
Last active July 24, 2017 10:24
Pre-PEP: The WSGI Response Upgrade Bridging Specification

WSGI Response Upgrade Bridging

Overview

The Problems

Current Python web frameworks and applications are built mostly on WSGI: a request/response API based on HTTP/0.9's simple request-per-connection model. Web libraries and frameworks offer a wide variety of services for request routing, session management, authentication and authorization, etc., based on this model and working with WSGI.

Modern web protocols, however, including Websockets, HTTP/2, SPDY, and so on, are based on a more sophisticated communication model that doesn't fit very well within WSGI. (For that matter, WSGI doesn't play well with Twisted or asyncio-style asynchronous APIs, either.)

Other web API standards have been proposed or are under development, but they are caught between a rock and a hard place: if they aim for compatibility with WSGI, it is harder to provide new features, but if they focus on new features, compatibility with existing frameworks, middleware, etc. is limited.

At the same time, server developers are stuck in something of a holding pattern. Their servers may have (or want to add) new features, but what if they invest in a proposed API that doesn't pan out? Conversely, what if they get stuck needing to support multiple APIs?

Meanwhile, application developers face a dilemma of their own: since existing Websocket and HTTP/2 APIs cannot be easily (and compatibly) accessed from within WSGI, they are unable to use their applications' and frameworks' existing code for managing routing, sessions, authentication, authorization, etc., when making use of either Websockets or HTTP/2. Instead, they must duplicate code, or else use sideband communications (e.g. via redis) to link between a server with the needed API and the main code of their application.

But what if we could cut through all three of these dilemmas, in a way that would let us have our existing framework "cake", and get to "eat" our advanced protocols, too?

The Proposed Solution

Since the majority of existing WSGI framework and middleware tools deal mainly with the WSGI request, what if we could keep using WSGI to handle our requests, but use a different API for the responses?

In fact, what if we could use that different API, only for the responses that actually needed it, on a request-by-request basis? That way, for example, we could still use our existing middleware or framework code to make sure that a session has been established, authentication and authorization have been handled, and so on.

Then, our existing framework and app code could send their existing login redirects and error pages. But, once everything is logged in and ready to go, we could finally switch over to that other API, to send the real response -- and still have access to our user objects, routing parameters, etc., within that other API.

And this "real" response wouldn't have to be a single HTTP response, either. It could be a handler of some kind, sending or receiving packets of information via websockets, HTTP/2 push, the asyncio API, or whatever other specialized response APIs are available in the WSGI environment.

What's more, if we could ask for these "other APIs" by name, then we could begin using these other APIs today, right now... and still define standardized Python APIs for these features later. And, developers of these other APIs wouldn't have to convince people to switch away from WSGI, nor struggle to come up with clever ways to "tunnel" their APIs through WSGI in a compatible way.

Therefore, this PEP proposes a mechanism akin to HTTP's Upgrade: process, to allow an existing web framework and/or middleware to handle the initial incoming HTTP request and select an application/controller/view/etc., invoking it with information obtained from the request.

Then, when it's time to respond to the request, the running application can choose to upgrade or "bridge" to using a more advanced API to handle the response (and possibly continue to manage an ongoing connection, depending on the nature of the protocols involved).

(But, if the request doesn't need any special handling, the application can simply issue a standard WSGI response, however it currently does that. So only the parts of an application that need this special handling ever have to use it.)

Example Usage Scenarios

Below are two code samples, showing different use cases, different frameworks, and different "upgraded" APIs. In each case, there is an outer piece of framework-specific code (the request handler), and an inner piece of non-framework, API-specific code (the response handler).

To link the two handlers, a small bit of bridging code (shown in these examples as request.upgrade_to()) is used to request a desired API by name, register the response handler, and return a bridging response: a special WSGI response that tells the server to invoke the response handler in its place.

Please note, however, that these are use case illustrations only. This proposal does not specify any of the APIs shown in these examples, including the request.upgrade_to() method itself!

Also, depending on the framework and API involved, the request and response handlers could be functions, methods, instances, classes, or something else altogether. A framework might not provide an upgrade_to() API of its own (or spell it differently) and an application developer always has the option of creating their own version of it as a utility function. (An example implementation will also be shown later in this spec.)

Example 1: HTTP/2 Response Pushing from inside Django

def main_view(request):
    def http2_handler(server):
        server.push(path='/css/myApp.css', ...)
        server.push(path='/js/myApp.js', ...)
        server.send_response(status=200,
                       headers = [('content-type', 'text/plain')],
                       body='Hello world!'.encode('ascii'))
    return request.upgrade_to('http2', http2_handler)

This example shows a relatively simple use case: adding pushed files to an HTTP response. The assumption here is that any routing, authentication, etc. have been handled by Django by the time the above code runs, and so it just needs to send a response using some non-WSGI/non-Django API: a hypothetical API named http2.

The hypothetical request.upgrade_to(api_name, *args, **kw) method takes a desired API name, looks it up in the WSGI environment, and invokes it to create a bridging response: a special response that tells the WSGI server to use the registered response handler to perform the response, bypassing any middleware that doesn't alter or replace this response.

(Again, please note that the actual http2 API shown is a purely hypothetical illustration, loosely based on the nghttp2 API; this proposal only covers the behavior of request.upgrade_to(), and not its existence or spelling, let alone the behavior of Django or nghttp2.)

Example 2: Websocket Chat from inside a WebOb-based Framework

This next example is more complex, demonstrating how response upgrade bridging can be used to switch to a "conversational" or packet-oriented protocol such as Websockets:

@someframework.route('/chat/:room_id')    # route to the request handler
def chat(self, request, room_id):          
    # code here looks up room, user, etc.
    # can redirect to login/registration
    # validate room existence, etc.
    # using the web framework's request and other tools
    ...
    # Ready to chat? Define a handler for the websocket API:
    def websocket_handler(sock):
        # code here has access to request/room
        # *plus* whatever it gets passed by the websocket API
 
        sock.send("Welcome to the %s room, %s" % (room.name, user.name))
        room.sockets[user.name] = sock

        def sendall(msg):
            data = msg.encode('utf8')
            for s in room.sockets.values():
                s.send(data)

        sendall("%s has entered the chat room" % user.name)

        @sock.on_receive
        def receive_handler(data):
            sendall("%s: %s" % (user.name, data.decode('utf8')))

        @sock.on_close
        def close_handler():
            if room.sockets.get(user.name) is sock:
                del room.sockets[user.name]
            
        # etc...

    return request.upgrade_to('websockets', websocket_handler)

Again, note that this websockets API is purely hypothetical; the point of this illustration is merely to show that response-upgrade bridging isn't limited to synchronous control flow or a single request-response pair. Upgraded response APIs can be event driven, callback-based, generator-oriented, or almost anything at all.

Both of these examples show:

  1. An outer function, used as a request handler
  2. An inner function, used as a response handler, and
  3. A request.upgrade_to() function, used to register the response handler and generate a bridging response

Please note again that none of these three parts have to be implemented in the ways shown above. The request handler could have been a class, instance, or method, depending on the web framework in use, and the same is true for the response handler, depending on the API being bridged to. (And, as previously mentioned, request.upgrade_to() is a short bit of glue code that can be written by hand.)

Proposal Scope

Goals of this proposal include:

  1. Defining a way for WSGI applications, at runtime (i.e., during the execution of a request), to detect the existence of, and access, upgraded non-WSGI server APIs which can be used in place of WSGI for either effecting a response to the current request, or initiating a more advanced communications protocol (such as websocket connections, associated content pushing, etc.) as an upgrade to the current request.

  2. Defining ways for WSGI middleware to:

     • Continue to be used for request routing and other pre-response activities for all requests, as well as post-response activities for requests that do not require bridged API access

     • Intercept and assume control of any bridged APIs to be used by wrapped applications or subrequests (assuming the middleware knows how to do this for a specific bridged API, and desires to do so)

     • Disable any or even all bridged API access by its wrapped apps -- even without prior knowledge of which APIs might be used -- in the event that the middleware can only perform its intended function by denying such access

  3. Defining a way for WSGI servers to negotiate a smooth transition of response handling between standard WSGI and their native API, while safely detecting whether intervening middleware has taken over or altered the response in a way that conflicts with elevating the current request to native API processing

Non-goals include:

  • Actually defining any specification for the bridged APIs themselves ;-)

Specification

The basic idea of this specification is to add a dictionary to the WSGI environment, under the key wsgi.upgrades. Within this dictionary, a single ASCII string key is allocated for each non-WSGI API offered by the server (or implemented via middleware).

So, for example, if Twisted were to offer an upgrade bridge, it might register a twisted key within the wsgi.upgrades dictionary. And if uWSGI were to offer a websocket API bridge, it might register a uwsgi.websocket key (perhaps conditionally on whether the current request included a websocket upgrade header), and so on.

The registered key in the wsgi.upgrades dictionary MUST be an ASCII string containing a dot-separated sequence of one or more valid Python identifiers. (So, http2 and http.v2 are valid API keys, but http.2 and http/2 are NOT.)
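The key rule above can be checked mechanically. The following is a minimal sketch, not part of the proposal: the function name is_valid_upgrade_key is made up for illustration, and it leans on Python's own str.isidentifier() to decide what counts as an identifier.

```python
def is_valid_upgrade_key(key):
    """Return True if `key` is a dot-separated run of ASCII Python identifiers.

    Hypothetical helper for illustration; the proposal does not define it.
    """
    if not isinstance(key, str) or not key.isascii():
        return False
    # str.split() never returns an empty list, so "" yields [""],
    # which fails the identifier test below
    return all(part.isidentifier() for part in key.split("."))
```

Note that this sketch does not reject Python keywords (e.g. class), since the text above does not say whether they should be allowed.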

The registered value, on the other hand, is a callable used to create a bridge between a web application's request handler, and a handler for the upgraded (non-WSGI, non-web framework) API.

Providing an API

The implementation of an upgrade bridge consists of a callable object, looking something like this pseudocode:

def some_api_bridge(environ, start_response, XXX...):
    response_key = new_unique_header_compatible_string()
    current_request.response_registry[response_key] = XXX...
    start_response('399 WSGI-Bridge: '+response_key, [
        ('Content-Type', 'application/x-wsgi-bridge; id='+response_key),
        ('Content-Length', str(len(response_key)))
    ])
    return [response_key]

environ.setdefault('wsgi.upgrades',{})['some_api'] = some_api_bridge

As you can see, this is a little bit like a WSGI application -- and in fact it is a valid WSGI application, except that one or more positional or keyword arguments (shown here as XXX...) are included after the standard WSGI ones, to specify details of the desired response handler. Depending on the needs of the API, these arguments could be a single "handler" callback, or they could be multiple objects, callbacks, or configuration values.

The upgrade bridge's job is simply to generate a unique ASCII "native string" key to be used in the bridging response as a substitute for these additional arguments, and to register these arguments under that key for future use by the server. Finally, the bridge sends a WSGI response as shown above, with the status, headers, and body all containing the generated response key.

(Note: the bridge MUST return a single-item sequence as its response and MUST NOT use the WSGI write() facility, so that it's easier to write glue code for frameworks that don't support directly returning a WSGI response.)
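For readers who want something they can run, here is one way the pseudocode bridge might be fleshed out. It is a sketch under stated assumptions: make_bridge and the registry dictionary are hypothetical names, the key format is only one possibility, and the body is encoded to bytes because PEP 3333 requires response bodies to be bytestrings.

```python
import itertools

_bridge_counter = itertools.count(1)   # process-wide counter (an assumption)


def make_bridge(api_name, registry):
    """Return an upgrade bridge that stores handler arguments in `registry`.

    `registry` stands in for whatever per-request storage the server uses;
    both names are hypothetical and not defined by the proposal.
    """
    def bridge(environ, start_response, *handler_args, **handler_kw):
        response_key = "%s-%d" % (api_name, next(_bridge_counter))
        # Remember the handler details under the key for later use by the server
        registry[response_key] = (handler_args, handler_kw)
        start_response("399 WSGI-Bridge: " + response_key, [
            ("Content-Type", "application/x-wsgi-bridge; id=" + response_key),
            ("Content-Length", str(len(response_key))),
        ])
        # Single-item sequence, no write() call
        return [response_key.encode("ascii")]
    return bridge
```

A server would then publish it with something like environ.setdefault('wsgi.upgrades', {})['some_api'] = make_bridge('some_api', registry), using a fresh registry for each request.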

The server MUST NOT actually invoke or begin using the provided handler until after the standard WSGI response process has been completed, and it has verified that the response key is still present in all three parts of the WSGI response: the status, headers, and body.

The continued presence of the response key is used to verify three things:

  1. That the registered response handler is indeed a response to the original incoming request, and not merely a response to a subrequest created by middleware

  2. That intervening middleware hasn't replaced the bridging response with a response of its own (for example, an error response created because of an error occurring after the bridged handler was registered, but before it was used)

  3. Which response handler should be invoked, if more than one was registered

So, a server providing an upgrade bridge MUST wait until it receives a WSGI response whose status, content-type, content-length, and body all unequivocally identify which of the response handlers registered for the current request should actually be used.

In the event that the status, type, and body all match each other, the server MUST then activate the registered response handler for that key, allowing the current request (and possibly subsequent requests, depending on the API involved) to be handled via the associated API. (It also MUST discard any other registered response handlers for the current request.)

In the event that neither the status nor headers designate a registered response handler, the server MUST treat the response as a standard WSGI response, and discard all registered response handlers for the current request.

In the event that the status and headers disagree on which handler is to be used (or whether one is to be used at all), or in the event that they do agree, but the body disagrees with them, or if all three agree but the supplied ID was not registered for this request or API, then the server MUST generate an error response, and discard both the WSGI response and any registered handlers. (In the face of ambiguity, refuse the temptation to guess; errors should not pass silently.)
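The three outcomes described above (activate a handler, fall back to plain WSGI, or report an error) might be decided by a function along these lines. This is only a sketch: select_response_key, BridgeConflict, and the registry argument are hypothetical names, and the status and Content-Type formats are taken from the pseudocode bridge shown earlier in this section.

```python
STATUS_PREFIX = "399 WSGI-Bridge: "
TYPE_PREFIX = "application/x-wsgi-bridge; id="


class BridgeConflict(Exception):
    """The status, headers, and body do not agree on a registered handler."""


def _strip_prefix(value, prefix):
    # Return the text after `prefix`, or None if `value` lacks the prefix
    return value[len(prefix):] if value.startswith(prefix) else None


def select_response_key(status, headers, body_chunks, registry):
    """Return the agreed response key, or None for a standard WSGI response."""
    header_map = {name.lower(): value for name, value in headers}
    status_key = _strip_prefix(status, STATUS_PREFIX)
    header_key = _strip_prefix(header_map.get("content-type", ""), TYPE_PREFIX)

    if status_key is None and header_key is None:
        return None  # neither part names a handler: ordinary WSGI response
    if status_key != header_key:
        raise BridgeConflict("status and headers disagree")

    body = b"".join(body_chunks)
    if body.decode("latin-1") != status_key:
        raise BridgeConflict("body disagrees with status and headers")
    if header_map.get("content-length") != str(len(body)):
        raise BridgeConflict("content-length does not match the body")
    if status_key not in registry:
        raise BridgeConflict("response key was not registered for this request")
    return status_key
```

In this sketch the caller remains responsible for discarding the other registered handlers, and for turning a BridgeConflict into an error response.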

Response Key Details

The key used to distinguish responses MUST be an ASCII "native string" (as defined by PEP 3333). It SHOULD also be relatively short, and MUST contain only those characters that are valid in a MIME "token". (That is, it may contain any non-space, non-control ASCII character, except the special characters (, ), <, >, @, ,, ;, :, \, ", /, [, ], ?, and =.)

Response keys generated for a given API MUST be unique for the duration of a given request, and MUST be generated in such a way so as not to collide with keys issued for any other API during the same request. (e.g., by including the API's name in them.)

Response keys SHOULD also be unique within the lifetime of the process that generates them, e.g. by including a global counter value.

(So, the simplest way of generating a response key that conforms to this spec is to just append a global counter to a string uniquely identifying the chosen API. However, there is nothing stopping a server from including other information, such as a request ID or channel designator, as an aid to debugging. Just make sure there's no whitespace or special characters involved, as mentioned above.)
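As an illustration of the "API name plus global counter" approach, the sketch below generates keys and checks them against the MIME token rule. The names new_response_key and is_mime_token are hypothetical, and the hyphen separator is just one possible choice.

```python
import itertools

# The MIME special characters that may not appear in a token
TSPECIALS = frozenset('()<>@,;:\\"/[]?=')

_key_counter = itertools.count(1)   # shared by every API in this process


def is_mime_token(text):
    """True if `text` is non-empty printable ASCII with no space or specials."""
    return bool(text) and all(
        33 <= ord(ch) <= 126 and ch not in TSPECIALS for ch in text
    )


def new_response_key(api_name):
    """Build a key from the API name plus a process-wide counter."""
    key = "%s-%d" % (api_name, next(_key_counter))
    if not is_mime_token(key):
        raise ValueError("response key is not a valid MIME token: %r" % key)
    return key
```

Because the counter is shared and the API name is part of the key, keys from different APIs cannot collide within a process in this sketch.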

Closing and Resource Management

Because the bridging response may have been wrapped by middleware -- e.g. session middleware that saves updated session data on .close(), database connection-pooling middleware that releases connections on .close(), etc. -- the server MUST NOT invoke the WSGI response's .close() method (if any) before the new response handler is finished, in order to prevent premature resource release.

If the response protocol implements something like websockets, or an extended HTTP/2 conversation, then the provided API SHOULD provide some way for the response handler to explicitly ensure that the response .close() method is called, at some point before the conversation is completed and the connection is closed.

These two requirements exist because even if the response content is not altered by middleware, it is still possible for middleware to attach resource-release handlers to the WSGI response object. If these are not closed at all, or closed prematurely, it may cause problems with the underlying web framework.

For example, some web frameworks offer a facility to tie database transaction scope to request scope, so that when a request is completely finished, the current transaction is automatically committed, and a database connection may be returned to a pool. A response handler might then be in the position of trying to use a connection that no longer "belonged" to it.

In the simpler, more common case of a single response to a single request, deferring the .close() operation until the entire response is completed will help to preserve existing framework behavior and user expectations, so long as the framework is using a .close()-based mechanism to control these other features.

Conversely, in the case where an extended conversation takes place, the user may wish to signal completion earlier, in order to avoid hanging on to unnecessary resources.
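One way a server might satisfy both cases is to hand the response handler a "release" callable that closes the WSGI response at most once, and to call it itself when the handler finishes. The sketch below assumes a synchronous handler that takes that callable as its only argument; ResponseCloser and run_bridged_handler are hypothetical names.

```python
class ResponseCloser:
    """Calls the WSGI response's close() method at most once."""

    def __init__(self, wsgi_response):
        # close() is optional on WSGI response iterables
        self._close = getattr(wsgi_response, "close", None)
        self._done = False

    def __call__(self):
        if not self._done:
            self._done = True
            if self._close is not None:
                self._close()


def run_bridged_handler(wsgi_response, handler):
    """Run `handler`, deferring close() until it finishes or releases early."""
    release = ResponseCloser(wsgi_response)
    try:
        handler(release)   # the handler may call release() itself
    finally:
        release()          # otherwise close once the handler is done
```

An event-driven API would instead call release() from its connection-closed callback rather than from a finally block.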

Of course, if a framework uses some other mechanism to allocate its connections, scope its transactions, or do other resource management, that may impose certain limitations on the user with respect to what framework features are still usable within a given response handler.

Web frameworks supporting this spec MUST document what framework features will be unavailable from within a bridged API response handler (i.e. after the framework request handler returns a response), and SHOULD provide alternate ways to access those features from a response handler.

Further, a framework MAY intercept and wrap registered response handlers (for APIs whose control flow they understand) in order to transparently provide these features. (However, since this has to be done on an API-by-API basis, it's likely that most framework providers will only offer this interception feature for a few, community-standardized APIs. But they may -- and perhaps already do -- expose APIs that would let others do the necessary wrapping or interception themselves.)

Accessing an API

Now that we have seen both the application and server sides of the bridging process, we can look at the bridge itself. Essentially, the bridging is done by:

  1. Retrieving the appropriate upgrade bridge from the environ

  2. Invoking that bridge as if it were a WSGI application, passing any extra arguments required by the specific bridged API (such as a handler)

  3. Returning the bridge's WSGI response, as the WSGI response of the current app or framework.

Here's an example, using a pure WSGI app and no web framework:

def my_wsgi_app(environ, start_response):

    foobar_api = environ.get('wsgi.upgrades', {}).get('foobar')

    if foobar_api is None:
        # appropriate error action here; this example returns an error
        # response, but raising an exception would also be reasonable
        start_response('501 Not Implemented', [('Content-Type', 'text/plain')])
        return [b'The foobar API is not available on this server']

    def my_foobar_handler(foobar_specific_arg, another_foobar_arg, *etc):
        # code here that uses the foobar API to do something cool
        pass

    # Delegate the WSGI response to the foobar API
    return foobar_api(environ, start_response, my_foobar_handler)

However, since most application code isn't pure WSGI and does use a framework, here's an example of how Django's WSGIRequest class might implement our previously-illustrated request.upgrade_to() method:

from django.http import StreamingHttpResponse

def upgrade_to(self, api_name, *args, **kw):

    api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
    if api_bridge is None:
        raise RuntimeError("API unavailable")

    # Capture the bridging response as a Django response:
    response = StreamingHttpResponse()

    def start_response(status, headers, exc_info=None):
        code, reason = status.split(' ', 1)
        response.status_code = int(code)
        response.reason_phrase = reason
        for h, v in headers:
            response[h] = v

    # Pass the handler arguments through to the bridge
    response.streaming_content = api_bridge(
        self.environ.copy(), start_response, *args, **kw)
    return response

And here's the webob.Request version of the same functionality (which is a lot simpler, since WebOb already provides a way to capture a WSGI app as a response):

def upgrade_to(self, api_name, *args, **kw):
    api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
    if api_bridge is None:
        raise RuntimeError("API unavailable")
    return self.send(lambda env, s_r: api_bridge(env.copy(), s_r, *args, **kw))

Individual web frameworks can of course decide how best to expose this functionality to their users, whether via a request or response method, controller method, special object to return, exception to raise, or whatever other approach best suits their framework's API paradigm.

(And of course, as long as the framework provides access to the WSGI environ, and allows setting every aspect of the WSGI response, an application developer can implement their own variation of the above, without any extra assistance from the framework itself.)

Intercepting, Disabling, or Upgrading API Bridges

Because all API upgrade bridges are contained in a single WSGI environment key, it is easy for WSGI middleware to disable access to them when creating subrequests, by simply deleting the entire wsgi.upgrades key before invoking an application.

Likewise, in the event that WSGI middleware wishes to disable one specific API, or intercept it, it can do so by removing or replacing the appropriate bridge in the upgrades dictionary.
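Both kinds of disabling can be sketched as small middleware wrappers. The names without_upgrades and without_api are hypothetical, and copying the environ (rather than mutating the caller's) is a design choice of this sketch, not a requirement stated above.

```python
def without_upgrades(app):
    """Middleware: hide every upgrade bridge from the wrapped app."""
    def wrapped(environ, start_response):
        environ = dict(environ)              # leave the caller's environ alone
        environ.pop("wsgi.upgrades", None)   # delete the whole key if present
        return app(environ, start_response)
    return wrapped


def without_api(app, api_name):
    """Middleware: hide one named bridge, leaving the others available."""
    def wrapped(environ, start_response):
        environ = dict(environ)
        # Copy the upgrades dict too, so the original is not modified
        upgrades = dict(environ.get("wsgi.upgrades", {}))
        upgrades.pop(api_name, None)
        environ["wsgi.upgrades"] = upgrades
        return app(environ, start_response)
    return wrapped
```

Intercepting an API works the same way as without_api, except that the entry is replaced with a wrapper bridge instead of being removed.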

Last, but far from least, WSGI middleware can add new bridges to the environment, though it should usually only do so if it implements the new bridge in terms of a bridge that already exists. (For example, to provide a standardized wrapper over a server's native API, or to emulate one server's API in terms of another server's API.)

These "middleware bridges" should work by delegating the actual bridging process to the base API, e.g.:

def api_standardizing_middleware(app):
    def standard_api_bridge(environ, start_response, std_handler):
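        def native_handler(*native_args, **native_kw):
            # translate/wrap native args to std args, then pass them on
            # (passed through unchanged here; a real bridge would convert them)
            std_handler(*native_args, **native_kw)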
        native_api = environ['wsgi.upgrades']['native_api']
        return native_api(environ, start_response, native_handler)

    def wrapped_app(environ, start_response):
        upgrades = environ.setdefault('wsgi.upgrades', {}) 
        if 'native_api' in upgrades:
            upgrades['standard_api'] = standard_api_bridge
        return app(environ, start_response)

    return wrapped_app

In this example, we show a piece of middleware that converts some server's native API (native_api) to some Python standard API (standard_api), if the required native API is available at request time. It doesn't have to implement any other part of the bridging specification, since the server's native API bridge will register and invoke the native response handler (native_handler), which in turn will invoke the "standardized" handler (std_handler).

So, all the middleware needs to do is accept handler arguments for the API it wants to provide, and then register a linked handler with the native API. (Apart from the code shown above, everything else is just whatever is needed to implement the actual API translation.)

This means that if a server exposes whatever its native API is, then any number of translated, standardized, or simplified versions of that API can be offered via middleware, without needing to alter the server itself, or the server's core WSGI implementation. Instead, those other APIs can just be implemented via the existing native API bridge.

(Note: The wsgi.upgrades dictionary is to be considered volatile in the same way as the WSGI environment is. That is, apps or middleware are allowed to modify or delete its contents freely, so a copy MUST be saved by middleware if it wishes to access the original values after it has been passed to another application or middleware.)
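The copying requirement in the note above might look like this in practice. The sketch is illustrative only: snapshot_upgrades and its on_finish callback are hypothetical, and a shallow copy is assumed to be enough because only the mapping itself is being preserved.

```python
def snapshot_upgrades(app, on_finish):
    """Middleware that reports which bridges were offered to `app`."""
    def wrapped(environ, start_response):
        # Shallow copy taken *before* the app can modify or delete anything
        saved = dict(environ.get("wsgi.upgrades", {}))
        result = app(environ, start_response)
        # environ['wsgi.upgrades'] may have changed by now; `saved` has not
        on_finish(saved)
        return result
    return wrapped
```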

Next Steps

Once this specification is stable, the next step is to implement native server API bridges for existing web servers. These do not necessarily need to be provided by the server implementers themselves, but they do need to be implemented in the server's native API, and extend its WSGI implementation.

Because it is possible for API bridges to be layered or upgraded by standard WSGI middleware, it is not necessary for servers to directly support multiple APIs. Servers can simply expose their existing API as an API bridge, and let third parties implement middleware to translate that API to any future standardized APIs.

As soon as even one such native API exists, it is immediately beneficial for web frameworks to provide support for the bridging API, and possible for framework users to supply their own. (WebOb support would be especially useful, since a significant number of web frameworks base their request and response objects on WebOb.)

It may also be helpful to publish a reference library for response key generation and response verification, along with perhaps a wsgiref update or at least some sample code showing how to modify the wsgiref request handler flow to initiate a bridge operation.

Open Questions and Issues

  • Transaction and object lifetimes -- is the current spec correct/sufficient?
  • What if middleware adds headers but leaves the status and content-type unchanged? Should that be an error? What happens if middleware requests setting cookies?
  • Do the chosen status/headers/body signatures actually make sense? Do they need to be more specified, or less specified?
  • Are there any major obstacles to sending a special status from major web frameworks?
  • Should a different status be used?
  • Are there any other ways to corrupt, confuse, or break this?
  • What else am I missing, overlooking, or getting wrong?

Notes on the Current Design Rationale

  • A dictionary is used for all bridged APIs, so they can be easily disabled for subrequests

  • Multiple registrations are allowed, so that middleware invoking multiple subrequests is unaffected, so long as exactly one subrequest's response is returned to the top-level WSGI server

  • A Content-Type header is part of the spec, because most response-altering middleware should avoid altering content types it does not understand, thereby increasing the likelihood that the response will be passed through unchanged

Acknowledgements

(TBD, but should definitely include Robert Collins for research, inspiration, and use cases)

References

TBD

Copyright

This document has been placed in the public domain.
