@mitsuhiko
Last active November 26, 2017 11:02
wsgi.input_terminated Proposal

A two-step proposal to fix the differing behaviors of input streams in WSGI.

WSGI servers have two options for providing the input stream:

  1. Provide wsgi.input as the unchanged socket file. In that case wsgi.input_terminated is set to False or not added to the WSGI environ at all, and the WSGI application is required to look at CONTENT_LENGTH and read only up to that point.
  2. Provide wsgi.input as an end-of-file terminated stream. In that case wsgi.input_terminated is set to True, and the application is required to read to the end of the file and disregard CONTENT_LENGTH for reading.

Pseudocode for a WSGI implementation:

def get_input_stream(environ):
    stream = environ['wsgi.input']

    # This part is new
    if environ.get('wsgi.input_terminated'):
        return stream

    # This part was necessary before anyway, so that we do not
    # accidentally read past the length of the stream.
    return wrap_stream(stream, environ['CONTENT_LENGTH'])
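
The pseudocode above leaves wrap_stream undefined. A minimal runnable sketch of one possible implementation follows. The LimitedStream class and the handling of a missing or malformed CONTENT_LENGTH (treated as zero) are assumptions made for illustration and are not part of the proposal.

```python
import io


class LimitedStream(io.RawIOBase):
    """Hypothetical wrapper that reports EOF after `limit` bytes."""

    def __init__(self, raw, limit):
        self._raw = raw
        self._remaining = limit

    def readable(self):
        return True

    def readinto(self, buf):
        # Never ask the underlying stream for more than what is left.
        if self._remaining <= 0:
            return 0
        data = self._raw.read(min(len(buf), self._remaining))
        n = len(data)
        buf[:n] = data
        self._remaining -= n
        return n


def wrap_stream(stream, content_length):
    # Assumption: a missing or malformed CONTENT_LENGTH means "no body".
    try:
        limit = max(int(content_length or 0), 0)
    except ValueError:
        limit = 0
    return LimitedStream(stream, limit)


def get_input_stream(environ):
    stream = environ['wsgi.input']
    if environ.get('wsgi.input_terminated'):
        # The server promises EOF at the end of the body.
        return stream
    return wrap_stream(stream, environ.get('CONTENT_LENGTH'))
```

Subclassing io.RawIOBase means read(), readline() and iteration are all derived from readinto(), so the wrapper only has to enforce the byte limit in one place.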

The change required in the WSGI server is either nothing (wsgiref, or any other simple WSGI server that just passes the socket through, does nothing), or, for a server like mod_wsgi or gunicorn that terminates the input stream, setting the flag wsgi.input_terminated to True when building the WSGI environ.
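
On the server side, the change could look roughly like the sketch below. It assumes a hypothetical server that has already read the complete request body into memory; make_environ and its parameters are invented for illustration and do not correspond to any real server's API.

```python
import io


def make_environ(base_environ, body):
    """Hypothetical helper: build a WSGI environ for a fully read body."""
    environ = dict(base_environ)
    # A BytesIO returns b"" at the end of the body, so the stream is
    # end-of-file terminated and the flag can be set truthfully.
    environ['wsgi.input'] = io.BytesIO(body)
    environ['wsgi.input_terminated'] = True
    return environ
```

A server that hands the raw socket file to the application would simply not set the key.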

@mcdonc
mcdonc commented Jun 6, 2013

Alternate:

def get_input_stream(environ):
    stream = environ.get('wsgi.input_terminated')
    if stream is None:
        stream = wrap_stream(environ['wsgi.input'],
                             environ['CONTENT_LENGTH'])
    return stream

@mcdonc
mcdonc commented Jun 6, 2013

All the servers that actually supply wsgi.input_terminated would alias wsgi.input and wsgi.input_terminated (they would be the same object).

@mcdonc
mcdonc commented Jun 6, 2013

Graham pointed out that existing mod_wsgi deployers could do:

   SetEnv wsgi.input_terminated true

Which makes the boolean more attractive.

@mitsuhiko
Author

-1 on your proposal, mcdonc, because it means that you now need to provide two streams and keep them synchronized. Imagine what happens if someone wants to write a middleware that wraps the input.

@mitsuhiko
Author

Just to be clear: even if they are the same object, a WSGI middleware would still need to patch both keys.

@mcdonc
mcdonc commented Jun 6, 2013

WSGI middleware may still need to patch both keys when it wraps wsgi.input, if its wrapper doesn't supply the input termination semantics:

def middleware(environ, start_response):
    content_length = environ['CONTENT_LENGTH']
    wsgi_input = environ['wsgi.input']
    environ['wsgi.input'] = cl_wrapper(wsgi_input, content_length)
    # environ still has 'wsgi.input_terminated' key even though
    # wsgi.input is no longer necessarily terminated
    return otherapp(environ, start_response)
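
With the boolean flag, one way for a middleware to keep the environ consistent is to set wsgi.input_terminated to match whatever stream it installs. The sketch below assumes a wrapper that does report EOF after CONTENT_LENGTH bytes. The names ContentLengthWrapper and terminate_input are hypothetical, and a missing CONTENT_LENGTH is treated as zero.

```python
import io


class ContentLengthWrapper(io.RawIOBase):
    """Hypothetical wrapper that reports EOF after `limit` bytes."""

    def __init__(self, raw, limit):
        self._raw = raw
        self._remaining = limit

    def readable(self):
        return True

    def readinto(self, buf):
        if self._remaining <= 0:
            return 0
        data = self._raw.read(min(len(buf), self._remaining))
        n = len(data)
        buf[:n] = data
        self._remaining -= n
        return n


def terminate_input(app):
    """Middleware: guarantee the wrapped app an EOF-terminated input."""
    def middleware(environ, start_response):
        if not environ.get('wsgi.input_terminated'):
            limit = int(environ.get('CONTENT_LENGTH') or 0)
            environ['wsgi.input'] = ContentLengthWrapper(
                environ['wsgi.input'], limit)
            # The replacement stream terminates, so the flag is now true.
            environ['wsgi.input_terminated'] = True
        return app(environ, start_response)
    return middleware
```

A middleware whose replacement stream does not terminate would instead have to clear the flag, for example with environ.pop('wsgi.input_terminated', None).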

@amol-
amol- commented Nov 13, 2017

It looks to me like mcdonc's proposal, while pretty smart, would make the contract more complex.

What's wsgi.input_terminated? Who guarantees it's a wrapper around wsgi.input and not a totally different thing? What happens if I read a bit from one and a bit from the other? This is surely an extreme example, but it means there is a chance for inconsistencies.

Having input_terminated be just a bool, and stating that the input is and will continue to be only wsgi.input, requires specifying fewer constraints in the documentation of the feature, imho.
