@jjn1056
Last active September 14, 2015 15:10
The Input Stream

In the PSGI specification we define a basic and encapsulated API for reading the 'body' part of an incoming HTTP message. In practice several use cases are not met:

An application cannot control if a PSGI handler implements buffering or not.

The PSGI specification allows a PSGI server to buffer the input stream. When it does, the server must set 'psgix.input.buffered' to true. Buffering has some value, since it lets a server use highly optimized code to read the message body quickly, but the choice of whether a request is buffered belongs to the server, not the application. There are cases where an application wants more control over reading and buffering, for example when the expected upload is very large and only a small part of the data matters. In that case you may wish to read the input in chunks, examine each chunk for the information you seek, and discard the rest.

Additionally, the application cannot control what type of buffering storage is used. For example, if an upload is very large, an application may wish to aim the storage at a dedicated storage system such as Amazon S3, rather than being forced to first store it locally (in memory or in the temporary directory), then re-read it and send it to the alternative storage. It would be ideal if one could read the input in lines and send those lines to the storage of choice.

Lastly, this can cause trouble for developers when the target production system uses a server that does not buffer (such as FastCGI) while the development server does (such as Starman). This difference can lead to code that must differ between environments.
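As a rough illustration of the chunked-read use case, the sketch below scans 'psgi.input' for a marker using only the `read` method that the PSGI specification requires of the input stream, and discards data it no longer needs. The function name `find_in_input`, the default chunk size, and the idea of returning a byte offset are assumptions made for this example, not part of any specification.

```perl
use strict;
use warnings;
use IO::File;   # gives plain filehandles a ->read method on older perls

# Scan psgi.input in fixed-size chunks and stop at the first match.
# Only a short tail of already-seen data is kept, so a marker that is
# split across two chunks is still found without buffering the body.
sub find_in_input {
    my ($env, $needle, $chunk_size) = @_;
    $chunk_size ||= 8192;                  # assumed default
    my $input    = $env->{'psgi.input'};
    my $keep     = length($needle) - 1;    # bytes to retain between reads
    my $window   = '';
    my $consumed = 0;                      # bytes dropped from the front of $window

    while (1) {
        my $got = $input->read(my $chunk, $chunk_size);
        die "read error on psgi.input: $!" unless defined $got;
        last if $got == 0;                 # end of body

        $window .= $chunk;
        my $pos = index($window, $needle);
        return $consumed + $pos if $pos >= 0;

        # Discard everything except the tail that could start a match.
        my $drop = length($window) - $keep;
        if ($drop > 0) {
            substr($window, 0, $drop, '');
            $consumed += $drop;
        }
    }
    return -1;                             # marker not present
}
```

If the server has already buffered the whole body, this code still works, but the memory and disk savings it aims for are lost, which is the problem described above.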

We propose that an environment flag such as "PSGI_DISALLOW_INPUT_BUFFERING" be added to the PSGI specification. When this flag is true, a server should not buffer, even if it has buffering logic built into it.
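A server honoring such a flag might make the decision along these lines. This is only a sketch: the helper name `should_buffer_input` is hypothetical, and it assumes the flag is read from the process environment and that any true Perl value means "do not buffer". Neither detail is settled by the proposal.

```perl
use strict;
use warnings;

# Hypothetical server-side check: buffer only if the server is able to
# and the proposed flag does not forbid it.
sub should_buffer_input {
    my ($server_can_buffer, $os_env) = @_;
    $os_env ||= \%ENV;                      # assumption: flag lives in %ENV
    return 0 unless $server_can_buffer;
    return $os_env->{PSGI_DISALLOW_INPUT_BUFFERING} ? 0 : 1;
}
```

The result would then determine what the server reports in 'psgix.input.buffered'.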

Open Questions

HTTP::Body overlaps here in ways that are not useful and doesn't really do streaming, etc. We need to figure out what those overlap points are and what changes need to happen to HTTP::Body, if any. Or at least determine if we can ignore it for now!

What would the error conditions and responses be (for example, if the connection is lost halfway through the read)?

When a server is nonblocking, reading the input stream in a non blocking way is event loop dependent.

Currently, if a server is running under an event loop, the ways of reading input in a non blocking manner tend to be loop specific and force programmers to target one event loop management system. Since there is no clear winner among the various event loops one can use with Perl, it would be better if you could write basic code without tying your application to one of them.

To improve interoperability between servers that run under various event loops (such as AnyEvent, EV, etc.), we propose that when a server runs under such a loop (and sets the "psgi.nonblocking" key to true), it should expose a to-be-determined interface for reading the message body in a non blocking manner (for example, we could return a Promise object, which is a reasonable, non opinionated and broadly accepted approach).
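To make the idea concrete, here is one possible shape such an interface could take from the application's point of view. Everything in it is hypothetical: `Sketch::Promise` is a tiny stand-in written for this example (a real design would more likely use an existing promise or future module), and the environment key 'psgix.input.promise' is invented here. The point is only that the application touches the promise and never the event loop.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real promise/future library; not part of PSGI.
{
    package Sketch::Promise;

    sub new { return bless { state => 'pending', args => [], cbs => [] }, shift }

    # Register success and failure callbacks; fire at once if already settled.
    # Simplification: returns the same object rather than a chained promise.
    sub then {
        my ($self, $on_ok, $on_err) = @_;
        push @{ $self->{cbs} }, [ $on_ok, $on_err ];
        $self->_flush if $self->{state} ne 'pending';
        return $self;
    }

    sub resolve { my $self = shift; return $self->_settle('resolved', @_) }
    sub reject  { my $self = shift; return $self->_settle('rejected', @_) }

    sub _settle {
        my ($self, $state, @args) = @_;
        return $self if $self->{state} ne 'pending';   # settle only once
        $self->{state} = $state;
        $self->{args}  = \@args;
        $self->_flush;
        return $self;
    }

    sub _flush {
        my $self = shift;
        my $slot = $self->{state} eq 'resolved' ? 0 : 1;
        while (my $pair = shift @{ $self->{cbs} }) {
            my $cb = $pair->[$slot] or next;
            $cb->(@{ $self->{args} });
        }
        return;
    }
}

# Application code: loop-agnostic, it only talks to the promise.
# 'psgix.input.promise' is a made-up key for this sketch.
sub handle_body {
    my ($env, $report) = @_;
    $env->{'psgix.input.promise'}->then(
        sub { my ($body) = @_; $report->("read " . length($body) . " bytes") },
        sub { my ($err)  = @_; $report->("read failed: $err") },
    );
    return;
}
```

In this shape the server would resolve the promise from whatever loop it runs under once the body has arrived, and reject it if the connection is lost mid-read, which also gives the error question a natural place to be answered.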

Open Questions

What would the error conditions and responses be (for example, if the connection is lost halfway through the read)? This is the same question as in the previous proposal, so perhaps the two can be answered together.

PSGIX.IO

'psgix.io' is intended to be a low level escape hatch for the server to expose a raw, bidirectional socket to the application. The specification says, "the raw IO socket to access the client connection to do low-level socket operations. This is only available in PSGI servers that run as an HTTP server, and should be used when (and only when) you want to jailbreak out of PSGI abstraction, to implement protocols over HTTP such as BOSH or WebSocket."

Using PSGIX.IO creates an unacceptable binding between the application and the server.

Now that it's becoming more common to write applications using protocols that run over HTTP (such as WebSocket), it would be great if you could have an interface for 'psgix.io' that was the same across servers. That way your code would not be server dependent.

A survey of common existing PSGI servers on CPAN seems to indicate that "IO::Socket::INET" is a reasonably common API choice.

HTTP::Server::PSGI            bless( \*Symbol::GEN2, 'IO::Socket::INET' );
Twiggy                        \*{'AnyEvent::Socket::'};
Starman                       bless( \*Symbol::GEN1, 'Net::Server::Proto::TCP' );
Feersum                       ????
Net::Async::HTTP::Server      bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Corona                        bless( \*Symbol::GEN2, 'Net::Server::Proto::TCP' );
Starlet                       bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Thrall                        bless( \*Symbol::GEN4, 'IO::Socket::INET' );
Starlight                     bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Monoceros                     \*{'Monoceros::Server::$fh'};

We propose to define a minimum interface for 'psgix.io' that would be agnostic across servers. This interface would include a common way to handle and expose error conditions. Use cases would include long polling and WebSocket interfaces that work on more than one server. Ideally, as in the proposal for psgi.input, we'd have a minimal, agnostic way to expose an evented interface when an event loop is present, such that one can change loops and still expect code to work.
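Until such an interface exists, an application can at least check what a given server's 'psgix.io' object offers before relying on it. The sketch below does that with a duck-type test. The method list in `@MINIMUM_IO_METHODS` is a guess at what a minimum interface could contain, chosen for illustration only, and `missing_io_methods` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical minimum method set; the real list is an open question.
my @MINIMUM_IO_METHODS = qw(sysread syswrite close);

# Return the names of the minimum methods that $io does not provide.
# An empty list means the object satisfies this sketch's interface.
sub missing_io_methods {
    my ($io) = @_;
    return @MINIMUM_IO_METHODS unless defined $io;
    # eval guards against values that cannot take method calls at all.
    return grep { !eval { $io->can($_) } } @MINIMUM_IO_METHODS;
}
```

An application could call this on `$env->{'psgix.io'}` at startup or on the first request and fall back to plain PSGI responses when anything is missing.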

The Streaming Write Interface

PSGI allows one to use a 'streaming, delayed' approach:

my $app = sub {
  my $env = shift;
  return sub {
    my $responder = shift;
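    # Passing only status and headers makes the responder return a writer.
    my $writer = $responder->(
      [ 200, [ 'Content-Type', 'application/json' ] ]);
    # do something with $writer, e.g. $writer->write(...) then $writer->close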
  };
};
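For reference, a complete version of the streaming form might look like the following. The `write` and `close` calls are the two methods the PSGI specification defines on the writer. `Sketch::Writer` and `run_streaming_app` are hypothetical stand-ins for a server, included only so the example can run without one.

```perl
use strict;
use warnings;

my $app = sub {
    my $env = shift;
    return sub {
        my $responder = shift;
        # Status and headers only: the responder hands back a writer.
        my $writer = $responder->(
            [ 200, [ 'Content-Type', 'application/json' ] ]);
        $writer->write('[');
        $writer->write(join ',', 1 .. 3);
        $writer->write(']');
        $writer->close;
    };
};

# Hypothetical stand-in for the server side; collects output in memory.
{
    package Sketch::Writer;
    sub new   { return bless { out => '', closed => 0 }, shift }
    sub write { $_[0]{out} .= $_[1]; return }
    sub close { $_[0]{closed} = 1; return }
}

# Drive the app the way a server would, returning what it produced.
sub run_streaming_app {
    my ($psgi_app, $env) = @_;
    my $writer = Sketch::Writer->new;
    my $head;
    $psgi_app->($env)->(sub { $head = shift; return $writer });
    return ($head, $writer);
}
```

Note that every call here is synchronous and returns nothing useful, which is what the use cases below take issue with.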

In practice unmet use cases arise.

The $writer object has no interface for interoperating with a server event loop.

When you want to write nonblocking, streaming code, you have to write code that is strongly tied to the server's event loop. Ideally we'd expose an interface that would allow one to write non blocking code without that coupling.

The $writer object has no interface for reporting error conditions in the writing process.

When a stream fails midway, there is no clear way to detect, respond to, and handle the error.
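About the only thing an application can do today is guard each write and hope the server signals failure by throwing. The sketch below shows that workaround. It assumes a failed write raises an exception, which the specification does not promise and which servers may not do, and `write_or_report` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Attempt one write. Returns 1 on success, or 0 after handing the error
# to $on_error. Assumption: the server's writer dies when a write fails.
sub write_or_report {
    my ($writer, $chunk, $on_error) = @_;
    my $ok = eval { $writer->write($chunk); 1 };
    return 1 if $ok;
    my $err = $@ || 'unknown write error';
    $on_error->($err);
    return 0;
}
```

A server that silently drops data on a closed connection would never trigger the callback, which is why an explicit error-reporting interface on the writer is needed.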


jjn1056 commented Aug 21, 2015

Possibly replace Promises with Future


zostay commented Sep 14, 2015

I think I can implement all of this in Perl 6 without too much difficulty. Some things are almost part of P6SGI already because Perl 6 is async by definition and P6SGI starts by requiring applications be implemented through Promises and Supplies.
