In the PSGI specification we define a basic and encapsulated API for reading the 'body' part of an incoming HTTP message. In practice several use cases are not met:
The PSGI specification allows a PSGI server to buffer the input stream. When it does, the server must set 'psgix.input.buffered' to true. Buffering has some value, since it lets a server use highly optimized code to read the message body quickly, but the choice of whether a request is buffered is under the control of the server, not the application.
There are cases where an application wants more control over the reading and buffering process, for example when the expected upload is very large and only a small part of the data is important. In that case you may wish to read the input in chunks, examine each chunk for the information you seek, and discard the rest.
Additionally, the application cannot control what type of buffering storage is used. For example, if an upload is very large, an application may wish to direct it to a dedicated storage system such as Amazon S3, rather than being forced to first store it locally (in memory or in a temporary directory), re-read it, and then send it to the alternative storage. It would be ideal if one could read the input in lines and send the lines to the storage of choice.
Lastly, this can cause trouble for developers when the production system uses a server that does not buffer (such as FastCGI) while the development server does (such as Starman). This difference can lead to code that must differ between environments.
We propose that an environment flag such as "PSGI_DISALLOW_INPUT_BUFFERING" be added to the PSGI specification. When this flag is true, a server should not buffer, even if it has buffering logic built into it.
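To make the chunked-reading use case concrete, here is a minimal sketch of an application-side helper that scans the request body for a marker while holding only a small window in memory. It relies only on the 'read' method that the PSGI specification requires of 'psgi.input'. The helper name 'body_contains' and the chunk size are assumptions made for illustration; how a server would honour the proposed flag is not shown.

```perl
use strict;
use warnings;
use IO::File;    # lets the in-memory handle in the usage example respond to ->read

# Hypothetical helper (not part of PSGI): return 1 if $needle occurs in the
# request body, reading at most $chunk_size bytes at a time.
# Assumes $needle is a non-empty string.
sub body_contains {
    my ($env, $needle, $chunk_size) = @_;
    $chunk_size ||= 8192;

    my $input   = $env->{'psgi.input'};
    my $overlap = length($needle) - 1;   # bytes to keep between reads
    my $window  = '';

    while (1) {
        my $chunk;
        my $n = $input->read($chunk, $chunk_size);
        die "read error: $!" unless defined $n;
        last if $n == 0;                 # end of body

        $window .= $chunk;
        return 1 if index($window, $needle) >= 0;

        # Discard everything except the tail that could still begin a
        # match spanning two chunks.
        if (length($window) > $overlap) {
            $window = $overlap > 0 ? substr($window, -$overlap) : '';
        }
    }
    return 0;
}

# Usage with an in-memory handle standing in for an unbuffered psgi.input.
my $body = ('x' x 10_000) . 'MAGIC' . ('y' x 10_000);
open my $fh, '<', \$body or die "cannot open in-memory handle: $!";
my $found = body_contains({ 'psgi.input' => $fh }, 'MAGIC', 7);
```

With a server that buffers the whole body first, this helper still works, but the memory or disk cost has already been paid before the application sees the first byte, which is the problem the flag is meant to address.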
Open Questions
HTTP::Body overlaps here in unhelpful ways and doesn't really do streaming. We need to figure out what those overlap points are and what changes, if any, need to happen to HTTP::Body. Or at least determine whether we can ignore it for now!
What would the error conditions and responses be (for example, if the connection is lost halfway through the read)?
When a server is nonblocking, reading the input stream in a non blocking way is event loop dependent.
Currently, if a server is running under an event loop, the ways of reading input in a nonblocking manner tend to be loop specific and force programmers to target one event loop management system. Since there is no clear winner among the various event loops one can use with Perl, it would be better if you could write basic code without tying your application to one of them.
To improve interoperability between servers that run under various event loops (such as AnyEvent, EV, etc.), we propose that when a server runs under an event loop (and sets the "psgi.nonblocking" key to true), it should expose a to-be-determined interface for reading the message body in a nonblocking manner. For example, it could return a Promise object, which is a reasonable, unopinionated, and broadly accepted approach.
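Since the interface is still to be determined, the following is only a sketch of the shape it might take. Everything named here is hypothetical: the 'psgix.input.read_body' key, the 'TinyPromise' class (a deliberately tiny stand-in with no chaining, not any CPAN module), and the 'handle_body' helper. The point is that the application code depends only on 'then', not on a particular event loop, and that a lost connection can surface through the failure callback.

```perl
use strict;
use warnings;

# Hypothetical, minimal promise: callbacks only, no chaining.
package TinyPromise {
    sub new {
        my $class = shift;
        return bless { state => 'pending', value => undef, waiters => [] }, $class;
    }

    # Register success and failure callbacks; fire at once if already settled.
    sub then {
        my ($self, $on_done, $on_fail) = @_;
        push @{ $self->{waiters} }, [ $on_done, $on_fail ];
        $self->_notify unless $self->{state} eq 'pending';
        return $self;
    }

    sub resolve { $_[0]->_settle(done => $_[1]) }
    sub reject  { $_[0]->_settle(fail => $_[1]) }

    sub _settle {
        my ($self, $state, $value) = @_;
        return unless $self->{state} eq 'pending';    # settle only once
        @{$self}{qw(state value)} = ($state, $value);
        $self->_notify;
    }

    sub _notify {
        my $self = shift;
        while (my $pair = shift @{ $self->{waiters} }) {
            my $cb = $self->{state} eq 'done' ? $pair->[0] : $pair->[1];
            $cb->($self->{value}) if $cb;
        }
    }
}

# Application side: loop-agnostic, relies only on the hypothetical key.
sub handle_body {
    my ($env, $result) = @_;
    $env->{'psgix.input.read_body'}->()->then(
        sub { $result->{body}  = shift },    # whole body arrived
        sub { $result->{error} = shift },    # e.g. connection lost mid-read
    );
    return;
}

# Stand-in for a nonblocking server: it settles the promise later,
# from whatever event loop it happens to run under.
my $pending = TinyPromise->new;
my %result;
handle_body(
    { 'psgi.nonblocking' => 1, 'psgix.input.read_body' => sub { $pending } },
    \%result,
);
my $seen_early = exists $result{body} ? 1 : 0;    # still 0: nothing has arrived
$pending->resolve('hello=world');                 # the "server" delivers the body
```

A real proposal would more likely adopt an existing promise or future implementation than define a new one; the class above exists only so that the sketch runs with core Perl.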
Open Questions
What would the error conditions and responses be (for example, if the connection is lost halfway through the read)? This is the same question as in the previous proposal; perhaps the two can be answered together.
'psgix.io' is intended to be a low-level escape hatch for the server to expose a raw, bidirectional socket to the application. The specification describes it as "the raw IO socket to access the client connection to do low-level socket operations. This is only available in PSGI servers that run as an HTTP server, and should be used when (and only when) you want to jailbreak out of PSGI abstraction, to implement protocols over HTTP such as BOSH or WebSocket."
Now that it's becoming more common to write applications using protocols that run over HTTP (such as WebSocket), it would be great to have an interface for 'psgix.io' that is the same across servers. That way your code would not be server dependent.
A survey of common existing PSGI servers on CPAN seems to indicate that "IO::Socket::INET" is a reasonable common API choice.
HTTP::Server::PSGI          bless( \*Symbol::GEN2, 'IO::Socket::INET' );
Twiggy                      \*{'AnyEvent::Socket::'};
Starman                     bless( \*Symbol::GEN1, 'Net::Server::Proto::TCP' );
Feersum                     ????
Net::Async::HTTP::Server    bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Corona                      bless( \*Symbol::GEN2, 'Net::Server::Proto::TCP' );
Starlet                     bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Thrall                      bless( \*Symbol::GEN4, 'IO::Socket::INET' );
Starlight                   bless( \*Symbol::GEN1, 'IO::Socket::INET' );
Monoceros                   \*{'Monoceros::Server::$fh'};
We propose defining a minimum interface for 'psgix.io' that is agnostic across servers. This interface would include a common way to handle and expose error conditions. Use cases would include long polling and WebSocket interfaces that work on more than one server. Ideally, as with the proposal for psgi.input, we'd have a minimal, loop-agnostic way to expose an evented interface when an event loop is present, so that one can change loops and still expect code to work.
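One way to picture such a minimum interface is a thin wrapper that uses only the sysread, syswrite, and close builtins, which work on any socket handle regardless of the class it is blessed into (or whether it is blessed at all). The sketch below is an assumption about what the interface could look like, not a proposal for its final form: the class name 'MinimalIO' and the methods 'read_bytes', 'write_bytes', and 'finish' are all hypothetical. Errors are exposed uniformly by dying with the system error message.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical wrapper around whatever handle a server puts in 'psgix.io'.
package MinimalIO {
    sub new {
        my ($class, $fh) = @_;
        return bless { fh => $fh }, $class;
    }

    # Write all of $data, looping over partial writes; dies on error.
    sub write_bytes {
        my ($self, $data) = @_;
        my $offset = 0;
        while ($offset < length $data) {
            my $n = syswrite($self->{fh}, $data, length($data) - $offset, $offset);
            die "write failed: $!" unless defined $n;
            $offset += $n;
        }
        return $offset;
    }

    # Read up to $max bytes; returns undef at end of stream, dies on error.
    sub read_bytes {
        my ($self, $max) = @_;
        my $buf;
        my $n = sysread($self->{fh}, $buf, $max);
        die "read failed: $!" unless defined $n;
        return $n ? $buf : undef;
    }

    sub finish {
        my $self = shift;
        return close($self->{fh});
    }
}

# Usage: a local socket pair stands in for the client connection.
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
my $server_side = MinimalIO->new($left);
my $client_side = MinimalIO->new($right);

$server_side->write_bytes('ping');
my $received = $client_side->read_bytes(1024);
$server_side->finish;
my $at_eof = $client_side->read_bytes(1024);    # undef once the peer has closed
```

This sketch is blocking; the evented variant mentioned above would need an additional, loop-agnostic way to be told when the handle is readable or writable.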
PSGI allows one to use a 'streaming, delayed' approach:
    my $app = sub {
        my $env = shift;
        return sub {
            my $responder = shift;
            my $writer = $responder->(
                [ 200, [ 'Content-Type', 'application/json' ]]);
            # do something with $writer
        };
    };
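For reference, the writer object returned by the responder supports 'write' and 'close' under the PSGI specification. The sketch below fills in the "do something with $writer" step and drives the application with a mock responder so that it runs without a server. The 'MockWriter' class is hypothetical and exists only to capture the output.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a server's writer object; it just collects output.
package MockWriter {
    sub new   { return bless { out => '', closed => 0 }, shift }
    sub write { my ($self, $chunk) = @_; $self->{out} .= $chunk; return }
    sub close { my $self = shift; $self->{closed} = 1; return }
}

my $app = sub {
    my $env = shift;
    return sub {
        my $responder = shift;
        # Passing only status and headers asks the server for a writer.
        my $writer = $responder->(
            [ 200, [ 'Content-Type', 'application/json' ] ]);
        $writer->write('[');
        $writer->write(join ',', 1 .. 3);
        $writer->write(']');
        $writer->close;    # tells the server the response is complete
    };
};

# Mock server side: call the app, then hand its callback a responder.
my $mock = MockWriter->new;
my $status_and_headers;
$app->({})->(sub { $status_and_headers = shift; return $mock });
```

Note that nothing in this flow tells the application whether a 'write' reached the client, and the writes above happen synchronously.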
In practice unmet use cases arise.
If you want to write nonblocking, streaming code, you have to write code that is strongly tied to the server's event loop. Ideally we'd expose an interface that allows one to write nonblocking code that is not tied to a particular event loop.
When a stream fails midway, there is no clear way to detect and handle the error.
Possibly replace Promises with Future.