@t0yv0
Created June 30, 2011 01:07
An attempt to improve on OWIN <http://owin.org> in F#.
(*
// # FSCGI - F# Common Gateway Interface
//
// This gist is a response to OWIN <http://owin.org/>.
// This gist is public domain.
//
// See also the discussion stating OWIN rationale and how it is better than FSCGI:
// http://groups.google.com/group/net-http-abstractions/browse_thread/thread/ac3d7c1e3d43c1d4
//
// ## Problem Summary
//
// The goal of OWIN is to provide a low-level .NET standard for web apps to
// communicate with their environments, filling a role similar to that of the
// Java Servlet Specification, FastCGI, SCGI (Python) and the like.
//
// OWIN as of Mar 13, 2011 is problematic in the following respects:
//
// * poor use of static typing, for example the use of Dictionary<String,Object>
// * gratuitous complexity of the definition (nested Func<_> callbacks)
// * reliance on prose to communicate invariants that can be type-enforced
//
// ## Solution Summary
//
// * use more explicit typing
// * use an iteratee-based representation for the IO process
// * simplify exception handling by the rule: application code MUST NOT throw
// any exceptions; doing so indicates a programming error and will be
// handled in a host-dependent way.
//
*)
/// Defines the common gateway interface protocol for F#.
namespace FSCGI

type Data = System.ArraySegment<byte>
type Headers = Map<string,string>
type StatusCode = int
type Status = string

type Request =
    {
        Headers : Headers
        Method : string
        Path : string
        PathBase : string
        QueryString : string
        Scheme : string
    }

type State =
    | Closed
    | Open

type Writer =
    | Done
    | Write of (State -> Data * Writer)

type Response =
    | Read of (option<Data> -> Response)
    | Respond of StatusCode * Status * Headers * Writer

type Application =
    Request -> Response

/// Converts structural encodings to proper FSCGI.* types.
module FSCGI.Structural.Converter

type private Encodings<'R,'W> =
    ('W -> Writer<'W>) *
    ('R -> Response<'R,'W>)

let private ConvertRequest (r : FSCGI.Request) : Request =
    (
        r.Headers,
        r.Method,
        r.Path,
        r.PathBase,
        r.QueryString,
        r.Scheme
    )

let rec private ConvertWriter ((eW, _) as enc : Encodings<'R,'W>)
                              (w: Writer<'W>) : FSCGI.Writer =
    match w with
    | Choice1Of2 () ->
        FSCGI.Done
    | Choice2Of2 f ->
        FSCGI.Write (fun state ->
            let isOpen =
                match state with
                | FSCGI.Closed -> false
                | FSCGI.Open -> true
            let (data, writer) = f isOpen
            (data, ConvertWriter enc (eW writer)))

let rec private ConvertResponse ((eW, eR) as enc : Encodings<'R,'W>)
                                (r: Response<'R,'W>) : FSCGI.Response =
    match r with
    | Choice1Of2 f ->
        FSCGI.Read (fun d -> ConvertResponse enc (eR (f d)))
    | Choice2Of2 (a, b, c, d) ->
        FSCGI.Respond (a, b, c, ConvertWriter enc (eW d))

let Convert ((eW, eR, run): Application<'R,'W>) : FSCGI.Application =
    ConvertResponse (eW, eR) << run << ConvertRequest

/// Provides structural encodings for the FSCGI.* types.
namespace FSCGI.Structural

type Data = System.ArraySegment<byte>
type HeaderName = string
type HeaderValue = string
type Headers = Map<HeaderName,HeaderValue>
type Method = string
type Path = string
type PathBase = string
type QueryString = string
type Scheme = string
type StatusCode = int
type Status = string

type Request =
    Headers * Method * Path * PathBase * QueryString * Scheme

type Writer<'W> =
    Choice<
        unit,
        bool -> Data * 'W
    >

type Response<'R,'W> =
    Choice<
        option<Data> -> 'R,
        StatusCode * Status * Headers * 'W
    >

type Application<'R,'W> =
    ('W -> Writer<'W>) *
    ('R -> Response<'R,'W>) *
    (Request -> Response<'R,'W>)

@t0yv0
t0yv0 commented Jun 30, 2011

Thanks for the questions; they made me check the definition and spot that it did not compile.

As you spotted, the application type was wrong, and so was the Response type.

Basically, structural types do not allow us to construct recursive types directly, so we need to encode them with generics and ask the user to provide the bijection between 'W and Writer<'W>, and between 'R and Response<'R,'W>. Luckily, we only seem to need the projection part ('W -> Writer<'W>) to be able to reconstruct FSCGI.Application from FSCGI.Structural.Application.

I attach the conversion code to make it explicit.

The encoding makes 'W and Writer<'W> interchangeable. Turns out we also need 'R for Response encoding.

@t0yv0
t0yv0 commented Jun 30, 2011

To clarify structural encodings even more, what we expect the user to do is to define:

type W = W of Writer<W>
type R = R of Response<R,W>

let App : Application<R,W> =
  ((fun (W x) -> x), (fun (R x) -> x), ...)
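
To make the elided third component concrete, here is a minimal sketch of a complete structural application. The structural types are repeated (with the aliases inlined) so the snippet runs as a standalone script; `body`, `writeOnce` and the response values are made up for illustration and are not part of the gist.

```fsharp
// Structural types repeated from FSCGI.Structural (aliases inlined).
type Data = System.ArraySegment<byte>
type Headers = Map<string,string>
type Request = Headers * string * string * string * string * string
type Writer<'W> = Choice<unit, (bool -> Data * 'W)>
type Response<'R,'W> = Choice<(option<Data> -> 'R), (int * string * Headers * 'W)>
type Application<'R,'W> =
    ('W -> Writer<'W>) * ('R -> Response<'R,'W>) * (Request -> Response<'R,'W>)

// Nominal wrappers that tie the recursive knot.
type W = W of Writer<W>
type R = R of Response<R,W>

// Hypothetical payload for the example.
let body = System.Text.Encoding.UTF8.GetBytes "hello"

// A writer that emits one chunk and then reports that it is finished.
let writeOnce : W =
    W (Choice2Of2 (fun (_isOpen : bool) -> (Data(body), W (Choice1Of2 ()))))

// Ignores the request and responds immediately without reading the body.
let App : Application<R,W> =
    ((fun (W x) -> x),
     (fun (R x) -> x),
     (fun (_ : Request) -> Choice2Of2 (200, "OK", Map.empty, writeOnce)))
```

The first two tuple components are just the unwrapping projections; only the third carries application logic.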

@t0yv0
t0yv0 commented Jun 30, 2011

I am amazed at the amount of notice Node.js gets; to us F# bigots it is totally unwarranted! :) I am not familiar with its workings, but I strongly suspect that what they are doing falls into the general framework of becoming more explicit about scheduling and doing it at the language level.

The trend is towards tackling async programming with language-level cooperative threads. State of the art for me is OCaml Lwt ("light-weight threads"), by now old and proven. OCaml's single-process model dictated this kind of design, where processes are encoded as 'a Lwt.t values with a monadic interface. Process interleaving is done by the library itself during the bind of the monad (hence the cooperative behavior). OCaml users then found out that it actually performs great!

With WebSharper we are in the same boat on top of the JavaScript runtime - we have no threads available to the language. For the upcoming release we just did the LWT trick - Joel Bjornson re-implemented Async support to make use of a round-robin scheduler.

Iteratees do the same; they are just specialized for IO. What they actually do, if you think about it, is encode processes as explicit state machines, with every state a value and every transition a function. Essentially the same as LWT, except the states are more constrained.
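
A small sketch of the "every state a value, every transition a function" idea, using the FSCGI.Response type from the gist (types repeated so the snippet stands alone). The `counting` and `text` helpers are hypothetical, and the snippet assumes that `None` means "no more input", which the gist does not state explicitly.

```fsharp
type Data = System.ArraySegment<byte>
type Headers = Map<string,string>
type State = Closed | Open
type Writer =
    | Done
    | Write of (State -> Data * Writer)
type Response =
    | Read of (option<Data> -> Response)
    | Respond of int * string * Headers * Writer

// Hypothetical helper: a writer that emits a string as one UTF-8 chunk.
let text (s : string) : Writer =
    let bytes = System.Text.Encoding.UTF8.GetBytes s
    Write (fun _ -> (Data(bytes), Done))

// `counting n` is one state of the machine: "n bytes seen so far".
// The function inside Read is the transition to the next state.
let rec counting (total : int) : Response =
    Read (fun chunk ->
        match chunk with
        | Some data -> counting (total + data.Count)
        // Assumption: None signals the end of the request body.
        | None -> Respond (200, "OK", Map.empty, text (string total)))
```

Nothing runs until a host applies the function inside `Read`, so the host decides when each transition happens.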

@t0yv0
t0yv0 commented Jun 30, 2011

The last commit eliminates the flexibility to start responding before the reading is complete, fixing the sequence of events: read-request, emit-headers, write-response - making the definition a bit simpler.

@panesofglass

Funny, I was looking at the various Iteratee implementations, looked at some state machine code I had written for comparison, and was going to ask you whether Iteratee was essentially a constrained state machine, which you have already noted. :) I need to look into OCaml LWT. I'm not familiar with that.

@panesofglass

Before I throw this into Frank, let me make sure I understand correctly. In this implementation, there is no Async, correct? This is all Iteratee-based? In other words, it uses a lazy state machine approach to construct the next chunk as needed rather than asynchronously accessing the underlying Stream? Would the underlying Stream access use Async? Is AsyncSeq a potentially appropriate mechanism for reading each chunk?

@t0yv0
t0yv0 commented Jun 30, 2011

Yes, this is completely synchronous on purpose. The application wouldn't be able to use async in any way other than Async.RunSynchronously. The host might use Async or something else to multiplex serving several requests / application instances. I was thinking that, after all, that's the only difference in functionality from OWIN: in OWIN the application can do asynchronous stuff because OWIN gives it result-expecting callbacks.

Also, to clear things up a bit: if we had a mutable interface for the application, here's how the host would drive it:

app.Request <- ...
while (input.DataAvailable && app.AcceptsData) do
    app.Accept(input.Read())
output.Write(app.GetResponse())
while (app.HasMoreData()) do
    output.Write(app.Read())

The F# hackery above is just trying to guarantee that these things happen in the order listed. The host also has a good chance to multiplex several apps:

app.Request <- ...
while (input.DataAvailable && app.AcceptsData) do
    (* here we might switch between applications *)
    app.Accept(input.Read())
output.Write(app.GetResponse())
while (app.HasMoreData()) do
    (* here we might switch between applications *)
    output.Write(app.Read())
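
For comparison, here is a rough sketch of the same loop written against the immutable FSCGI types (repeated so the snippet stands alone). `runHost` and `hello` are hypothetical names; the host is an in-memory stand-in that takes its input as a list of chunks, and it assumes the application responds once it has been given `None`.

```fsharp
type Data = System.ArraySegment<byte>
type Headers = Map<string,string>
type Request =
    { Headers : Headers; Method : string; Path : string
      PathBase : string; QueryString : string; Scheme : string }
type State = Closed | Open
type Writer =
    | Done
    | Write of (State -> Data * Writer)
type Response =
    | Read of (option<Data> -> Response)
    | Respond of int * string * Headers * Writer
type Application = Request -> Response

/// Hypothetical in-memory host: feeds `input` to the app, then drains the writer.
let runHost (app : Application) (request : Request) (input : list<Data>) =
    // Phase 1: feed body chunks until the app produces its response head.
    let rec read (resp : Response) (chunks : list<Data>) =
        match resp, chunks with
        | Respond (code, status, headers, writer), _ -> (code, status, headers, writer)
        | Read f, chunk :: rest -> read (f (Some chunk)) rest
        // Assumption: the app stops reading once it has been given None.
        | Read f, [] -> read (f None) []
    // Phase 2: pull output chunks until the writer is Done.
    let rec drain (writer : Writer) (acc : list<Data>) =
        match writer with
        | Done -> List.rev acc
        | Write f ->
            let (data, next) = f Open
            drain next (data :: acc)
    let (code, status, headers, writer) = read (app request) input
    (code, status, headers, drain writer [])

// Hypothetical app that ignores the body and answers with "hi".
let hello : Application =
    fun _ ->
        let bytes = System.Text.Encoding.UTF8.GetBytes "hi"
        Respond (200, "OK", Map.empty, Write (fun _ -> (Data(bytes), Done)))
```

The two recursive calls are the points where a real host could switch between applications, since each step returns a plain value.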

@panesofglass

The reason I added AsyncBody versions of the Body types in the new Frank signatures was to give applications the ability to say, "I need to do some look-ups on my own; check back with me in a bit." Your examples would be useful when an application can produce its results immediately, but I don't think they address the common need to call out to a database or web service from the server. Or am I missing that aspect?

@panesofglass

Hmm... I suppose the application could use Async.StartWithContinuations and supply the Response write mechanism as a callback. That should still facilitate what you've described above. The application would then control any internal asynchronicity.

@t0yv0
t0yv0 commented Jun 30, 2011

Spot on, the "check back in a bit" scenario is prohibited. The application will be stuck with blocking (doing Async.RunSynchronously) if it wants to talk to the network or the database.
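
Roughly, the blocking style looks like the sketch below. `lookup` is a hypothetical stand-in for a database or web-service call, and the app takes just a path instead of the full Request record to keep the snippet short.

```fsharp
type Data = System.ArraySegment<byte>
type Headers = Map<string,string>
type State = Closed | Open
type Writer =
    | Done
    | Write of (State -> Data * Writer)
type Response =
    | Read of (option<Data> -> Response)
    | Respond of int * string * Headers * Writer

// Hypothetical stand-in for a database or web-service call.
let lookup (key : string) : Async<string> =
    async {
        do! Async.Sleep 10
        return "value-for-" + key
    }

// Simplified app: a path in, a Response out.
let app (path : string) : Response =
    // Blocks the calling thread until the lookup completes.
    let value = Async.RunSynchronously (lookup path)
    let bytes = System.Text.Encoding.UTF8.GetBytes value
    Respond (200, "OK", Map.empty, Write (fun _ -> (Data(bytes), Done)))
```

The host thread that calls `app` is held for the whole duration of the lookup, which is the cost of keeping the interface synchronous.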

If we allow this "check back in a bit", I think the interface quickly becomes isomorphic to OWIN.

What I am struggling to grasp right now is: are these "check back in a bit" scenarios really safe? Or, how do we write apps that are safe?

It is just so easy to trip. Consider Petricek's AsyncSeq: readInBlocks is broken because it does not close the file descriptor. If it were to close it, when would it? And how can the reader signal lack of interest in the rest of the sequence?
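
For what it's worth, the `State` argument in FSCGI.Writer is one possible answer to that last question. The gist does not spell out its semantics, so the sketch below is only one reading: the host passes `Closed` when it has lost interest, and the writer uses that as its cue to release resources. `streamBlocks` is a hypothetical helper.

```fsharp
open System.IO

type Data = System.ArraySegment<byte>
type State = Closed | Open
type Writer =
    | Done
    | Write of (State -> Data * Writer)

/// Streams `stream` in blocks of up to `size` bytes. Disposes the stream when
/// the data runs out, or when the host reports Closed (assumed to mean
/// "the reader is no longer interested").
let rec streamBlocks (size : int) (stream : Stream) : Writer =
    Write (fun state ->
        match state with
        | Closed ->
            stream.Dispose ()
            (Data(Array.empty<byte>), Done)
        | Open ->
            let buffer = Array.zeroCreate<byte> size
            let n = stream.Read (buffer, 0, size)
            if n = 0 then
                // End of data: release the resource and finish.
                stream.Dispose ()
                (Data(Array.empty<byte>), Done)
            else
                (Data(buffer, 0, n), streamBlocks size stream))
```

This still relies on the host calling the writer at least once more after it loses interest; a host that simply drops the `Writer` value would leak the stream.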

@panesofglass

Lots to think about. With Fracture, we have the pipeline model which uses agents to progress things along. We could always leverage that within Frank to allow for delayed or long-running work to not block the current stuff. Of course, if we have a number of agents running, they are not preventing the primary server from blocking anyway, so we're probably safe. Another aspect we are planning is to have load management be able to spin up new agents as necessary, so again, blocking within a given application shouldn't affect the overall system.
