Skip to content

Instantly share code, notes, and snippets.

@heapwolf
Forked from dominictarr/pull-streams.md
Last active November 26, 2015 10:59
Show Gist options
  • Save heapwolf/d3cd1cdca46e90011ad0 to your computer and use it in GitHub Desktop.
Save heapwolf/d3cd1cdca46e90011ad0 to your computer and use it in GitHub Desktop.

Synopsis

In Pull-Streams, there are two fundamental types of streams Sources and Sinks. There are two composite types of streams Through (aka transform) and Duplex. A Through Stream is a sink stream that reads what goes into the Source Stream, it can also be written to. A duplex stream is a pair of streams ({Source, Sink}) streams.

Pull-Streams

Source Streams

A Source Stream (aka readable stream) is a async function that may be called repeatedly until it returns a terminal state. Pull-streams have back pressure, but it implicit instead of sending an explicit back pressure signal. If a source needs the sink to slow down, it may delay returning a read. If a sink needs the source to slow down, it just waits until it reads the source again.

Read

A method, for example read(null, cb(end|err)), will read data from the stream zero or more times. The read method must not be called until the previous call has returned, except for a call to abort the stream.

End

The stream may be terminated, for example cb(err|end). The read method must not be called after it has terminated. As a normal stream end is propagated up the pipeline, an error should be propagated also, because it also means the end of the stream. If cb(end=true) that is a "end" which means it's a valid termination, if cb(err) that is an error. error and end are mostly the same. If you are buffering inputs and see an end, process those inputs and then the end. If you are buffering inputs and get an error, then you may throw away that buffer and return the end.

Abort

Sometimes it's the sink that errors, and if it can't read anymore then we must abort the source. (example, source is a file stream from local fs, and sink is a http upload. prehaps the network drops or remote server crashes, in this case we should abort the source, so that it's resources can be released.)

To abort the sink, call read with a truthy first argument. You may abort a source before it has returned from a regular read. (if you wait for the previous read to complete, it's possible you'd get a deadlock, if you a reading a stream that takes a long time, example, tail -f is reading a file, but nothing has appended to that file yet).

Sink Streams

A Sink Stream (aka writable stream) is a function that a Source Stream is passed to. The Sink Stream calls the read function of the Source Stream, abiding by the rules about when it may not call.

Abort

The Sink Stream may also abort the source if it can no longer read from it.

Through Streams

A through stream is a sink stream that returns another source when it is passed a source. A through stream may be thought of as wrapping a source.

Duplex Streams

A pair of independent streams, one Source and one Sink. The purpose of a duplex stream is not transformation of the data that passes though it. It's meant for communication only.

Composing Streams

Since a Sink is a function that takes a Source, a Source may be fed into a Sink by simply passing the Source to the Sink. For example, sink(source). Since a transform is a Sink that returns a Source, you can just add to that pattern by wrapping the source. For example, sink(transform(source)). This works, but it reads from right-to-left, and we are used to left-to-right.

A method for creating a left-to-rihght reading pipeline of pull-streams. For example, a method could implement the following interface...

pull([source] [,transform ...] [,sink ...])

The interface could alllow for the following scenarios...

  1. Connect a complete pipeline: pull(source, transform,* sink) this connects a source to a sink via zero or more transforms.

  2. If a sink is not provided: pull(source, transform+) then pull should return the last source, this way streams can be easily combined in a functional way.

  3. If a source is not provided: pull(transform,* sink) then pull should return a sink that will complete the pipeline when it's passed a source. function (source) { return pull(source, pipeline) } If neither a source or a sink are provided, this will return a source that will return another source (via 2) i.e. a through stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment