tabatkins/pipeline.md

## pipeline.md

      
We've been deadlocked for a while on the pipeline operator proposal, for a few reasons.
Partially it's just low implementor interest, as this is fundamentally just syntactic sugar,
but also because there are three competing proposals
and the proponents of each haven't been convinced by the others yet.
In this essay I hope to briefly outline the problem space,
summarize the three proposals,
and talk about what's gained/lost by each of them.
(Spoiler: they're all nearly identical; we're arguing over very small potatoes.)
What is the Pipeline Operator?

When you call a JS function on a value, there are currently two fundamental ways to do so:
passing the value as an argument (nesting the functions if there are multiple calls),
or calling the function as a method on the value (chaining more method calls if there are multiple).
That is, three(two(one(val))) vs val.one().two().three().
The first style, nesting, is generally applicable - it works for any function and any value.
However, it's difficult to read as the nesting increases:
the flow of execution moves right-to-left,
rather than the left-to-right reading of normal code execution;
if there are multiple arguments at some levels it even bounces back and forth,
as your eyes jump right to find a function name and left to find the additional arguments;
and editing the code afterwards can be fraught
as you have to find the correct place to insert new arguments
among many difficult-to-distinguish parens.
The second style, chaining, is only usable if the value has the functions designated as methods for its class.
This limits its applicability,
but when it applies,
it's generally more usable and easier to read and write:
execution flows left to right;
all the arguments for a given function are grouped with the function name;
and editing the code later to insert or delete more function calls is trivial,
since you just have to put your cursor in one spot and start typing
or delete one contiguous run of characters with a clear separator.
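
To make the contrast concrete, here is a minimal sketch of the same computation in both styles. The helper functions and the Num wrapper class are hypothetical, invented purely for illustration.

```javascript
// Hypothetical helpers, invented for this illustration.
const double = (n) => n * 2;
const add = (a, b) => a + b;
const square = (n) => n * n;

// Nesting: works with any function, but reads inside-out.
// The extra argument to add() sits far away from the function name.
const nested = square(add(double(5), 1));

// Chaining: reads left to right, but only works because the
// class below was written to expose these operations as methods.
class Num {
  constructor(value) { this.value = value; }
  double() { return new Num(double(this.value)); }
  add(n) { return new Num(add(this.value, n)); }
  square() { return new Num(square(this.value)); }
}
const chained = new Num(5).double().add(1).square().value;

console.log(nested, chained); // 121 121
```

Note how much wrapper code the chained version needs before the one readable line at the end; that wrapper is the "contortion" the nesting style avoids.
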
The benefits of method chaining are so attractive
that some very popular libraries contort their code structure specifically to allow more method chaining.
(jQuery being the elephant in the room,
as it's still the most popular JS library in the world,
and its core design is a single über-object with a jillion methods on it,
all of which return the same object type so you can continue chaining.)
The pipeline operator attempts to marry the convenience and ease of method chaining
with the wide applicability of function nesting.
The general structure of all the pipeline operators is
val |> one() |> two() |> three(),
where one, two, and three are all ordinary functions that take the value as an argument;
the |> glyph then does some degree of magic to "pipe" the value from the LHS into the function.
The three pipeline proposals just differ slightly on what the "magic" is,
and thus on precisely how you spell your code when using |>.
Proposal One: F#-style

In this proposal, matching the F# language's pipeline syntax,
the RHS of the pipeline is an expression that must resolve to a function,
which is then called with the LHS as its sole argument.
That is, you write val |> one |> two |> three to pipe val thru the three functions;
the general syntax transform is that left |> right becomes right(left).
Pro: The restriction that the RHS must resolve to a function lets you write very terse pipelines
when the operation you want to perform is already a named function.
Pro: Only one new bit of syntax needs to be minted, the |> itself.
(The others require a placeholder syntax as well.)
Con: The restriction means that any operations that are performed by other syntax
must be done by wrapping the operation in an arrow func:
val |> x=>x[0],
val |> x=>x.foo(),
val |> x=>x+1,
val |> x=>new Foo(x)
etc.
Even calling a named function requires wrapping,
if you need to pass more than one argument:
val |> x=>one(1, x).
Con: The yield and await keywords are scoped to their containing function,
and thus can't be handled by the arrow-func workaround from the previous paragraph.
If you want to integrate them into the pipeline at all
(rather than requiring the pipeline to be paren-wrapped and prefixed with await),
they need to be handled as a special syntax case:
val |> await |> one to simulate one(await val), etc.
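
Since |> isn't valid JavaScript today, the F#-style transform can only be approximated in running code. The sketch below uses a hypothetical pipeF helper (not part of any proposal) that applies each function to the previous result, which is roughly what val |> one |> two would do. The functions one, two, and add are likewise made up for the example.

```javascript
// Hypothetical helper emulating the F#-style transform:
// `left |> right` becomes `right(left)`, applied step by step.
const pipeF = (value, ...fns) => fns.reduce((acc, fn) => fn(acc), value);

// Hypothetical functions to pipe through.
const one = (n) => n + 1;
const two = (n) => n * 2;
const add = (a, b) => a + b;

// val |> one |> two
// Terse, because both steps are already named unary functions.
const terse = pipeF(5, one, two); // two(one(5))

// val |> x=>add(1, x) |> x=>[x] |> x=>x[0]
// Everything that isn't a bare unary call needs an arrow wrapper.
const wrapped = pipeF(5, (x) => add(1, x), (x) => [x], (x) => x[0]);

console.log(terse, wrapped); // 12 6
```

An await step can't be emulated with an arrow wrapper here either, for the reason given above: the keyword would belong to the wrapper function rather than to the function containing the pipeline.
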
Proposal Two: Hack-style

In this proposal, matching the Hack language's pipeline syntax,
the RHS of the pipeline is an expression containing a special placeholder variable,
which is evaluated with the placeholder bound to the LHS value.
That is, you write val |> one(#) |> two(#) |> three(#) to pipe val thru the three functions;
the general syntax transform is that left |> right(#) becomes right(left).
Pro: The RHS can be any expression, and the placeholder can go anywhere any normal variable identifier could go,
so you can pipe to any code you want without any special rules at all:
val |> one(#) for functions,
val |> one(1, #) for multi-arg functions,
val |> #.foo() for method calls
(or val |> obj.foo(#), for the other side),
val |> # + 1 for math,
val |> new Foo(#) for constructing,
val |> await # for awaiting promises,
etc.
Con: If all you're doing is piping thru already-defined unary functions,
it's slightly more verbose than F#-style since you need to actually write the function call syntax,
adding a (#) to it.
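
One way to picture Hack-style is as a chain of temporary variables, with each # standing for the previous step's value. The sketch below hand-desugars a pipeline that way; it is an illustration of the idea rather than the proposal's actual specification, and one, add, and Foo are hypothetical names.

```javascript
// Hypothetical functions and class, invented for illustration.
const one = (n) => n + 1;
const add = (a, b) => a + b;
class Foo {
  constructor(v) { this.v = v; }
  bar() { return this.v * 10; }
}

// Proposed syntax (not valid JavaScript today):
//   const result = 5 |> one(#) |> add(1, #) |> # + 1 |> new Foo(#) |> #.bar();
// Hand-desugared: every step is an ordinary expression in which
// the placeholder is replaced by the previous step's value.
const s1 = one(5);       // one(#)
const s2 = add(1, s1);   // add(1, #)
const s3 = s2 + 1;       // # + 1
const s4 = new Foo(s3);  // new Foo(#)
const result = s4.bar(); // #.bar()

console.log(result); // 80
```

No step needed a wrapper function, which is also why an await # step would work: it stays in the scope of the enclosing function.
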
Proposal Three: "Smart Mix"

In this proposal, mixing the previous two proposals,
the RHS of the pipeline is either an expression containing a special placeholder variable,
which is evaluated with the placeholder bound to the LHS value,
or a bare (dotted-)ident which must resolve to a function,
which is called with the LHS as its sole argument.
That is, you write val |> one |> # + 2 |> three.foo to pipe val thru the expressions;
the general syntax transform is that either left |> right(#) or left |> right becomes right(left).
Pro: It can handle all Hack-style cases with identical syntax,
and additionally can handle most F#-style cases with identical syntax.
Con: If you're doing any higher-level programming to produce a function which'll eventually get called,
you do need to use the placeholder;
you can't rely on F#-style evaluation.
That is, if you have a logger(fn) function that modifies a function to log its args and return value,
you must pipe it like val |> logger(one)(#);
writing val |> logger(one) is a syntax error.
(This avoids garden-path issues,
where it's not clear which syntax you're dealing with
until you either spot the # or establish that there isn't one.
Instead, the F#-style cases are trivial to spot because they're so restricted in syntax,
and all the rest are Hack-style.)
Con: As a result of trying to split the difference between the two syntaxes,
it ends up with a more complex mental model.
If a pipeline RHS is currently written in F#-style,
and you realize you want to do something else to it,
you have to recognize that it must be shifted to Hack-style syntax.
(Luckily it's a syntax error if you forget,
rather than just evaluating to the wrong thing,
so at least it's easy to recognize after-the-fact that this is needed.)
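
The decision rule described above can be sketched as a tiny classifier over the RHS source text. This is a deliberately simplified, hypothetical illustration: classifyRhs and its regex are my own invention, and a real parser would work on tokens rather than strings (for instance, a # inside a string literal would fool this version).

```javascript
// Simplified, hypothetical classifier for a Smart Mix pipeline RHS.
// Matches a bare identifier or a dotted chain of identifiers.
const DOTTED_IDENT = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;

function classifyRhs(source) {
  const rhs = source.trim();
  // Any placeholder means the RHS is evaluated with # bound to the LHS.
  if (rhs.includes("#")) return "hack-style";
  // A bare (dotted-)ident is called with the LHS as its sole argument.
  if (DOTTED_IDENT.test(rhs)) return "fsharp-style";
  // Anything else is rejected up front rather than failing at runtime.
  return "syntax error";
}

console.log(classifyRhs("one"));            // fsharp-style
console.log(classifyRhs("three.foo"));      // fsharp-style
console.log(classifyRhs("# + 2"));          // hack-style
console.log(classifyRhs("logger(one)(#)")); // hack-style
console.log(classifyRhs("logger(one)"));    // syntax error
```
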
What's The Difference?

Ultimately... very little.
As long as F#-style has the special syntax forms for await and yield,
then all the proposals can handle all the same cases with nearly identical syntax.
They just pay a very small syntax tax for some cases vs the other proposals.
Specifically, F#-style is optimized solely for calling unary functions.
Anything else pays a syntax tax of three characters over what Hack-style would do,
with an x=> prefix to introduce the arrow-function wrapper.
(This assumes the parsing issues are resolved to allow arrow-funcs as RHS without needing parens;
this isn't certain to happen, as it involves some significant tradeoffs.
If it doesn't happen, the tax is actually five characters, placed both before and after your code.)
Hack-style is optimized for arbitrary expressions,
but if you're calling a named unary function,
it pays a syntax tax of three characters over what F#-style would do,
with a (#) suffix to actually invoke the function.
"Smart Mix" only pays a syntax tax (a (#) suffix, just like Hack-style)
if it wants to invoke an anonymous unary function
produced as the result of an expression.
However, it pays a mental tax of having two separate syntax models.
...and that's it. That's all the difference is between the proposals.
A three-character tax on RHSes in various situations.
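
The arithmetic is easy to check by counting characters. The snippet below spells a few sample steps in both styles and subtracts the lengths; the sample steps are my own, and the F#-style column assumes the optimistic case where arrow functions don't need wrapping parens.

```javascript
// Each row spells the same pipeline step in both styles.
// The F#-style column assumes arrow functions are allowed as an
// unparenthesized RHS.
const steps = [
  { what: "unary call",   fsharp: "one",         hack: "one(#)" },
  { what: "two-arg call", fsharp: "x=>one(1,x)", hack: "one(1,#)" },
  { what: "method call",  fsharp: "x=>x.foo()",  hack: "#.foo()" },
  { what: "arithmetic",   fsharp: "x=>x+1",      hack: "#+1" },
];

// Positive numbers mean F#-style is longer; negative, Hack-style is.
const taxes = steps.map(({ what, fsharp, hack }) => ({
  what,
  tax: fsharp.length - hack.length,
}));

console.log(taxes.map((t) => `${t.what}: ${t.tax}`).join(", "));
// unary call: -3, two-arg call: 3, method call: 3, arithmetic: 3
```
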
My Preferences

Over time, I've become strongly in favor of Hack-style pipelines.
I think that the case of "unary function" will in general be less common than "everything besides unary functions",
so it makes more sense to put the tax on the first category rather than the second.
(In particular, method-calling and non-unary function-calling are big cases
that will never not be popular;
I think those two on their own will equal or exceed the usage of unary functions,
let alone including all the other syntax that Hack-style can do without a tax.)
I also think the reason the tax is incurred makes Hack-style more usable;
the syntax tax of Hack-style (the (#) to invoke the RHS) isn't a special case,
it's just writing ordinary code in the way you normally would without a pipeline.
On the other hand, F#-style requires you to distinguish between "code that resolves to a unary function"
and "anything else",
and remember to add the arrow-function wrapper around the latter case.
val |> foo + 1 is still a syntactically valid RHS,
it'll just fail at runtime because the RHS isn't callable.
You can avoid having to make this recognition by always wrapping it in an arrow func,
but then you're paying the tax 100% of the time
and effectively just writing a slightly more verbose Hack-style.
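
The runtime failure is easy to reproduce by hand-desugaring the F#-style transform. In this sketch foo is a hypothetical unary function, and (foo + 1)(5) stands in for what 5 |> foo + 1 would evaluate to under F#-style.

```javascript
const foo = (n) => n * 2; // hypothetical unary function

// F#-style `val |> foo + 1` desugars to `(foo + 1)(val)`.
// `foo + 1` is a string (the function's source text plus "1"),
// so nothing is wrong syntactically; the call only fails when run.
let failure = null;
try {
  (foo + 1)(5);
} catch (err) {
  failure = err;
}
console.log(failure instanceof TypeError); // true

// What was meant, with the arrow-function wrapper:
// `val |> x=>foo(x) + 1` desugars to `(x => foo(x) + 1)(val)`.
const intended = ((x) => foo(x) + 1)(5);
console.log(intended); // 11
```
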
This is also why I'm against "Smart Mix" -
if you use the bare-ident F#-style case,
and you realize you actually need to do anything more complicated,
you need to remember to do a slight code rewrite
beyond the actual change you want to make.
(Tho it's a syntax error, not a runtime error, when you forget,
so it's easier to catch.)
You can avoid it by always writing Hack-style,
but then there's no point to the mixture.
(Tho at least if you do so,
you're not paying the tax that F#-style does in all the other cases.)
All that said, the benefits of pipeline are so nice
(again, part of the reason for jQuery's success was the syntax niceness of method chaining!)
and the costs of each syntax's non-optimal case so low,
that I'd happily take F#-style over nothing.
So my preferences are Hack-style > Smart Mix > F#-style >>>> not doing pipeline at all.