@jkrems jkrems/generators.md
Last active Sep 11, 2018

Generators Are Like Arrays

In all the discussions about ES6 one thing is bugging me. I'm picking one random comment here from this io.js issue but it's something that comes up over and over again:

There's sentiment from one group that Node should have full support for Promises. While at the same time another group wants generator syntax support (e.g. var f = yield fs.stat(...)).

People keep putting generators, callbacks, co, thunks, control flow libraries, and promises into one bucket. If you read that list and you think "well, they are all kind of doing the same thing", then this is to you.

There are three distinct categories that they fall into:

  1. Models/Abstractions of async behavior
  2. Control flow management
  3. Data structures

Generators are category 3. Generators are like arrays. Don't believe me? Here's some code:

function *doStuff() {
  yield fs.readFile.bind(null, 'hello.txt');
  yield fs.readFile.bind(null, 'world.txt');
  yield fs.readFile.bind(null, 'and-such.txt');
}

What does it do? Not much actually. It just creates a list of functions that take a callback. We can even iterate over it:

for (var task of doStuff()) {
  // task is a function that takes a good ol' callback
}

Written down using boring, ES5 code:

function doStuff() {
  return [
    fs.readFile.bind(null, 'hello.txt'),
    fs.readFile.bind(null, 'world.txt'),
    fs.readFile.bind(null, 'and-such.txt')
  ];
}

Ready to get your mind blown? We can use the exact same for-loop snippet to iterate over this.
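
To make that concrete, here is a minimal sketch that runs without touching the file system. `fakeRead` and `collect` are hypothetical helpers invented for this illustration: `fakeRead` stands in for `fs.readFile.bind(null, name)` and calls its callback synchronously.

```javascript
// Hypothetical stand-in for fs.readFile.bind(null, name): returns a function
// that takes a Node-style callback and calls it synchronously.
function fakeRead(name) {
  return function task(callback) {
    callback(null, 'contents of ' + name);
  };
}

function* tasksFromGenerator() {
  yield fakeRead('hello.txt');
  yield fakeRead('world.txt');
}

function tasksFromArray() {
  return [fakeRead('hello.txt'), fakeRead('world.txt')];
}

// The same for-of loop consumes either one, because both are iterable.
function collect(tasks) {
  const results = [];
  for (const task of tasks) {
    task(function (err, data) {
      if (err) throw err;
      results.push(data);
    });
  }
  return results;
}

const fromGenerator = collect(tasksFromGenerator());
const fromArray = collect(tasksFromArray());
console.log(fromGenerator);
console.log(fromArray);
```

Both calls log the same two strings; the loop has no idea which kind of "list" it was handed.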

Now, of course generators aren't just a slightly more verbose array syntax. They allow you to dynamically alter the content of the array based on stuff being passed in or to return lazy (read: infinite) sequences. All this can be done in ES5 already (regenerator is proof of that), but generators do offer a nicer syntax.
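
A rough sketch of both capabilities, using made-up names (`naturals`, `take`, `runningTotal`) that are not part of any library:

```javascript
// An infinite sequence: each value is computed only when the consumer asks.
function* naturals() {
  let n = 0;
  while (true) {
    yield n++;
  }
}

// Hypothetical helper: collect the first `count` items of any iterable.
function take(iterable, count) {
  const out = [];
  if (count <= 0) return out;
  for (const item of iterable) {
    out.push(item);
    if (out.length >= count) break;
  }
  return out;
}

const firstFive = take(naturals(), 5);
console.log(firstFive); // [ 0, 1, 2, 3, 4 ]

// Content that depends on what is passed in through next(value).
function* runningTotal() {
  let total = 0;
  while (true) {
    const amount = yield total;
    total += amount;
  }
}

const totals = runningTotal();
totals.next();                          // run to the first yield
totals.next(10);                        // total is now 10
const afterTwo = totals.next(5).value;  // total is now 15
console.log(afterTwo); // 15
```

Note that nothing here is asynchronous: the consumer pulls each value synchronously, exactly as it would from an array iterator.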

But they aren't async. And they don't manage control flow. Would you say that an array is "control flow management" because you can pass it into async.series?

async.series(doStuff());

Then why would you call generators "control flow management" because you can pass one into co?

co(doStuff());

Sure, generators are a more powerful "data structure" than arrays. But they are still closer to arrays than they are to promises, caolan/async, or callbacks.
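
One way to see where the control flow actually lives is to write a tiny runner by hand. The sketch below is not the real `async.series` or `co` (the real `co` also passes results back into the generator via `next`); `runSeries` and `delayed` are hypothetical names, and `setTimeout` stands in for real I/O.

```javascript
// Hypothetical thunk factory; setTimeout stands in for real I/O.
function delayed(value) {
  return function task(callback) {
    setTimeout(function () { callback(null, value); }, 0);
  };
}

// Minimal series runner: starts each task only after the previous one
// has called back. All of the sequencing logic is in here.
function runSeries(tasks, done) {
  const iterator = tasks[Symbol.iterator]();
  const results = [];
  function step() {
    const item = iterator.next();
    if (item.done) return done(null, results);
    item.value(function (err, data) {
      if (err) return done(err);
      results.push(data);
      step();
    });
  }
  step();
}

function* generatorTasks() {
  yield delayed('a');
  yield delayed('b');
}

const arrayTasks = [delayed('a'), delayed('b')];

runSeries(generatorTasks(), function (err, results) {
  console.log('generator:', results);
});
runSeries(arrayTasks, function (err, results) {
  console.log('array:', results);
});
```

The runner only relies on the iteration protocol, so the generator and the array are interchangeable inputs.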

If you want to do something async, you need category 1 (an abstraction of async behavior). To make it nicer, category 2 (control flow management) can be helpful. And more often than not category 2 will require you to use known data structures from category 3. You can pick your poison for category 1 and 2 freely. But you won't be able to replace a promise with a fancier array.

  1. Models/Abstractions for async behavior: thunks+callbacks, promises
  2. Control flow management: co, async, spawn/ES7 async functions, Promise.all
  3. Data structures: arrays, generators, objects/maps

P.S.: I hope this post can usher in an era of JS developers using all sorts of different, slightly weird analogies to explain an often misunderstood language feature. Then we'll finally have our own little "Monads are like X".

P.P.S.: The right choice is obviously Promises + async functions.

@rlidwka

commented Dec 5, 2014

People keep putting generators, callbacks, co, thunks, control flow libraries, and promises into one bucket.

Because they all (except for control flow libraries of course, co included) are data structures used primarily as async abstractions. Well, instead of callbacks we should say "functions implementing the callback(err, data) interface", but you get the idea.

I think pretty much everyone reading this will use generators to yield functions for a control flow library to manage. And a generator is the same as a promise in this regard, because control flow libraries support yielding both.

So thanks for the explanation of something we already know. But those people who say "well, they are all kind of doing the same thing" are correct. It's not what generators are, but it is how they will use them.

@jkrems

Owner Author

commented Dec 5, 2014

So thanks for the explanation of something we already know.

You're welcome. :)

@dominictarr

commented Dec 5, 2014

Generators are more like streams than promises/callbacks.

@davedx

commented Dec 5, 2014

Generators. How do they work?!

@MikeFielden

commented Dec 5, 2014

Generators == magic i think 😄

I hate to say this but I kind of agree with both sides. It's just a matter of how the community uses generators, right? @jkrems are you saying don't use generators for control flow in the same way you wouldn't use arrays for control flow?

@getify

commented Dec 5, 2014

Totally agree generators are not, in and of themselves, async. Generators are async-capable, is how I would describe it. But totally disagree that they aren't capable of expressing flow-control -- they absolutely are!

Let me illustrate both points separately:

function *main(y) {
   var x = y * (yield foo());
   var z;
   try {
      z = x / (yield baz());
   }
   catch (err) {
      z = x;
   }
   console.log(x,y,z);
}

From the perspective of what's going on inside of *main(), the code does in fact express a primitive sort of flow control. In fact, it's impossible to tell from looking only at that code whether it's sync or async, but what we do know is that the yield provides a hint where asynchronicity can occur. This is flow control. Also, the try..catch is flow control, since either synchronously or asynchronously an external influence could change the path of code execution inside *main(..).

Now, if foo() and baz() are in fact sync and return immediate values, and the *main(..) above is run like this:

// run 'main()' to completion
var it = main(42);
for (var v, ret; (ret = it.next(v)) && !ret.done;) {
   v = ret.value;
}

...then of course *main(..) is essentially just a synchronous function.

But, if you make no changes whatsoever to the internals of *main(), and instead fiddle with what foo() and baz() do (making them asynchronous, promise-returning), and you run *main() not with a loop as shown but with a more sophisticated utility (as many promise libs have, like Q.spawn(..), etc), now all of a sudden the same *main(..) function has become an asynchronously completing function, and the yield and try..catch can kick in to handle the pause/resume/error flow control.

yield and try..catch allow the generator to be async-capable, if you so decide. In fact, I think the genius of generators is that they separate the flow control logic (which is expressed in a nice, natural, synchronous way) from the implementation details of how the generator is run to completion (which can either be synchronous or asynchronous). Generators let you think about each of those two pieces independently.

When you're thinking about flow control, you think and author code in a naturally sequential sync-looking way. When you want to control the implementation details of how the generator is run to completion, you ignore its actual code and worry only about what mechanisms are used (sync immediate values or async promises) to flow through the flow-control.
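
As a rough sketch of the "more sophisticated utility" side of that split, here is a minimal promise-aware runner. It is written in the spirit of Q.spawn(..) but is not any library's actual implementation; `run` is a hypothetical helper, `foo()` and `baz()` are hypothetical async stand-ins, and `*main()` returns its values instead of logging them so the result can be inspected.

```javascript
// Minimal promise-aware runner (a sketch, not a library implementation).
// Resumes the generator with each resolved value and throws rejections
// back into it, so try..catch inside the generator can handle them.
function run(generatorFunction, ...args) {
  const it = generatorFunction(...args);
  return new Promise(function (resolve, reject) {
    function step(method, input) {
      let result;
      try {
        result = it[method](input);
      } catch (err) {
        return reject(err);
      }
      if (result.done) return resolve(result.value);
      Promise.resolve(result.value).then(
        function (value) { step('next', value); },
        function (err) { step('throw', err); }
      );
    }
    step('next', undefined);
  });
}

// Hypothetical async versions of foo() and baz().
function foo() { return Promise.resolve(2); }
function baz() { return Promise.reject(new Error('baz failed')); }

function* main(y) {
  const x = y * (yield foo());
  let z;
  try {
    z = x / (yield baz());
  } catch (err) {
    z = x; // the rejection surfaces here as an ordinary exception
  }
  return [x, y, z];
}

const outcome = run(main, 42);
outcome.then(function (values) { console.log(values); }); // [ 84, 42, 84 ]
```

The body of `*main()` reads the same whether `foo()` and `baz()` return plain values or promises; only the runner knows the difference.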


Now, there's no question that yield and try..catch are a limited form of flow control. They are not, in and of themselves, nearly as capable as promises are at flow control.

But that's why putting promises together with generators (plus the runner utility) is so great, because it lets you get the best of both worlds. Promises solve all the pesky IoC issues that callbacks-only code suffers from, AND they have more expressive flow control, such as abstractions like Promise.race(..) or Promise.all(..). Generators provide a capability to separate out the flow control expression from the implementation of the completion.

For example:

function *main() {
   var x = foo();
   var y = baz();
   var z = yield Promise.all([x,y]); // look ma, flow control!

   var r = [
      // even more flow control!
      (x = yield bam(x)),
      (y = yield bam(x,y)),
      (z = yield bam(x,y,z))
   ];

   // yet more flow control
   console.log( yield asyncMap(r, bar) );
}

run(main);

That "magical" combination where promises improve the flow-control expressivity of generators... is, I think, why we already see the ES7 async / await on such a solid track even before ES6 is fully ratified. We've already realized that both promises and generators offer parts of the "solution to JS async coding", and putting them together is our best path forward.

@jkrems

Owner Author

commented Dec 5, 2014

@getify @MikeFielden See my P.P.S. - I love ES7 async functions. Go nuts with using generators in combination with promises (or callbacks)! I also wouldn't say "never use Arrays for expressing control flow" - Promise.all, async.parallel all are great tools. The only thing I object to is to pretend that you can somehow use generators without any other async abstraction and "replace" the need for Promises with generators. See the original issue where it was pretended that an API has to "choose" between implementing a promise- or generator-interface. That's a false choice. A "generator-interface" is a promise interface.

EDIT: Just in case someone is reading that paragraph without the context - I'm not suggesting that you can use promises everywhere generators are useful. I'm saying that yield fs.stat('my-file.txt') is a promise interface being used via a (fake) async function and isn't proof for needing a "generator interface".
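
A small sketch of that point, assuming a hypothetical promise-returning `stat` in place of a real `fs.stat('my-file.txt')`: the API exposes nothing but a promise, and every consumer below works with that same promise.

```javascript
// Hypothetical promise-returning API standing in for fs.stat(...).
function stat(path) {
  return Promise.resolve({ path: path, size: 11 });
}

// Consumer 1: plain promise chaining.
const viaThen = stat('my-file.txt').then(function (s) { return s.size; });

// Consumer 2: an async function.
async function sizeOf(path) {
  const s = await stat(path);
  return s.size;
}
const viaAwait = sizeOf('my-file.txt');

// Consumer 3: a generator, driven by hand. The generator only marks where
// to pause; what it yields is an ordinary promise.
function* sizeOfGen(path) {
  const s = yield stat(path);
  return s.size;
}
const it = sizeOfGen('my-file.txt');
const yielded = it.next().value;
const viaGenerator = yielded.then(function (s) { return it.next(s).value; });

Promise.all([viaThen, viaAwait, viaGenerator]).then(function (sizes) {
  console.log(sizes); // [ 11, 11, 11 ]
});
```

Nothing in `stat` had to change to support the generator-based consumer, which is why a separate "generator interface" is not needed.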

@sgoguen

commented Dec 5, 2014

I think the word you're looking for is that generators are enumerable, in that generators support an enumerating interface like arrays. Apart from that, they diverge in many ways.

@jmar777

commented Dec 5, 2014

I went into more detail on this in a blog post, but more succinctly, the big deal with generators and asynchrony can be illustrated in a short example like this:

function* helloWorldGenerator() {
    yield 'hello';
    yield 'world';
}

var hw = helloWorldGenerator();
console.log(hw.next()); // prints { value: 'hello', done: false }
setTimeout(function() {
    console.log(hw.next()); // prints { value: 'world', done: false }
}, 1000);

As stated in the post I referenced:

we have a full one-thousand beautiful milliseconds between yield 'hello' and yield 'world', and yet those lines of code are written in a very synchronous-looking syntax. This is a big deal: generators finally provide us with a pseudo-synchronous syntax that doesn't break run-to-completion semantics, doesn't require transpiling, and doesn't require callbacks.

Before generators, it didn't matter if you were using callbacks, continuations, events, Promises, etc., you always had to supply a function somewhere to get control back after an asynchronous operation. Whether you view it as a hacky solution or not, generator-based control-flow libraries like suspend and co really do change things.

Especially when coupled with Promises, they enable valid ES6 code to very closely resemble what (speculative) ES7 code will look like:

// ES6
suspend(function*(id) {
    var user = yield User.findById(id).fetch();
});

// ES7
async function(id) {
    var user = await User.findById(id).fetch();
}

@jkrems

Owner Author

commented Dec 5, 2014

@jmar777 Well, that example is not really convincing...

var helloWorldGenerator = [ 'hello', 'world' ];
var hw = helloWorldGenerator[Symbol.iterator](); // guessing the details here
console.log(hw.next()); // prints { value: 'hello', done: false }
setTimeout(function() {
    console.log(hw.next()); // prints { value: 'world', done: false }
}, 1000);

Btw. - see my P.P.S.: I'm a big fan of async functions. I'd rather use Bluebird (or 6to5) than co for them because they don't have the baggage that co has, which started off with that silly thunk thing. Nowhere in my article do I say that you can't use generators in combination with promises to do awesome things. I only said that you can't replace the promise-part with generators.

@inikulin

commented Dec 6, 2014

Hi there!

There are three distinct categories that they fall into:

  1. Models/Abstractions of async behavior
  2. Control flow management
  3. Data structures

"Models/Abstractions of async behavior" are made of "Control flow management" and "Data structures". Therefore, there are two distinct categories.

Generators are category 3. Generators are like arrays.

No, they are not. Generators are routines ("semicoroutines", to be precise) and they don't have underlying data. They just provide a strategy for iterating over a sequence of any nature. So generators fall into category 2 of your classification.

Summarizing, I think your argument is built on false assumptions. Using generators for async control flow is completely OK, since they are a weak form of coroutines, which are established as a good control flow primitive for dealing with async behavior.

@domenic

commented Dec 6, 2014

Generators are like arrays; generator functions are functions which have syntactic constructs that help you build generators. (Normal functions can only "build" a single value or exception; generator functions can build a generator, i.e. an iterable.)

@inikulin

commented Dec 6, 2014

@domenic
The spec doesn't agree with you:

First-class coroutines, represented as objects encapsulating suspended execution contexts (i.e., function activations). Prior art: Python, Icon, Lua, Scheme, Smalltalk.

Yes, a generator function produces an iterator, but it's definitely not an array; the closest analogy is a singly linked list. There is no random access and no access to the size of the sequence while iterating. Moreover, the iterated sequence can be lazily evaluated, which distinguishes it from a linked list. So I think "generators are arrays" is a very misleading statement.
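
These differences are easy to check directly. A minimal sketch, with `letters` as a made-up example generator:

```javascript
function* letters() {
  yield 'a';
  yield 'b';
  yield 'c';
}

const gen = letters();

// No size and no random access on the generator object itself.
const noLength = gen.length; // undefined
const noIndex = gen[0];      // undefined

// Iterating materializes the values, but only once: a generator object is
// single-pass, so a second pass over the same object finds nothing left.
const firstPass = Array.from(gen);  // [ 'a', 'b', 'c' ]
const secondPass = Array.from(gen); // []

// An array can be iterated any number of times.
const arr = ['a', 'b', 'c'];
const arrayPasses = [Array.from(arr), Array.from(arr)];

console.log(noLength, noIndex, firstPass, secondPass, arrayPasses);
```

What the two do share is the iteration protocol, which is the part that `for...of` and runner utilities rely on.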

@jkrems

Owner Author

commented Dec 6, 2014

@inikulin The post doesn't say "generators are arrays". It compares them with arrays, in the context of async control flow management. It even explicitly says "of course generators aren't just a slightly more verbose array syntax".

@jmar777

commented Dec 6, 2014

@jkrems I'm not sure I understand your response; it doesn't address my comments regarding the significance of having a pseudo-synchronous syntax for async operations (i.e., getting control back from an async operation w/out a function being passed around somewhere).

@jkrems

Owner Author

commented Dec 7, 2014

@jmar777 Not sure I understand - you are passing hw around which contains references to the next actions to take. Or are you talking about Promises?

EDIT: If you just wanted to say that you like async functions (Promises + spawn + generators) then I'm not sure how that is relevant as a "rebuttal" of this gist. It's the exact thing I call out as my favorite in the P.P.S.

@jmar777

commented Dec 8, 2014

@jkrems Ahh, I wasn't really intending to rebut anything in your post; just attempting to clarify / add some context around why generators (combined with a runner like co or suspend) were immediately latched onto for control-flow management.

The example with hw has nothing to do with control flow on its own, it's just an example of how generator functions behave (and how there's some syntactical significance to the fact that a yield expression can span multiple turns on the event loop). Definitely nothing about Promises in that example.

@hollowdoor

commented Sep 23, 2015

@jkrems Technically generators are not like arrays because yield is a type of return statement.

Semantically generators are like arrays because you can loop them.
