@dherman
Created January 13, 2012 22:50
using loop controls from within block lambdas
Mailbox.prototype.extractContents = function(contacts) {
    let messages = this.messages;
    loop:
    for (let i = 0, n = messages.length; i < n; i++) {
        messages[i].headers.forEach { |header|
            if (header.isSpam())
                continue loop; // the label is not strictly necessary here
            let addresses = header.extractAddresses();
            addresses.forEach { |addr|
                contacts.add(addr.name, addr.email);
            };
        };
    }
};
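
The block-lambda syntax above is a proposal and does not run in current engines. As a rough sketch of the same control flow in plain JavaScript, the inner forEach calls can be written as for...of loops so that continue applies directly. The Mailbox constructor, the makeHeader helper, and the contacts collector below are hypothetical stand-ins, not part of the gist.

```javascript
// Hypothetical stand-in for the gist's Mailbox type.
function Mailbox(messages) {
  this.messages = messages;
}

Mailbox.prototype.extractContents = function (contacts) {
  const messages = this.messages;
  loop:
  for (let i = 0, n = messages.length; i < n; i++) {
    for (const header of messages[i].headers) {
      // With a real inner loop the label is required: a bare `continue`
      // would only skip to the next header, not the next message.
      if (header.isSpam()) continue loop;
      for (const addr of header.extractAddresses()) {
        contacts.add(addr.name, addr.email);
      }
    }
  }
};

// Hypothetical header factory used only to exercise the sketch.
function makeHeader(spam, addresses) {
  return {
    isSpam: () => spam,
    extractAddresses: () => addresses,
  };
}

const collected = [];
const contacts = {
  add: (name, email) => collected.push(name + " <" + email + ">"),
};

const mailbox = new Mailbox([
  { headers: [makeHeader(false, [{ name: "Ann", email: "ann@example.com" }])] },
  {
    headers: [
      makeHeader(true, []),
      makeHeader(false, [{ name: "Bob", email: "bob@example.com" }]),
    ],
  },
]);
mailbox.extractContents(contacts);
```

In this sample data the second message is abandoned at its spam header, so only the address from the first message is collected.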
@polotek

polotek commented Jan 14, 2012

Well I think I should clarify what I see as the pros and cons.

Pros:

  1. block control flow, e.g. return, break, continue (big*)
  2. "this" context stays consistent with enclosing context (big*)
  3. "easier refactoring" i.e. TCP. Can't comment much on this. (?)
  4. shorter block syntax (small, but really nice)

Cons

  1. confusing semantics between the 2 types of block executions, functions vs. blocks. This is a very "weighty" feature. (big)
  2. non intuitive semantics for block control flow. (big, people will really expect break and continue to work with forEach and friends)
  3. comes with additional baggage as people now need to use labels to achieve what they originally wanted. (big, cue the nerd arguments about "goto considered harmful")
  4. You get almost none of the above benefits when you cross asynchronous boundaries (huge, this is why the pros are marked with "*")

For instance. Let's try a less trivial example.

function getData(id) {
  // lets assume some really dumb cache :)
  var cache = getDataCache();
  cache.forEach({ |item|
    if (item.id === id) return item;
  });

  // we didn't find it in the cache, so look it up
  getRemoteData(id, {|data|
    cache.push(data);
    return data;
  });
}

I think this is a very reasonable bit of code. It's the kind that people have been wanting to write in js for years and years. And it has never worked. As soon as getRemoteData makes its asynchronous call, all the blocks in this code become invalid. You can't count on the nice control flow semantics anymore. When the data call returns and tries to execute the second block, it will throw. Not only will it throw, but there's no way to catch it because the stack has unwound. Only the stack inside of getRemoteData can possibly catch this. But that would be peering inside our black box and assuming too much. And more so, the outcome is confusing because our first block works just fine if the item is found in the cache. This is a giant headache.
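
The same trap can be reproduced with ordinary functions today. A minimal sketch, in which getRemoteData is a hypothetical stand-in that calls back on a later tick via setTimeout: the return inside the callback returns from the callback only, because getData has already returned undefined by then.

```javascript
// Hypothetical async lookup: calls back on a later tick.
function getRemoteData(id, callback) {
  setTimeout(function () {
    callback({ id: id, source: "remote" });
  }, 0);
}

const cache = [];

function getData(id) {
  for (const item of cache) {
    if (item.id === id) return item; // synchronous hit: works as expected
  }
  getRemoteData(id, function (data) {
    cache.push(data);
    return data; // returns from this callback only; getData has already finished
  });
  // falls off the end, so the caller receives undefined
}

const firstResult = getData(42);
```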

Now you should be aware of when the functions you use may have async boundaries. But this is something that trips people up constantly today, even without blocks. And even if they get it, they hate jumping through hoops to get around it. Blocks do nothing to alleviate this, and in fact I would argue they exacerbate the problem by introducing the false idea of these nice control flow semantics. They seem to promise some respite, but as soon as you try them in a very common scenario, you find that they immediately break down. In fact, I would argue that async boundaries happen often enough in javascript that the usefulness of the blocks in this proposal is greatly reduced because they do not handle it.

So what we end up with is 3 dubious benefits, that only apply to people who write mostly synchronous js code (plus a small syntactic win). And 4 clear disadvantages. I feel the cons outweigh the pros. Now with that being said, I'm not against having a solution. I want to move javascript forward. But if we're debating because we all care about what constitutes a "forward" direction, then this is the landscape as I understand it. And I can't help but think we should be trying to do better.

@rwaldron

I'd feel remiss if I didn't point out that some and every exist as "breakable" alternatives to forEach. forEach shouldn't need to be "broken" from if you filter the array being operated upon before passing to forEach.
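
As a small illustration of that point, Array.prototype.some stops iterating as soon as the callback returns a truthy value, and Array.prototype.every stops at the first falsy one. The array and the two log arrays here are made up for the example.

```javascript
const numbers = [1, 2, 3, 4, 5];

// some() stops at the first callback that returns a truthy value,
// so returning true acts like `break`.
const visitedBySome = [];
numbers.some(function (n) {
  if (n === 4) return true;
  visitedBySome.push(n);
  return false;
});

// every() stops at the first falsy return, so the sense is inverted.
const visitedByEvery = [];
numbers.every(function (n) {
  if (n === 4) return false;
  visitedByEvery.push(n);
  return true;
});
```

Both logs end up holding only the elements seen before the value 4.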

@rwaldron

That point should be taken as a side note to the main discussion ;)

@dherman
Author

dherman commented Jan 14, 2012

@polotek: OK, I really have to push back now. Let's address each of your listed cons:

  1. Yes, this is different from the existing function semantics. But you can think of block lambdas as the lower-level primitive on which functions are built. And you can explain these concepts in pieces to programmers, just as you already have to do to teach them JS today. Nobody learns a language all in one go.

  2. As Brendan says, "intuitive" is subjective and cuts both ways. I don't doubt people will get confused by some aspects of whatever semantics we choose. Not understanding a language will always lead to confusion. Let's say we went with the semantics you thought it would have. Then if you have a function that doesn't "seem like a loop" (unlike forEach), if it happens to have a loop (or a switch) in its stack when it calls your callback, your callback will break from that instead of from your own loop.

while (...) {
    thing.foo({ |bar, quux|
        if (...) break; // oops, breaking from something internal to thing.foo!
    });
}
  3. This argument is bogus on multiple levels. First you claim that people need a label where they didn't used to. Try breaking out of a forEach today. You can't unless you use exceptions!

Second, the "goto" complaint is a red herring. Any use of break, continue, or return is a kind of goto. And you can already break to a label in JS today. You just can't do it from within a nested function.

  4. Now this is just totally unfair. Should we blame block lambdas for every problem they don't solve? You can't use block lambdas to return from a function that has already returned.

So like today, you could use either callbacks:

// (id, (data -> result)) -> result
function getData(id, cb) {
    var cache = getDataCache();
    cache.forEach({ |item|
        if (item.id === id) {
            cb(item);
            return;
        }
    });
    getRemoteData(id, { |data| cache.push(data); cb(data) });
}

or promises:

// id -> promise<data>
function getData(id) {
    var cache = getDataCache();
    cache.forEach({ |item|
        if (item.id === id)
            return Promise.now(item);
    });
    return getRemoteData(id).then({ |data| cache.push(data); data });
}

But what I think you're looking for in this example is generators. In the style of task.js:

// id -> promise<data>
function getData(id) {
    return spawn(function*() {
        var cache = getDataCache();
        cache.forEach({ |item|
            if (item.id === id)
                return item;
        });
        var data = yield getRemoteData(id);
        cache.push(data);
        return data;
    });
}
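
For comparison with later versions of the language: async functions, standardized in ES2017, cover the generator-plus-spawn pattern directly. A sketch under the same assumptions as the examples above, with getDataCache and getRemoteData as hypothetical stand-ins; an ordinary for...of loop takes the place of the block lambda so that return leaves getData itself.

```javascript
const dataCache = [{ id: 1, value: "cached" }];

// Hypothetical cache accessor.
function getDataCache() {
  return dataCache;
}

// Hypothetical remote lookup returning a promise.
function getRemoteData(id) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ id: id, value: "remote" });
    }, 0);
  });
}

// id -> promise<data>
async function getData(id) {
  const cache = getDataCache();
  for (const item of cache) {
    if (item.id === id) return item; // early return from getData itself
  }
  const data = await getRemoteData(id);
  cache.push(data);
  return data;
}

// One cache hit (id 1) and one miss (id 2).
const results = Promise.all([getData(1), getData(2)]);
```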


ghost commented Jan 14, 2012

If people really want to break out of a forEach so bad, why not just make something like this.break() work in the function body? break and continue are hackish baggage from C that should be used sparingly anyways, not something I would expect to be instrumental to the future direction of the language.

@dherman
Author

dherman commented Jan 14, 2012

@substack: Agreed.

@polotek

polotek commented Jan 14, 2012

First off, this is my first time really trying to engage with those who are trying to push javascript as a language forward. Brendan Eich has been gracious enough to respond to me on several occasions and I appreciate you doing so as well.

But I think you both are missing my point, and it comes across as dismissive. Because I don't have a background in studying languages, I try not to speak on things I don't know. But I write code every single day. I work with others who write code every day and we share stories about it. The concerns I'm bringing up have to do with confusion and bloat. They have to do with exchanging one set of gotchas in js ("this" context, false control flow with functions), for a different set of gotchas (blocks don't work async, when to use a block or func is easily confusing).

I don't expect blocks to be perfect. But I do expect that when we're designing solutions to various problems, we keep in mind how they affect the language holistically. The average programmer doesn't have deep knowledge of compilers and refactoring principles and whatever. We strive to build a consistent mental model of the language that we can reason about, so we can get work done. When language features violate the principle of least surprise, that mental model has to adjust uncomfortably, causing mental overhead. This is generally the cause of vague protests of "non-intuitive" and "doesn't feel javascript".

If you guys are going to try to build consensus for things, it would help to accept the fact that while these are difficult to debate formally, they are real factors that determine developer happiness and productivity. With that in mind, on to your points.

  1. We could use this argument to argue that any crazy, confusing feature is okay. Gaining a clear understanding of the difference between functions and blocks will be critical to writing correct code. And if you are suggesting that this is easy and negligible, you haven't been paying attention to the other things people have been having trouble with.

  2. This is a fair point about "hidden" loops. But I think it further illustrates how confusing this is. When people wanted a closed scope to use with for loops, they were told to use forEach() like a for loop with block scope. Then it was clear this was inadequate because you couldn't break or continue the way you wanted to. So we increase the api surface area with some() and every(). Okay fine. But now we're introducing blocks as an alternative to functions. But it crashes right into people's understanding of what has come before it. "Hey look, you can break and continue and return now". But wait, not with the forEach which we've come to know and love. Forget that, go back to regular for loops. Fail.

  3. The goto argument is not a red herring. Instead, the argument about labels is FUD waiting to happen. People view return, break, continue as "safe" control flow. Because the implications are well understood (or they were). But case in point, nobody uses labels today. Why do you think that is if they're so awesome? Your example with blocks suggests that use of labels will become common. In which case, people will reach for it in other scenarios where it will shoot them in the foot. Fail.

  4. Again, I'm not suggesting block lambdas should solve the async problem. I'm suggesting that when people see the return semantics, they will assume that blocks do solve it. They will fail, then they will blame javascript.

"You can't use block lambdas to return from a function that has already returned."

You think this is obvious? With as much trouble as people have figuring out why "return" doesn't work with async today. You think they'll also be fully aware, at all times, of when they can't use blocks? This is doubtful.

Again, I'm trying to be clear what my objection is. It's not that blocks are terrible (I like them). It's not that I don't want to solve the issues that blocks solve (I do). My objection is that this proposal is confusing when placed alongside all of the other js quirks that we've been trying to teach people about. They add to the #wtfjs in ways that you and others are totally ignoring. That's fine too. Tell me you don't care and you think it's worth it. But don't tell me I don't know what I'm talking about. I've been talking to people about this for the last week and they all have the same confusion.


ghost commented Jan 14, 2012

Crazy idea: what if instead of the loop label for(...) returned a value? Then you could:

var loop = for (let i = 0, n = messages.length; i < n; i++) {
    /* ... */
    loop.continue()
    /* ... */
}

Something like this would side-step the break/continue argument for the necessity of block syntax I think.

Edit: it could also look like this so that loop control access would be scoped to the inner for, preserving the existing continue/break loop semantics:

for loop (let i = 0, n = messages.length; i < n; i++) {
    /* ... */
    loop.continue()
    /* ... */
}

Plus, that's already how named vs anonymous functions work so it's not terribly novel.

@polotek

polotek commented Jan 14, 2012

@substack, holy shit do I love that. But it's still got problems. loop.continue() has to be a function, but it's not idiomatic for functions to affect execution flow like that. Also it starts to treat language constructs like the for loop as weird looking function executions. That's also a slippery slope.


ghost commented Jan 14, 2012

@polotek Yeah I just came up with it so I haven't thought through the implications much yet. It also doesn't address the async issue but you can't use break or continue in for loops right now anyways I guess.

@polotek

polotek commented Jan 14, 2012

Maybe labels would be less harsh if they were more controlled this way. It bothers me that you can put them anywhere (or maybe you can't, I don't really fully understand labels). But how's this.

 for loop (let i = 0, n = messages.length; i < n; i++) {
     /* ... */
     continue loop;
     /* ... */
 }

function getGoodMessages(messages) {
  let goodMessages = [];
  for outer (let i = 0, message = messages[0]; i < messages.length; i++) {
    let headers = message.headers;
    for inner(let j = 0, header = headers[j]; j < headers.length; j++) {
      // if the condition passes we want to abandon this loop and go to the next iteration of the outer loop
      if(header == 'X-Spam') continue outer;
    }
    // if we made it here, none of the headers failed
    goodMessages.push(message);
  }

  return goodMessages;
}
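
For reference, today's label syntax already expresses this; only the position of the label differs from the `for outer (...)` form sketched above. A runnable version, assuming each message carries a headers array of plain strings (an assumption made for the example). It also reads message and the current header inside the loop bodies so that they track the loop index.

```javascript
function getGoodMessages(messages) {
  const goodMessages = [];
  outer:
  for (let i = 0; i < messages.length; i++) {
    const message = messages[i];
    const headers = message.headers;
    for (let j = 0; j < headers.length; j++) {
      // abandon this message and go to the next iteration of the outer loop
      if (headers[j] === "X-Spam") continue outer;
    }
    // if we made it here, none of the headers failed
    goodMessages.push(message);
  }
  return goodMessages;
}

// Sample data invented for the example.
const good = getGoodMessages([
  { id: 1, headers: ["From", "To"] },
  { id: 2, headers: ["From", "X-Spam"] },
  { id: 3, headers: [] },
]);
```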

@dherman
Author

dherman commented Jan 14, 2012

@polotek: I'm sorry it's daunting to weigh in. I'm not trying to be dismissive. It's not by accident that every aspect of ES.next is open to constant public discussion -- on es-discuss, on gists, on Twitter, at conferences, on IRC, and by email. It's because we deeply value the input of developers who are on the ground using JS every day. There are people who will tell you otherwise. Please don't listen to them.

If I'm arguing hard, it's not because I'm being dismissive, it's because I'm taking you seriously and giving you the courtesy of my most serious responses. But can you also think about your own tone? "You haven't been paying attention" and "Fail" and "FUD" and "#wtfjs" aren't exactly the moral high ground.

I have listened to you, I do hear your points, and I do take them seriously. And I never told you you don't know what you're talking about. I think I'll just leave it at that for now.

@polotek

polotek commented Jan 14, 2012

@dherman Fair enough. Please accept my apologies. Tone is important on the internet and I regret using the word Fail. But FUD and #wtfjs are common sentiments with js today and should not be written off as just rash talk. They serve to illustrate the mental overhead that costs us when we aren't consistent with how these features work together. I don't think any of this serves to diminish my argument unless you're already looking to do so.

As for the "not paying attention" comment. I'm sorry for that too. That was clearly a display of passionate frustration at the responses I've been getting about this. I still don't feel like you or anyone else has addressed my concerns about the weight of this feature or the confusion it's likely to cause. JS has a history of causing confusion and I'd say for the devs who love the language, avoiding an increase in this confusion is just as important as adding more power and expressiveness. We can't all be able to use the right words and the right tone at all times to express our concerns. But you've taken on the job of listening to us, so I hope you'll cut us a bit of slack and read through the frustration.

Here's hoping you'll re-engage with my arguments here.

@dherman
Author

dherman commented Jan 14, 2012

@polotek: I guess what I'm trying to get to is that there's a difference between "this wasn't what I first expected" and "this will repeatedly bite me." Many of the #wtfjs issues are ones where either there are bizarre special cases in the language that you have to memorize (like coercion rules) or just things that subtly change when you're making common code refactorings (like this). The thing about block lambda is that it's got a very minimal semantics, with very few special cases, in order to avoid exactly these kinds of #wtfjs issues.

Your primary contention is that even if block lambda is nice on its own, the differences of lambda semantics and function semantics will cause confusion, and will lead to a lot of these constant headaches that are familiar to JS programmers. That may be so. I think the best way to find out would be to build implementations before standardizing, and to experiment with writing real code.

@polotek

polotek commented Jan 14, 2012 via email

@isaacs

isaacs commented Jan 15, 2012

I still don't see why block lambdas are necessary just to be able to break out of a forEach. Why not just solve this by adding an escape hatch to forEach?

Array.prototype.breakableForEach = function (fn, thisp) {
  thisp = thisp || this
  for (var i = 0, l = this.length; i < l; i ++) {
    if (!this.hasOwnProperty(i)) continue
    if (fn.call(thisp, this[i], i, this) === true) break
  }
}

var arr = [1, 2, 3, 4, 5]

loop: arr.forEach({ |n|
  if (n === 4) break loop
})

arr.breakableForEach(function (n) {
  if (n === 4) return true
})

Obviously the advantage of the first over the second is that the first uses more specific syntax. You break out of the loop by doing break loop, which reads quite nicely.

However, the hazards of block lambdas are real, and should not be dismissed. I don't think that this addresses the most important problems in JavaScript, and I think it adds more hazard than it takes away. Because blocks are a valid RHS expression, and can be passed around, or called multiple times, the static scoping semantics introduce a lot of very strange and confusing edge cases.

True, every language feature can be abused. However, some language features are far more prone to abuse than others. (with and eval come to mind.) I think that block lambdas do not provide enough value to justify the hazards of the new semantics that they add.

@isaacs

isaacs commented Jan 15, 2012

Update: you can make the es-current "breakableForEach" more readable and less hazardous by doing something like this:

Array.BREAK = {}
Array.prototype.breakableForEach = function (fn, thisp) {
  thisp = thisp || this
  for (var i = 0, l = this.length; i < l; i ++) {
    if (!this.hasOwnProperty(i)) continue
    if (fn.call(thisp, this[i], i, this) === Array.BREAK) break
  }
}

var arr = [1, 2, 3, 4, 5]

arr.breakableForEach(function (n) {
  if (n === 4) return Array.BREAK
})

@granthusbands

@dherman: I think the 'wait for implementations' argument is dangerous, in that the 'friction' required for an implementation to change is higher than that needed for the design of a feature to change. That is, one could implement a barely sufficient feature and people would work around its foibles and there wouldn't be sufficient pressure to fix them, but a world without them may still be more desirable.

In the parallel es-discuss thread, fleshing out ideas from many, I've proposed that we support this:

for arr.forEach { |o|
  if (...) break;
}

With the behaviour of break then being equivalent to that for a normal for loop (it can desugar to code that already works under the proposed feature). It allows break to behave as people would expect, when explicitly using block lambdas for looping, which covers some of what has been discussed here.

@bingomanatee

Why not just go functional and use the event loop?

var loop = function () {
  // loop.callback is assumed to be set by the caller
  if (loop.index >= loop.data.length) {
    return loop.callback(null, loop);
  }
  try {
    var item = loop.data[loop.index];
    // do stuff with item
    ++loop.index;
    process.nextTick(loop);
  } catch (e) {
    e.fn = loop;
    loop.callback(e);
  }
};

loop.index = 0;
loop.data = [/* ... */];
loop();

This is phone text ... But u get the idea

@bingomanatee

Break in this pattern is just

 return process.nextTick(loop)

Which though noisier is very explicit

@broofa

broofa commented Jan 16, 2012

@dherman wrote:

Second, the "goto" complaint is a red herring. Any use of break, continue, or return is a kind of goto. And you can already break to a label in JS today. You just can't do it from within a nested function.

This is just saying that "every control statement is a kind of goto", which I see as a strawman argument. While true, this doesn't justify the proposed use of labels any more than it justifies adding bona-fide goto support.

One thing I'm a bit unclear on is what restraints are placed on the proposed use of labels. E.g. What's to prevent the top-of-gist example from being rewritten more functionally as ...

function extractContents() {
  loop:
  ...
    addresses.forEach(inner);
  ...
}

function inner(addr) {
  ...
    continue loop;
  ...
}

... which is pretty clearly just 'goto' with a different name (and a bad idea for all the same reasons Dijkstra outlined 44 years ago.)

@dherman
Author

dherman commented Jan 16, 2012

@broofa:

"This is just saying that "every control statement is a kind of goto", which I see as a strawman argument."

Oh, not at all -- it's saying that return and break (leaving out continue since it's not the same control-flow pattern as break-to-label) are semi-structured jumps that follow a prescribed protocol, as opposed to unstructured goto, and that breaking to a label is following the same protocol: terminate a computation early and return to the same point you were going to return to anyway. The only difference with break-to-label is that you get to name which computation you want to terminate early, instead of just loops and switch.

In fact, maybe that's part of what's causing people's negative reaction to break-to-label: syntactically, when you see:

L: {
   ...
   break L;
   ...
}

it kind of looks like it's saying go back to point L (i.e., the start of the block) and start over (i.e., goto). But it's not; it's saying jump out of the computation that started at L to its end (i.e., the end of the block). A goto that restarts a computation in a different state is weird and hard to understand. A goto that simply terminates a computation early is much easier to reason about and much more common. In fact, that's what exceptions do. I don't think it's controversial to say that exceptions are more structured than goto.
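
A small runnable illustration of that reading, using only standard label syntax. The steps array is just there to record the order of execution.

```javascript
const steps = [];

L: {
  steps.push("start");
  if (steps.length === 1) break L; // jumps to the END of the block labeled L
  steps.push("never reached");
}
steps.push("after block");
```

Execution resumes at the statement following the labeled block; the block is not restarted.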

And you could always use exceptions instead of break-to-label. But this gets heavyweight: you have to come up with a new kind of exception, check for it when you catch it, and rethrow if it's not your special exception.
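
To make that cost concrete, here is roughly what the exception-based version looks like in current JavaScript. BreakSignal and findFirstNegative are names invented for this sketch.

```javascript
// A dedicated exception type, so it can be told apart from real errors.
class BreakSignal extends Error {}

function findFirstNegative(values) {
  let found;
  try {
    values.forEach(function (v) {
      if (v < 0) {
        found = v;
        throw new BreakSignal(); // throwing is the only way to stop forEach early
      }
    });
  } catch (e) {
    // Rethrow anything that is not our own signal.
    if (!(e instanceof BreakSignal)) throw e;
  }
  return found;
}

const firstNegative = findFirstNegative([3, 1, -7, -2]);
const noNegative = findFirstNegative([1, 2]);
```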

"One thing I'm a bit unclear on is what restraints are placed on the proposed use of labels."

Static scope.

"E.g. What's to prevent the top-of-gist example from being rewritten more functionally as ..."

The fact that the loop label is not in scope inside inner.
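
A comparable restriction already holds for labels in today's JavaScript: a label is visible only within the function body where it is written, so a break or continue that names a label from an enclosing function is rejected when the code is parsed. The sketch below compiles two snippets with the Function constructor purely so that the SyntaxError can be observed at run time; the compiles helper is made up for the example.

```javascript
// Returns true if `source` parses as a function body, false on SyntaxError.
function compiles(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Label and `continue` in the same function: accepted.
const sameFunction = compiles(
  "loop: for (var i = 0; i < 3; i++) { continue loop; }"
);

// `continue loop` inside a nested function: the label is out of scope.
const nestedFunction = compiles(
  "loop: for (var i = 0; i < 3; i++) { (function inner() { continue loop; })(); }"
);
```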

@dherman
Author

dherman commented Jan 16, 2012

PS Actually, continue isn't that different a pattern either; it just means break out of the block corresponding to the current iteration, whereas break breaks out of the block corresponding to the whole loop.
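
That equivalence can be checked directly with labeled blocks: wrapping a loop body in a labeled block and breaking out of it behaves like continue. The function names are invented for the sketch.

```javascript
function oddsWithContinue(limit) {
  const out = [];
  for (let i = 0; i < limit; i++) {
    if (i % 2 === 0) continue;
    out.push(i);
  }
  return out;
}

function oddsWithBlockBreak(limit) {
  const out = [];
  for (let i = 0; i < limit; i++) {
    iteration: {
      if (i % 2 === 0) break iteration; // leaves this iteration's block only
      out.push(i);
    }
  }
  return out;
}

const viaContinue = oddsWithContinue(6);
const viaBlockBreak = oddsWithBlockBreak(6);
```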

@bingomanatee

"continue" is just syntactic sugar; it can easily be replaced in most contexts with "if" statements.

Here is a formal iterator. It executes each iteration as a separate event. It has break() methods, and a "continu" method.

You DO have to manually "next()" inside the worker to keep the cycle active. One of these methods -- done, next, continu, break -- has to be called or the loop becomes suspended indefinitely -- and calling more than one of them inside a single execution cycle can create ambiguous situations.

But it does show that continue and break can be implemented without requiring a language change. And that is always good.

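    function Iterate(data, worker, callback) {
        this.data = data;
        //@TODO: validate data == array, worker == function
        this._worker = worker;
        // optional completion callback, invoked by done() as callback(null, iterator)
        this.callback = callback || function() {};
    }

    Iterate.prototype = {
        next: function() {
            // advance first, then check, so work() never runs past the end
            ++this.index;
            if (this.index >= this.data.length) return this.done();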
            var self = this;
            return process.nextTick(function() {
                self.work();
            });
        },
        continu: function(){
            this.next();
        },
        index: 0,
        done: function(status) {
            this.status = status || 'done';
            return this.callback(null, this);
        },

        'break': function() {
            this.done('broken');
        },

        work: function() {
            this._worker.call(this);
        },

        status: 'idle',

        start: function(){
            this.status = 'working';
            this.work();
        },

        item: function(n) {
            if (arguments.length == 0) {
                n = this.index;
            }
            return this.data[n];
        }
    }

    /**
     * A use case.
     * Note - the fact that the worker function only iterates after
     * a record has been saved FORCES this to synchronize writes.
     * In a loop where this is not a priority, this.next() could be called
     * at the end of the work function.
     */

    var iterator = new Iterate(
        [
            {name: "foo", id: 3},
            {name: "bar", id: 4},
            {name: "boo", id: 13},
            {name: "far", id: 14},
            {name: "voo", id: 23},
            {name: "var", id: 24}
        ],

        function() {
            var item = this.item();
            if (item_is_bad(item)) {
                return this.continu();
            }
            if (item.id == 14) {
                return this.break(); // note - MUST manually return.
            }

            var self = this;
            item.gender = '?';
            db.save(item, function(err, saved_item) {
                if (err) {
                    self.error = err;
                    self.done('error');
                } else {
                    self.next();
                }
            });
        });

    iterator.start();

    function item_is_bad(item) {
        if (!item) {
            return true;
        }
        return item.id == 3;
    }

    var db = {
        save: function(r, cb) {
            // save the record;
            cb(null, r);
        }
    }

@isaacs

isaacs commented Jan 16, 2012

I'm opposed to block-lambdas, but @dherman is right: the "this is a goto" argument is not a valid response to block-lambdas, and the "every control statement is a goto" is not the reason why. Every control statement is, in some way, a method for sending the execution to another point in the program (ie, a goto), but, with specific structured semantics. Gotos lack that structure, which is the source of their hazard. When people say "X is a goto", it derails the conversation into "But everything is a goto" vs "No it's not, because it has some structure", when really, we should be discussing whether the specific structure in this case is a) relevantly different from what already exists, and if so, b) whether the added power is worth the added conceptual complexity.

For me, the issue isn't that block lambdas are a goto, but rather that the structured semantics of block lambdas is hazardous. They are a goto that is passable (like a function), and which has access to the data in the scope where it's defined (like a function), but, also has control-flow effects in the context where it's defined (like a loop).

As I showed in this gist, it can lead very quickly to some very odd scenarios. If the goal is to provide breakable iteration constructs, then we can do that more safely with API today, which in my opinion obviates the need for doing that in syntax.

@dherman
Author

dherman commented Jan 16, 2012

@isaacs: I agree that "X is a goto" is a bogus argument, but to be clear no one was arguing that block lambdas are goto. They were arguing that break-to-label is as bad as goto, because I pointed out that you could use your own labels to break out of a forEach.

Anyway I don't really understand your gist. How is it any more of an issue than a function that throws an exception?

@isaacs

isaacs commented Jan 16, 2012

@dherman I've heard "block lambdas are a goto" as a nay argument before. (I agree with the "nay" bit of it, but not with the reasoning.) I'll respond to your comment about the other gist in the other gist's comments.

@bingomanatee

bingomanatee commented Jan 16, 2012 via email

@grncdr

grncdr commented Jan 17, 2012

I commented more verbosely on Isaac's gist, but in short: what about disallowing mixing of block lambda return and normal return in a single function?

@polotek

polotek commented Jan 17, 2012

I just want to point out that this is what I mean about goto and FUD. Not that labels are true gotos, not that any form of goto is harmful. But that people will want to argue about it. We will be spending a lot of our time correcting people's misconceptions. And in the end, the consensus will be, "look just don't use labels, you don't need them if you know how to write good javascript". They are not a useful feature of javascript and they should be avoided. We've managed to do it for this long. Let's not falter now.

Yes I do realize that I'm the one that first mentioned goto. Call it a preliminary social experiment that served to fully support my hypothesis.
