@fakedrake
Last active September 11, 2015 19:24

Gazing into the abyss of Javascript asynchrony

Albeit master of none, Javascript is a true jack of all trades. It can wear the hat of object orientation, of functional programming, of event-driven programming and, under certain circumstances, even of procedural programming. Its defining feature, however, is its invariably non-blocking scheme.

Let’s start with some terminology. Javascript is said to be asynchronous and non-blocking. The two terms are used interchangeably in this context: they mean that the system never waits for anything. Instead it registers a task to be run when the otherwise delaying operation finishes. That is in contrast to blocking systems, which pause after starting an interaction with an external resource and wait for the operation to finish before continuing. There are chiefly two cases where the asynchronous scheme becomes crucial for Javascript.

  • When a task depends on user input. E.g. when the “Flash on Arduino” button is clicked the browser starts interacting with the device, but we can’t freeze the entire platform while waiting for this particular button to be pressed.
  • When a task depends on external IO. E.g. we send the user’s code to the compiler server but don’t just wait for the response; we continue with the rest of the website’s functionality, and when the response arrives we begin the flashing process.
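
The second case can be sketched in a few lines. This is only an illustration: compileOnServer is a hypothetical stand-in that fakes the compiler server with a timer, and siteLog is an illustrative name used to record what happens and in which order.

```javascript
// Hypothetical stand-in for a request to the compiler server: it
// "responds" after a delay by calling the callback it was given.
function compileOnServer(code, onResponse) {
  setTimeout(function () {
    onResponse("binary for: " + code);
  }, 50);
}

var siteLog = [];

compileOnServer("blink.ino", function (binary) {
  // Runs later, when the simulated response arrives
  siteLog.push("flashing " + binary);
  console.log(siteLog.join("\n"));
});

// Runs right away: nothing waited for the server
siteLog.push("rest of the site keeps working");
```

The line about the rest of the site is recorded first, even though it appears last in the code, because the response handler only runs once the simulated server answers.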

Closures

https://hopefuldisasters.files.wordpress.com/2015/06/closure.jpg

Let’s talk about the tasks themselves first. A task is nothing more than a block of Javascript code and a set of references to the variables that that block has access to. For example:

var a = "Scheduled", b = "Not scheduled";
setTimeout(function () {
  console.log(a);
}, 10);

console.log(b);

This code schedules a task to be executed in roughly 10ms (the delay is a minimum, not a guarantee). That task has access to all the variables that were in scope where the function was declared, that is a and b. A function bundled with its scope in this way is called a closure, and it is a pretty common pattern in functional languages. The feature that we just described is called lexical scoping, which Javascript supports, as opposed to dynamic scoping, where a function sees the variables of whoever happens to call it. Dynamic scoping is found for example in Bash and, traditionally, in Emacs Lisp.

The above code should output:

Not scheduled
Scheduled

And that makes perfect sense: we schedule the output of the value of a, then we output the value of b, and about 10ms later the scheduled task outputs the value of a.
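
Since a closure holds references to variables rather than snapshots of their values, a change made after scheduling is still seen by the task. A small sketch of that, where greeting and seenByTask are illustrative names:

```javascript
var greeting = "before";
var seenByTask = [];

setTimeout(function () {
  // Reads the variable when the task runs, not when it was scheduled
  seenByTask.push(greeting);
  console.log(greeting); // prints "after"
}, 10);

// Reassigned after the task was scheduled but before it runs
greeting = "after";
```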

Message queue

http://s3.vidimg.popscreen.com/original/7/eHFoeGE0MTI=_o_mr-bean---back-of-the-hospital-queue.jpg

Consider now the following example:

var a = "Scheduled", b = "Not scheduled";
setTimeout(function () {
  console.log(a);
}, 0);

console.log(b);

This outputs again:

Not scheduled
Scheduled

That may seem slightly strange: we scheduled the first log to be executed immediately, yet the command that follows it is executed first. That is because “immediately” actually means “when you get the time”. Let us examine a demonstrative use case from Facebook:

  • The user writes a comment for a Facebook post but does not yet click “Post comment”
  • A new comment is received
  • Two likes are received by the server for that post
  • The user clicks “Post comment”
  • The comment is rendered

Javascript maintains a message queue and puts in there any task it needs to accomplish. It always processes the first task and pushes new tasks to the back of the queue. Therefore what actually happens is:

  • The received comment starts to render
  • The likes are received and their rendering is queued
  • The “Post comment” button is clicked and comment posting is queued
  • The comment rendering finishes
  • Each of the likes is rendered
  • The user’s comment gets posted.
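
The sequence above can be imitated with a toy queue. This is a deliberately simplified model and not how an engine is implemented; queue, enqueue, processQueue and rendered are all illustrative names.

```javascript
var queue = [];     // toy message queue, used first-in first-out
var rendered = [];  // record of what happened, in order

function enqueue(task) {
  queue.push(task); // new tasks always go to the back
}

function processQueue() {
  while (queue.length > 0) {
    var task = queue.shift(); // take the task at the front
    task();                   // run it to completion before the next one
  }
}

enqueue(function renderReceivedComment() {
  rendered.push("received comment: rendering starts");
  // Events arriving while this task runs can only be queued behind it
  enqueue(function renderLike1() { rendered.push("like 1 rendered"); });
  enqueue(function renderLike2() { rendered.push("like 2 rendered"); });
  enqueue(function postUserComment() { rendered.push("user comment posted"); });
  rendered.push("received comment: rendering finishes");
});

processQueue();
console.log(rendered.join("\n"));
```

The log shows the received comment finishing before either like is rendered, and the user's comment being posted last, which matches the order in the list.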

https://dl.dropboxusercontent.com/u/22317129/diagram.png

Of course this is a rather simplified interpretation of what is going on, and a modern computer will make all the above seem instantaneous.

Putting it all together

http://www.quickmeme.com/img/7b/7b338e4415085aa045134db0e6416067bf9e9c527732187ec6fcb69c3f38a19c.jpg

Here is a more complex example:

var nextTask = 0, outOfOrder = false;
for (var i = 0; i < 1000; i++) {
  (function (i) { // Per task specific scope
    setTimeout(function task () {
      if (nextTask++ !== i) {
        console.log("Out of order!", i, nextTask);
        outOfOrder = true;
      }
    }, 0);
  })(i);  // end of per task scope
}

// Scheduled last, so this check runs only after all 1000 tasks have run.
// Checked directly after the loop it would run before any of them.
setTimeout(function check () {
  if (!outOfOrder) {
    console.log("All functions were called in the order they were scheduled");
  }
}, 0);

What is going on here:

  • 1000 tasks are scheduled and each one gets assigned an id (i) that represents the order in which it was defined.
  • Each task compares a shared value nextTask with its id.
  • nextTask is incremented, thus predicting which will be the next task to be executed.
  • If all 1000 tasks can predict the next task correctly, it is a strong indication that Javascript has a very deterministic manner of scheduling tasks (spoiler alert: it does).

There are multiple points to be made about this piece of code. Notice first the pattern (function (i) {...})(i) in the for loop. What happens here is that we create a bunch of functions, each of them named task. Due to lexical scoping they all have access to the variables nextTask, outOfOrder and i of the enclosing scope. When a function is defined it doesn’t keep a copy of the outer scope, it keeps a reference to it. Therefore not only are changes to the scope visible from inside the function, but the scope is mutable by the function itself. This way we can read and write the variable nextTask from each task and have that change be visible to the rest of the tasks. The same however would apply to the loop variable i.

Without the wrapper, each function would get a reference to the same i, and i is changed by the loop. By the time a task got executed the value of i would be different than it was when the function was defined. In fact all the tasks run after the loop has finished, so they would all see 1000. That is not what we want here. We want each function to have a separate value for i, but to share the value of nextTask. The way we get around this problem is by creating a scope specific to each one of the tasks. Whenever a function is called, a local scope is created to accommodate its arguments and locally declared variables. Thus we write function (i) { /*scope*/ } and call it immediately, copying the current value of i to the argument list. Within the context created this way we define our task, which now has access to a copied version of i and a shared reference to nextTask. Once the task is defined and scheduled the wrapper function returns, and the only references left to the created copies of i are the closures of each task. Voila! Each task has its own personal copy of i.
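
The difference between a shared and a copied i is easier to see in a smaller loop. In this sketch the arrays shared and copied are illustrative names; the first timer closes over the loop variable directly and the second goes through a per-task scope.

```javascript
var shared = [], copied = [];

for (var i = 0; i < 3; i++) {
  // No per-task scope: every closure refers to the same i
  setTimeout(function () { shared.push(i); }, 0);

  // Per-task scope: each closure gets its own copy of i
  (function (i) {
    setTimeout(function () { copied.push(i); }, 0);
  })(i);
}

// Scheduled last, so it runs after the six tasks above
setTimeout(function () {
  console.log(shared); // [ 3, 3, 3 ]
  console.log(copied); // [ 0, 1, 2 ]
}, 0);
```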

Back to the scheduling issue: the output of this code snippet as you may have guessed is

All functions were called in the order they were scheduled

Whenever the timer of a task defined with setTimeout expires, the task gets pushed to the back of the message queue. When the currently running code block finishes, a message is taken from the front of the message queue and processed. In our case we push 1000 functions to the message queue and then each one of them gets executed in the order they were pushed there. The exact same thing happens when an event triggers a function: the function is pushed to the back of the queue and waits for its turn to be executed.
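
A consequence worth seeing once: a task whose timer has already expired still has to wait for the running block to finish. The sketch below keeps the current block busy for about 50ms on purpose; start, firedAfter and stillPending are illustrative names.

```javascript
var start = Date.now();
var firedAfter = null;

setTimeout(function () {
  firedAfter = Date.now() - start;
  console.log("0ms timer fired after " + firedAfter + "ms");
}, 0);

// Deliberately hog the current block for about 50ms
while (Date.now() - start < 50) {
  // busy-wait
}

// Still inside the same block, so the timer task cannot have run yet
var stillPending = (firedAfter === null);
console.log("timer still pending after the busy loop:", stillPending);
```

The timer asked for 0ms but reports at least 50ms, because its task could only be taken from the queue after the busy block returned.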

Wrapping up

http://www.yenmag.net/wp-content/uploads/2013/05/etsy-candles.jpg

Now for some interesting implications and caveats of this model of concurrency:

Background tabs will clamp timers to 1s

If you don’t develop for the browser, or if you only depend on timer precision for animations, then you probably won’t even notice that once a tab becomes inactive browsers typically clamp the delay of all setTimeout and setInterval calls to at least 1000ms. We will talk about how to get around that in a future post.

Scheduling tasks allows the system to be more responsive

When executing computationally intense tasks it is usually better to break them up into different tasks and throw them individually in the queue to keep the rest of the system responsive. For example you may want to consider replacing something like:

enormousText.replace(/[\.,?"':;!)( ]+/g, " ").split(' ').some(misspelledWord)

with something along the lines of

function checkText (enormousText, cb) {
  setTimeout(function checkFirstSentence () {
    // Separate the first sentence from the rest of the text
    var dot = enormousText.indexOf('.'),
        sentence = dot < 0 ? enormousText : enormousText.slice(0, dot),
        rest = dot < 0 ? '' : enormousText.slice(dot + 1);

    // If a word was misspelled stop
    if (sentence
        .replace(/[\.,?"':;!)( ]+/g, " ")
        .split(' ')
        .some(misspelledWord)) {
      cb(false);
      return;
    }

    // Check if that was the last sentence
    if (!rest) {
      cb(true);
      return;
    }

    // Continue with the remaining sentences in a new task
    checkText(rest, cb);
  }, 0);
}

The second version checks the sentences of the text one by one in separate tasks. It may be more complex, but if an event occurs during the computation (e.g. a mouse hover) it will be served after the current sentence is spellchecked instead of waiting for the whole text to be processed. Another side effect is that checkText is now a higher order function, i.e. it accepts a function (cb) as an argument through which it emits the result of the computation.
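
The same idea can be generalised to any list of items. The helper below is a sketch under assumptions: someInChunks is a hypothetical name, not part of any library, and the dictionary, the word list and this misspelledWord are made up for the demonstration. It behaves like Array.prototype.some but processes one chunk per task and reports the result through a callback.

```javascript
// Hypothetical helper: like items.some(predicate), but handles
// chunkSize items per task and reports the result through cb.
function someInChunks(items, predicate, chunkSize, cb) {
  var index = 0;
  function nextChunk() {
    var end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      if (predicate(items[index])) { cb(true); return; }
    }
    if (index >= items.length) { cb(false); return; }
    setTimeout(nextChunk, 0); // give queued tasks a turn before continuing
  }
  setTimeout(nextChunk, 0);   // even the first chunk runs as its own task
}

// Made-up dictionary and checker, for illustration only
var dictionary = { the: true, quick: true, brown: true, fox: true };
var order = [];
function misspelledWord(word) {
  order.push("checked " + word);
  return !dictionary[word];
}

someInChunks(["the", "quick", "brwon", "fox"], misspelledWord, 2, function (found) {
  order.push("misspelled word found: " + found);
  console.log(order.join("\n"));
});

// Stands in for an unrelated event arriving during the computation
setTimeout(function () { order.push("other event served"); }, 0);
```

In the recorded order the unrelated event lands between the first and the second chunk, which is exactly the responsiveness the chunking is meant to buy.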

Prefer callbacks to return values

Javascript functions are first class citizens in the sense that they can be used as data in the same way integers, strings and objects can. Using return values instead of callbacks is usually a bad idea. Even if the current implementation of your function can do all its work within a single task, you never know when you will want to delegate the computation to the server or apply the above technique to improve responsiveness.
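
To illustrate, here are two interchangeable implementations behind the same callback interface. Both are hypothetical: countWordsLocal does the work in a single task, while countWordsRemote only pretends to ask a server by using a timer. The caller, report, does not change between them.

```javascript
var counts = [];

// Does all its work locally, but still answers through the callback
function countWordsLocal(text, cb) {
  setTimeout(function () {
    cb(text.split(/\s+/).filter(Boolean).length);
  }, 0);
}

// Pretends to delegate to a server; the timer simulates the round trip
function countWordsRemote(text, cb) {
  setTimeout(function () {
    cb(text.split(/\s+/).filter(Boolean).length);
  }, 20);
}

// The caller is written once and works with either implementation
function report(count) {
  counts.push(count);
  console.log("words:", count);
}

countWordsLocal("gazing into the abyss", report);
countWordsRemote("gazing into the abyss", report);
```

Had the first version returned its result instead, every caller would need rewriting on the day the work moved to the server.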
