Skip to content

Instantly share code, notes, and snippets.

@bjouhier
Created April 11, 2012 20:01
Show Gist options
  • Save bjouhier/2362015 to your computer and use it in GitHub Desktop.
Save bjouhier/2362015 to your computer and use it in GitHub Desktop.
streamline vs. callbacks bench
"use strict";
var fs = require('fs');
var cache = {}, hit = 0, missed = 0;
function load(name, cb) {
var res = cache[name];
if (res) {
process.nextTick(function() {
hit++;
cb(null, res);
});
} else {
fs.readFile(name, function(err, data) {
missed++;
cb(null, cache[name] = data);
});
}
}
var count = 1000000;
function bench(cb) {
var total = 0;
function loop(i) {
if (i === count) {
cb(null, total);
} else {
load(__dirname + '/benchCallbacks.js', function(err, data) {
if (err) return cb(err);
total += data.length;
loop(i + 1);
})
}
}
loop(0);
}
var t0 = Date.now();
bench(function(err, result) {
if (err) throw err;
console.log('hit=' + hit + ', missed=' + missed + ', result=' + result);
console.log('elapsed: ' + (Date.now() - t0));
});
"use strict";
var fs = require('fs');
var cache = {}, hit = 0, missed = 0;
function load(name, _) {
var res = cache[name];
if (res) {
hit++;
return res;
} else {
missed++;
return cache[name] = fs.readFile(name, _);
}
}
var count = 1000000;
function bench(_) {
var total = 0;
for (var i = 0; i < count; i++) {
var res = load(__dirname + '/benchCallbacks.js', _);
total += res.length;
}
return total;
}
var t0 = Date.now();
var result = bench(_);
console.log('hit=' + hit + ', missed=' + missed + ', result=' + result);
console.log('elapsed: ' + (Date.now() - t0));
@bjouhier
Copy link
Author

Results (macbook pro / lion):

$ _node --fibers benchStreamline
hit=999999, missed=1, result=798000000
elapsed: 325
$ _node benchStreamline
hit=999999, missed=1, result=798000000
elapsed: 1705
$ node benchCallbacks
hit=999999, missed=1, result=798000000
elapsed: 3252

@thejh
Copy link

thejh commented Apr 11, 2012

Why process.nextTick in the plain JS example? If you document it correctly, it should work without, too.

@thejh
Copy link

thejh commented Apr 11, 2012

if (err) cb(err); in plain JS looks wrong to me - why no return?

@bjouhier
Copy link
Author

Try it without process.nextTick !!!

The other one is a typo but it won't make any difference. I'm fixing it.

@thejh
Copy link

thejh commented Apr 11, 2012

Ahaha, a stack overflow :D

@creationix
Copy link

Well done Bruno! Though the process.nextTick can obviously be worked around, streamline does somehow in it's generated js. ;) I guess the point is that sometimes C compilers write better assembly than humans.

@creationix
Copy link

Hmm, I just noticed you're not using the same bench function. That's kinda cheating. It's the recursive nature of the first bench function that requires using process.nextTick there.

@bjouhier
Copy link
Author

A bit of explanation:

  • Streamline in callback mode: The loop is transformed into a recursive form by the streamline compiler. So I'm not cheating here. But streamline does not use nextTick to avoid the stack overflow. It uses a trampoline instead, and that turns out to be faster. Early versions of streamline used nextTick but this was not consistent with the semantics I wanted to have because it broke the principle that all yielding points remain explicitly marked with an _. This is why I went with a trampoline.
  • Streamline in fibers mode: the loop is not transformed. The first call to the load function takes a performance hit because it creates a fiber, yields, etc. but the 999,999 subsequent calls execute as direct JS calls. This explains why the fibers mode wins by 10 to 1. But I would not call this cheating because the whole point about fibers is that it does not force you to transform the JS flow control constructs (loops, if/else, try/catch).

@creationix
Copy link

Great explanation Bruno, but then wouldn't it be more fair to use a trampoline in the bench function for the pure js version? Though the code will be more ugly and probably not what I would have written by hand.

(Btw, I had never heard this term. Now that I looked it up I have a word for a technique I used sometimes but didn't know how to explain)

@bjouhier
Copy link
Author

Well, the claim that so many people make is that hand-written code can only be faster. So, I wanted to compare streamline with callback code that people would normally write. I am sure that if I had posted the callback version above and asked if this was the right way to write a caching function and a loop, nobody would have complained that it should have used a trampoline instead because it is faster.

Also, streamline cannot use the standard while (typeof fn === "function") fn = fn(); trampoline because it has to deal with non-streamline functions that don't trampoline. So it has to use a more complex construct, which is a bit tricky to understand. I really doubt that anyone would feel like writing this construct every time he writes a loop.

But of course, I could have taken the code generated by streamline, removed a bit of extra fluff (like the little frame object that I use to generate long stack traces) and I would have obtained a faster version. But that's not the way people hand-write their callback code.

@creationix
Copy link

Ok, it took me a while to figure out, but I found a way to not need the nextTick using a trampoline style trick:

var count = 1000000;

function bench(cb) {
  var total = 0;
  function loop(i) {
    var async;
    while (async !== true) {
      async = undefined;
      if (i === count) {
        cb(null, total);
      } else {
        load(__dirname + '/benchCallbacks.js', function(err, data) {
          if (err) return cb(err);
          total += data.length;
          if (async) {
            loop(i + 1);
          } else {
            async = false;
            i++;
          }
        });
      }
      if (async !== false) {
        async = true;
      }
    }
  }
  loop(0);
}

With this I get speeds close to the streamline transform version.

While the speed differences in all these rarely matters in any real world application, the semantic difference between the trampoline, nextTick, and fibers is real and does matter.

@creationix
Copy link

FWIW, here are the speed numbers I'm getting on my machine with my new trampoline style version as benchCallbacks2.js.

tim ~/gist-2362015 $ uname -a
Linux touchsmart 3.2.13-1-ARCH #1 SMP PREEMPT Sat Mar 24 09:10:39 CET 2012 x86_64 Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz GenuineIntel GNU/Linux
tim ~/gist-2362015 $ node benchCallbacks.js 
hit=999999, missed=1, result=805000000
elapsed: 1297
tim ~/gist-2362015 $ _node benchStreamline
hit=999999, missed=1, result=805000000
elapsed: 1001
tim ~/gist-2362015 $ _node --fibers benchStreamline
hit=999999, missed=1, result=805000000
elapsed: 180
tim ~/gist-2362015 $ node benchCallbacks2.js 
hit=999999, missed=1, result=805000000
elapsed: 308

@bjouhier
Copy link
Author

Would have been a bit faster to cut and paste the streamline code in the online demo:

http://sage.github.com/streamlinejs/examples/streamlineMe/streamlineMe.html

My pattern has two loops. There is another one inside the callback. I think that this was necessary because I sometimes end up trampolining from the callback but this may also be something not very clever about my code generation (which is a bit contrained by other factors).

@isaacs
Copy link

isaacs commented Apr 13, 2012

When was the last time you had to load the same thing a million times in a row in a tight loop?

I can't figure out what program this benchmark is supposed to represent.

@nalply
Copy link

nalply commented Apr 13, 2012

What Bruno is doing is important because one should not write callback-style code in every case. Business logic for example or you are setting yourself up to a maintenance nightmare. Library code is fine because that's the hard core Node guys are accustomed to. :-)

I work currently at a (non-Node) project where the son of the boss knows a little programming. This is very helpful for the project because this makes communication a lot easier: he functions as a relay between the company and software development. But I cannot imagine that he could write or understand business logic in callback-style. I definitively would use Streamline in such a case!

The million times in a row is just one of the corner cases: Javascript does not do tail calls. creationix already mentioned that he used a trampoline, that means, the Node hard core guys already know a lot about these corner cases.

I have a plea to the Node community: Please pay more respect to Bruno! I think what he does is saving Node in the long term from fading away into a niche for really clever programmers.

@bjouhier
Copy link
Author

@JeanHuguesRobert
Copy link

As a side note, could it be that the fiber version runs faster because async calls are so much more expensive than sync ones?

See http://jsperf.com/asynch-cost

Surprised? well... you now understand better my "node anxiety" issue -- http://news.ycombinator.com/item?id=2371152

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment