@bjouhier
Created April 11, 2012 20:01
streamline vs. callbacks bench
"use strict";
// Callback version (benchCallbacks.js).
var fs = require('fs');
var cache = {}, hit = 0, missed = 0;

// Load a file, serving repeat requests from the in-memory cache.
function load(name, cb) {
  var res = cache[name];
  if (res) {
    // Defer the callback so that a cache hit never calls back synchronously.
    process.nextTick(function() {
      hit++;
      cb(null, res);
    });
  } else {
    fs.readFile(name, function(err, data) {
      // Propagate read errors instead of caching an undefined result.
      if (err) return cb(err);
      missed++;
      cb(null, cache[name] = data);
    });
  }
}

var count = 1000000;

// Load the same file `count` times in sequence and sum the lengths.
function bench(cb) {
  var total = 0;
  function loop(i) {
    if (i === count) {
      cb(null, total);
    } else {
      load(__dirname + '/benchCallbacks.js', function(err, data) {
        if (err) return cb(err);
        total += data.length;
        loop(i + 1);
      });
    }
  }
  loop(0);
}

var t0 = Date.now();
bench(function(err, result) {
  if (err) throw err;
  console.log('hit=' + hit + ', missed=' + missed + ', result=' + result);
  console.log('elapsed: ' + (Date.now() - t0));
});

"use strict";
// Streamline version (benchStreamline); `_` marks the callback slot.
var fs = require('fs');
var cache = {}, hit = 0, missed = 0;

// Load a file, serving repeat requests from the in-memory cache.
function load(name, _) {
  var res = cache[name];
  if (res) {
    hit++;
    return res;
  } else {
    missed++;
    return cache[name] = fs.readFile(name, _);
  }
}

var count = 1000000;

// Load the same file `count` times in sequence and sum the lengths.
function bench(_) {
  var total = 0;
  for (var i = 0; i < count; i++) {
    var res = load(__dirname + '/benchCallbacks.js', _);
    total += res.length;
  }
  return total;
}

var t0 = Date.now();
var result = bench(_);
console.log('hit=' + hit + ', missed=' + missed + ', result=' + result);
console.log('elapsed: ' + (Date.now() - t0));

@bjouhier
Author

Well, the claim that so many people make is that hand-written code can only be faster. So, I wanted to compare streamline with callback code that people would normally write. I am sure that if I had posted the callback version above and asked if this was the right way to write a caching function and a loop, nobody would have complained that it should have used a trampoline instead because it is faster.

Also, streamline cannot use the standard while (typeof fn === "function") fn = fn(); trampoline because it has to deal with non-streamline functions that don't trampoline. So it has to use a more complex construct, which is a bit tricky to understand. I really doubt that anyone would feel like writing this construct every time they write a loop.
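
For reference, the standard trampoline mentioned here can be sketched as below. This is only an illustration of the plain pattern, not the code that streamline generates; `trampoline` and `sumTo` are hypothetical names chosen for the example.

```javascript
// Keep calling the returned thunks until a non-function value comes back.
function trampoline(fn) {
  while (typeof fn === "function") fn = fn();
  return fn;
}

// Hypothetical example: sum 1..n by returning a thunk instead of recursing,
// so the stack depth stays constant no matter how large n is.
function sumTo(n, acc) {
  if (n === 0) return acc;
  return function() { return sumTo(n - 1, acc + n); };
}

var total = trampoline(sumTo(1000000, 0));
console.log(total); // prints 500000500000
```

The pattern only works when every function in the chain cooperates by returning a thunk, which is why it breaks down as soon as ordinary callback-taking functions are involved.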

But of course, I could have taken the code generated by streamline, removed a bit of extra fluff (like the little frame object that I use to generate long stack traces) and I would have obtained a faster version. But that's not the way people hand-write their callback code.

@creationix

OK, it took me a while to figure out, but I found a way to avoid the nextTick by using a trampoline-style trick:

var count = 1000000;

function bench(cb) {
  var total = 0;
  function loop(i) {
    var async;
    while (async !== true) {
      async = undefined;
      if (i === count) {
        cb(null, total);
      } else {
        load(__dirname + '/benchCallbacks.js', function(err, data) {
          if (err) return cb(err);
          total += data.length;
          if (async) {
            loop(i + 1);
          } else {
            async = false;
            i++;
          }
        });
      }
      if (async !== false) {
        async = true;
      }
    }
  }
  loop(0);
}

With this I get speeds close to the streamline transform version.

While the speed differences among all of these rarely matter in any real-world application, the semantic differences between the trampoline, nextTick, and fibers are real and do matter.
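
One part of that semantic difference can be sketched as follows: a callback invoked synchronously runs before the code that follows the call, while one deferred with `process.nextTick` runs after it. `loadSync` and `loadDeferred` are hypothetical stand-ins for a cache hit, not functions from the benchmark.

```javascript
var log = [];

// Hypothetical cache hit that calls back synchronously.
function loadSync(cb) {
  cb(null, "data");
}

// Hypothetical cache hit that defers the callback with process.nextTick.
function loadDeferred(cb) {
  process.nextTick(function() {
    cb(null, "data");
  });
}

loadSync(function() { log.push("sync callback"); });
log.push("after loadSync");

loadDeferred(function() {
  log.push("deferred callback");
  // prints: sync callback -> after loadSync -> after loadDeferred -> deferred callback
  console.log(log.join(" -> "));
});
log.push("after loadDeferred");
```

Code that relies on state set up after the call behaves differently in the two cases, so swapping one style for the other is not purely a performance decision.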

@creationix

FWIW, here are the speed numbers I'm getting on my machine with my new trampoline style version as benchCallbacks2.js.

tim ~/gist-2362015 $ uname -a
Linux touchsmart 3.2.13-1-ARCH #1 SMP PREEMPT Sat Mar 24 09:10:39 CET 2012 x86_64 Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz GenuineIntel GNU/Linux
tim ~/gist-2362015 $ node benchCallbacks.js 
hit=999999, missed=1, result=805000000
elapsed: 1297
tim ~/gist-2362015 $ _node benchStreamline
hit=999999, missed=1, result=805000000
elapsed: 1001
tim ~/gist-2362015 $ _node --fibers benchStreamline
hit=999999, missed=1, result=805000000
elapsed: 180
tim ~/gist-2362015 $ node benchCallbacks2.js 
hit=999999, missed=1, result=805000000
elapsed: 308

@bjouhier
Author

Would have been a bit faster to cut and paste the streamline code in the online demo:

http://sage.github.com/streamlinejs/examples/streamlineMe/streamlineMe.html

My pattern has two loops: there is a second one inside the callback. I think this was necessary because I sometimes end up trampolining from the callback, but it may also be something not very clever in my code generation (which is a bit constrained by other factors).

@isaacs

isaacs commented Apr 13, 2012

When was the last time you had to load the same thing a million times in a row in a tight loop?

I can't figure out what program this benchmark is supposed to represent.

@nalply

nalply commented Apr 13, 2012

What Bruno is doing is important because one should not write callback-style code in every case. Business logic, for example, should not be written that way, or you are setting yourself up for a maintenance nightmare. Library code is fine because that's what the hard-core Node guys are accustomed to. :-)

I currently work on a (non-Node) project where the son of the boss knows a little programming. This is very helpful for the project because it makes communication a lot easier: he acts as a relay between the company and software development. But I cannot imagine that he could write or understand business logic in callback style. I definitely would use Streamline in such a case!

The million times in a row is just one of the corner cases: JavaScript does not do tail calls. creationix already mentioned that he used a trampoline, which means the hard-core Node guys already know a lot about these corner cases.
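
To make the corner case concrete, here is a minimal sketch of what happens when the benchmark's loop is driven by a callback that fires synchronously, with no nextTick and no trampoline. `loadSync` is a hypothetical stand-in for a cache hit, and the overflow assumes Node's default stack size.

```javascript
var count = 1000000;

// Hypothetical cache hit that calls back synchronously (no nextTick).
function loadSync(cb) {
  cb(null, 1);
}

function bench(cb) {
  var total = 0;
  function loop(i) {
    if (i === count) return cb(null, total);
    loadSync(function(err, n) {
      total += n;
      loop(i + 1); // every iteration nests two more stack frames
    });
  }
  loop(0);
}

var overflowed = false;
try {
  bench(function(err, total) {
    console.log("finished: " + total);
  });
} catch (e) {
  // Without tail calls, the nested frames exhaust the stack long before
  // the loop reaches `count`.
  overflowed = e instanceof RangeError;
}
console.log("stack overflow: " + overflowed);
```

Deferring each callback with `process.nextTick`, or running the loop through a trampoline, keeps the stack depth constant.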

I have a plea to the Node community: Please pay more respect to Bruno! I think what he does is saving Node in the long term from fading away into a niche for really clever programmers.

@bjouhier
Author

@JeanHuguesRobert

As a side note, could it be that the fiber version runs faster because async calls are so much more expensive than sync ones?

See http://jsperf.com/asynch-cost

Surprised? Well... now you understand my "node anxiety" issue better -- http://news.ycombinator.com/item?id=2371152
