RobertFischer/Description.md

## Description.md

      
So, I was reading "Why You shouldn’t use lodash anymore and use pure JavaScript instead",
because once upon a time, I shifted from Underscore to Lodash, and I'm always on the lookout for the bestest
JavaScript stdlib. At the same time, there was recently an interesting conversation on Twitter about how some of React's
functionality can be easily implemented in modern vanilla JS. The code that came out of that was elegant and impressive,
and so I have taken that as a message to ask if we really need the framework.
Unfortunately, it didn't start out well. After copy-pasting the ~100 lines of code that Lodash executes to perform a
find, there was then this shocking claim:
.
To give you some perspective on these numbers, let's assume we're kicking around on your laptop, lazily executing at 2.0 GHz.
This puts you at 2,000,000,000 (2 billion) cycles per second, which is also known as 2,000,000 (2 million) cycles
per millisecond. Therefore, to say that Lodash takes 140ms means that it is requiring 280,000,000 (280 million) cycles.
That means that each of the ~100 LoC that Lodash is executing is taking roughly 2.8 million cycles to execute (on average).
Even given the inefficiency of JavaScript in the browser, that's crazy. Something else is going on.
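The arithmetic above can be checked in a few lines. This is only a back-of-the-envelope sketch: the 2.0 GHz clock and the ~100 lines figure are the same rough assumptions used in the text, and the variable names are made up for illustration.

```javascript
// Back-of-the-envelope check of the cycle counts discussed above.
// All inputs are rough assumptions from the text, not measurements.
const clockHz = 2.0e9;               // assumed 2.0 GHz laptop CPU
const cyclesPerMs = clockHz / 1000;  // cycles available per millisecond
const claimedMs = 140;               // the time attributed to Lodash's find
const approxLines = 100;             // roughly 100 lines of Lodash code

const totalCycles = cyclesPerMs * claimedMs;
const cyclesPerLine = totalCycles / approxLines;

console.log('cycles per millisecond:', cyclesPerMs);
console.log('total cycles for 140ms:', totalCycles);
console.log('average cycles per line:', cyclesPerLine);
```

Millions of cycles per line of straightforward JavaScript is far more than such code should plausibly need, which is the hint that the measurement is capturing something other than the find itself.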
The code they executed
was a naive attempt at benchmarking, and it's a good case in point about how NOT to do benchmarking, so I'm taking the educational
opportunity and laying out the critiques. The improved file is below.
### Instantiating Test Data

So, it starts with the instantiation of the data.
This code looks entirely innocuous:
const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];
However, when it comes to benchmarking, you've got a problem. It's entirely possible that the runtime will encounter this
declaration, see the opening bracket, and then skip to the closing bracket without ever fully lexing or parsing the contents.
All the engine really needs to know up front is that `users` is not null/undefined, is an array, and does not use lexically
scoped variables, which can be determined in most cases without even parsing the contents of the array.
Everything else is a runtime detail.
Even if it is parsed, the reification of the parsed content into actual user-space objects may well be delayed until
later: the source code itself for those objects may be held by the JIT, and only turned into bytecode executions on
demand. This is exactly what JIT means: Just In Time.
So, when is that code in the middle going to be processed? When it's first called -- which, in this case, is while we
are timing Lodash. So Lodash's time also (potentially, depending on your runtime) includes the time to parse, lex, and
reify the object. Once we get to the native version, that's already done, so the native doesn't pay that cost.
To fix this, we need to ensure that we've exercised the test data, touching all the pieces that our timing tests will
touch.
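A minimal sketch of what "exercising the test data" could look like: walk the array once, read the field that the timed predicate will read, and use the result so the work cannot be discarded. The `ageSum` variable is a hypothetical name introduced for illustration, and whether a given runtime actually defers this work is an assumption, not something this snippet proves.

```javascript
const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];

// Touch every element and the field the predicate will read (age)
// before any timer is started.
let ageSum = 0;
for (const { age } of users) {
  ageSum += age;
}

// Use the result so the loop cannot be treated as dead code.
console.log('exercised test data, age sum:', ageSum);
```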
### Signal vs. Noise

In both of the original tests, they start the timer. Once the timer starts, they time the following things:

1. Time to return from `new Date()` after capturing the result of the gettime system call (or equivalent).
2. Time to parse/lex/reify the code backing the test.
3. Time to execute the test (what we want).
4. Time to perform `console.log` to display the result of the test.
5. Time to instantiate another `new Date()`, including another gettime system call (or equivalent).

The issue is that what we want to test is really, really fast. Everything surrounding the test itself takes a lot of
time, which means there is a lot of noise clouding up a very slight signal. If the time for this noise was consistent
between the two executions, that'd arguably be okay -- at least it would be an apples-to-apples comparison -- but given
the variance introduced by JIT, GC, and a system call, I don't have any confidence that the noise is consistent, and
the noise could easily overwhelm the signal.
So what we need to do is boost the signal. The simplest way to do this is to increase the number of times that
we perform the operation. The amount that we have to boost the signal has to do with the amount of noise in your
particular environment, and the amount of memory you're willing to commit to it (keeping in mind that memory allocation
is yet another source of noise), so we'll make that a configuration const that we can play with.
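One way to sketch "boosting the signal" is a small helper that runs the operation many times between a single pair of clock reads and reports the average. This is an illustration under assumptions, not the script used in this write-up: `timePerCall` is a hypothetical helper, and it relies on the global `performance.now()` being available (it is in browsers and in recent Node.js versions). Unlike collecting every result in an array, it keeps only the last result, which avoids adding allocation noise.

```javascript
// Tunable: how many calls to make between the two clock reads.
const iterations = 100000;

// Hypothetical helper: times `count` calls of fn and returns the average.
function timePerCall(fn, count) {
  let last;
  const start = performance.now();
  for (let i = 0; i < count; i++) {
    last = fn(); // keep the result so the call is not trivially removable
  }
  const elapsedMs = performance.now() - start;
  return { msPerCall: elapsedMs / count, last };
}

const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];
const findNative = () => users.find(({ age }) => age < 40);

const result = timePerCall(findNative, iterations);
// Logging happens after the clock has stopped, so it is not part of the measurement.
console.log('native ms/call:', result.msPerCall, 'last result:', result.last.user);
```

The two clock reads and the loop overhead are still included in the measurement, but their cost is spread across all the calls instead of dominating a single one.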
### Hotspot-style Optimizations

The last major concern about this test is hotspot-style optimizations. It's entirely possible that a runtime will spend
additional effort optimizing bytecode which is executed often. This means that running a test once might be showing
you only the entirely unoptimized time. To address this, along with ensuring that you aren't measuring the time to
parse/lex the implementation, we need to run the test a few times before we start the clock.
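The ordering described above, warm up first and only then start the clock, might be sketched like this. The `warmUp` helper and the round counts are hypothetical choices for illustration; how many rounds a particular runtime needs before it optimizes a function is an assumption that varies by engine.

```javascript
const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];
const findNative = () => users.find(({ age }) => age < 40);

// Hypothetical helper: run fn repeatedly with no timer running, so that
// parsing and any hot-path optimization can happen before measurement.
function warmUp(fn, rounds) {
  let sink;
  for (let i = 0; i < rounds; i++) {
    sink = fn();
  }
  return sink;
}

const warmUpRounds = 10000; // assumed value; tune for your runtime
const timedRounds = 10000;

// 1. Warm up with the clock stopped.
const warmResult = warmUp(findNative, warmUpRounds);

// 2. Only now start the clock and run the measured rounds.
let timedResult;
const start = performance.now();
for (let i = 0; i < timedRounds; i++) {
  timedResult = findNative();
}
const elapsedMs = performance.now() - start;

console.log('warm-up result:', warmResult.user);
console.log('timed result:', timedResult.user, 'ms/call:', elapsedMs / timedRounds);
```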
### The Result is Below


## file-vs-file-fixed.js
// Assumes Lodash is already loaded and available as `_`.
// Number of calls per run; raise it to boost the signal over the noise.
const iterations = 1000;
console.log("iterations", iterations);

const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];
// Exercise the test data before any timer starts, touching the field the tests read.
console.log("timing data");
users.forEach(({user,age}) => console.log(user,age));

const startTime = new Date();
console.log("timing start", startTime.toString());

const findLodash = () => _.find(users, ({age}) => age < 40);
// Warm-up run, so parsing and any hot-path optimization happen before the clock starts.
console.log('lodash find warm-up', _.times(iterations, findLodash));
const timerDateFindLodash = new Date();
console.log('lodash find', _.times(iterations, findLodash));
// Subtracting Dates yields milliseconds, so this is ms per operation.
console.log('lodash time (ms/operation)', (new Date() - timerDateFindLodash)/iterations);

const findNative = () => users.find(({ age }) => age < 40);
// Same warm-up and timing procedure for the native version.
console.log('native find warm-up', _.times(iterations, findNative));
const timerDateFindJS = new Date();
console.log('native find', _.times(iterations, findNative));
console.log('native time (ms/operation)', (new Date() - timerDateFindJS)/iterations);

## find-vs-find-original.js
const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];

const timerDateFindLodash = new Date();
console.log('lodash find', find(users, ({ age }) => age < 40));
console.log('lodash time', new Date() - timerDateFindLodash);

const timerDateFindJS = new Date();
console.log('native find', users.find(({ age }) => age < 40));
console.log('native time', new Date() - timerDateFindJS);

## ResultsOfFixed.md

      
### Timing

When I run the fixed script, both the lodash and the native versions run at nearly 0 time, with the native version
being ever so slightly more nearly 0 than lodash. This makes sense, since Array.prototype.find makes assumptions
about its input (not null/undefined, an array-like object, etc.), whereas Lodash does a ton of backflips to try
to coerce the concept of find to match the argument's type.
In short, the punchline: let's not give up Lodash's utility over FUD about performance.
### Addendum

The last concern that I had -- which turned out to be unfounded -- is that the JavaScript JIT would be smart enough to
know that the users could not change, and would therefore just return the same value for each _.find or users.find
without actually executing it again. But it looks like that's not happening, because increasing the number of
iterations seems to increase the total time by a more-or-less constant factor (at least up to 1,000,000, which is
when I stopped testing -- if the JIT hasn't kicked in that optimization by then, it isn't going to for all practical
purposes).
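A rough sketch of that check: time the same call at several iteration counts and see whether the total grows along with the count. The `totalMs` helper and the specific counts are hypothetical choices for illustration, and the reading is only suggestive. A total that stays flat as the count grows would hint that the call is being optimized away, while roughly proportional growth hints that it is really being executed each time.

```javascript
const users = [
  { 'user': 'barney',  'age': 36, 'active': true },
  { 'user': 'fred',    'age': 40, 'active': false },
  { 'user': 'pebbles', 'age': 1,  'active': true }
];
const findNative = () => users.find(({ age }) => age < 40);

// Hypothetical helper: total elapsed time for `count` calls of fn.
// It also counts successful finds so the results are actually used.
function totalMs(fn, count) {
  let hits = 0;
  const start = performance.now();
  for (let i = 0; i < count; i++) {
    if (fn() !== undefined) hits++;
  }
  return { ms: performance.now() - start, hits };
}

const counts = [1000, 10000, 100000, 1000000];
const rows = counts.map((n) => ({ iterations: n, ...totalMs(findNative, n) }));

rows.forEach(({ iterations, ms, hits }) =>
  console.log('iterations', iterations, 'total ms', ms, 'hits', hits));
```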