mraleph/forEach.md

## forEach.md

      
    Raw
  

              forEach.md
            
          
    Making iteration faster in Array built-ins.

Lets imagine we have forEach method. The core loop of this method in V8 is implemented in JavaScript and looks like this:
for (var i = 0; i < length; i++) {
  if (i in array) {
    var element = array[i];
    %_CallFunction(receiver, element, i, array, f);
  }
}
[%Foo(...) is V8's syntax for calling intrinsics and runtime functions from self-hosted runtime code]
the most expensive part of this loop in V8 is i in array check that corresponds to HasProperty check that specification mandates. It is expensive (for no good reason) even if array is dense.
How could we reduce the cost of this check?
Obviously we should rely on V8's ability to track array's kind. For any given array V8 tracks and can quickly check whether this array is

fast & dense: represented as a continuous chunk of memory and every property from 0 to array.length-1 is present;
fast & holey: represented as a continuous chunk of memory but some properties between 0 and array.length-1 are absent, underlying representation contains a special hole value at absent places;
slow: represented as a dictionary, usually happens when there are too many holes.

It makes perfect sense to provide specializations of the loop above for each of these cases. I am going to describe these specializations in JavaScript but this does not mean that they have to be written in JavaScript like that. It's just an easiest way to convey the idea.
fast & dense

This case is the simplest:
if (%_IsFastDenseArray(array)) {
  for (var i = 0; i < length; i++) {
    if (!%_IsFastDenseArray(array) || (i >= array.length)) {
      // If array's kind has changed then switch to generic loop.
      return GenericLoop(i, ...);
    }
    // (i in array) === true here because array is dense.
    var element = array[i];
    %_CallFunction(receiver, element, i, array, f);
  }
}
Array load would in any case check for array's kind and perform a bounds check so we did not add any extra overhead. If optimizer succeeds in fully inlining f and it sees that it does not change array's kind and length (or it knows from somewhere that f is pure) then it can even complete eliminate checks inside the loop.
fast & holey

When we hit a hole we need to go look at the prototype. However no reasonable person ever puts numeric properties on the Array.prototype or Object.prototype and if they do they should be reasonably punished with worse performance :) Optimizing for reasonable case:
if (%_IsFastHoleyArray(array)
    && %_NoElementsInPrototypeChain(array)) {
  for (var i = 0; i < length; i++) {
    if (!%_IsFastHoleyArray(array) 
        || (i >= array.length) 
        || !%_NoElementsInPrototypeChain(array)) {
      // If array's kind changed or callback added 
      // a numeric property into prototype chain then 
      // switch to generic loop.
      return GenericLoop(i, ...);
    }
   
    // Try to load element, usually if we hit hole 
    // we have to go to the prototype
    // but we know that we don't need to as prototype
    // has no such properties.
    // In this case we can just skip any hole we hit.
    var element_or_hole = %_At(array, i);
    if (%_IsHole(element_or_hole)) continue;

    %_CallFunction(receiver, element, i, array, f);
  }
}
Here the biggest question is how to implement %_NoElementsInPrototypeChain efficiently. The simplest solution is to turn this check into a guard: when VM detects that somebody puts numeric properties on either Array.prototype or Object.prototype it disables this optimized specialization of loop and if there is an activation on the stack of this method it deoptimizes it. With such tracking %_NoElementsInPrototypeChain(...) becomes a no-op.
[V8 already applies similar guarding mechanism to prototypes, but instead it tries to detect when new named properties are added/removed and deoptimizes code that was assuming that certain properties were absent/present]
slow

I would argue that if somebody applying forEach to a very sparse array should not expect good performance out of it (given how it is specified). A possible way to optimize this cache is too have an enumeration sequence attached to the dictionary that lists indices to be visited. This enumeration sequence should be discarded every time dictionary properties are added or removed.
if (%_IsSlowArray(array) && %_NoElementsInPrototypeChain(array)) {
  %_RecomputeEnumerationSeq(array, length);
  let seq = %_GetEnumerationSeq(array);
  for (var i = 0; i < seq.length; i++) {
    if (!%_IsSlowArray(array) 
        || (seq !== %_GetEnumerationSeq(array))
        || !%_NoElementsInPrototypeChain(array)) {
      // If array kind or enumeration sequence changed
      // or callback added a numeric property into the 
      // prototype chain or then we switch to generic loop.
      return GenericLoop(seq[i - 1].index + 1, ...);
    }

    // Enumeration sequence can even tell us precisely 
    // where this property is in a dictionary so that we 
    // don't have to perform hash at all.
    var element = %_DictionaryAt(array, seq[i].location);
    %_CallFunction(receiver, element, seq[i].index, array, f);
  }
}