addyosmani/bytecode.md

## bytecode.md

      
Some quick thoughts on https://twitter.com/dan_abramov/status/884892244817346560. It's not ignorant at all to ask how browser vendors approach performance. On the V8 side, we've discussed the challenges of bytecode precompilation a few times this year. Here's my recollection of where we stand on the idea:

JavaScript engines like V8 have to work on multiple architectures. Every version of V8 is different, and so are
the architectures we target. A precompiled bytecode solution would require a system (e.g. the server
or a CDN) to generate bytecode builds for every target architecture, every supported version of V8 and
every version of the JavaScript libraries or bundles that bytecode is being generated for. This is because we would
need to make sure every user accessing a page using that bytecode can still get the final JS successfully executed.
If a cross-browser solution to this problem were desired, the above would also need to be applied to JavaScriptCore,
SpiderMonkey and Chakra. It would need to carefully deliver the right bytecode per target or risk
wasting bandwidth going back and forth between the client and server until a compatible version was
found.

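
To get a feel for how quickly that build matrix grows, here is a rough sketch that simply enumerates the combinations. The engine versions, architectures and bundle names are made-up placeholders (not real support matrices), and `build_matrix` is a hypothetical helper, not part of any real build tool.

```python
from itertools import product

# All values below are invented placeholders for illustration only.
ENGINE_VERSIONS = {
    "v8": ["N-2", "N-1", "N"],
    "jsc": ["N-1", "N"],
    "spidermonkey": ["N-1", "N"],
    "chakra": ["N-1", "N"],
}
ARCHITECTURES = ["x64", "ia32", "arm", "arm64"]
BUNDLE_VERSIONS = ["app-1.0.0", "app-1.0.1", "app-1.1.0"]


def build_matrix(engine_versions, architectures, bundle_versions):
    """Return every (engine, engine_version, arch, bundle) tuple to precompile."""
    targets = []
    for engine, versions in engine_versions.items():
        for version, arch, bundle in product(versions, architectures, bundle_versions):
            targets.append((engine, version, arch, bundle))
    return targets


v8_only = build_matrix({"v8": ENGINE_VERSIONS["v8"]}, ARCHITECTURES, BUNDLE_VERSIONS)
all_engines = build_matrix(ENGINE_VERSIONS, ARCHITECTURES, BUNDLE_VERSIONS)

print(len(v8_only))      # 3 versions * 4 architectures * 3 bundles = 36
print(len(all_engines))  # (3 + 2 + 2 + 2) versions * 4 * 3 = 108
```

Even with these small placeholder numbers, every new engine release or bundle release multiplies the number of artifacts a server or CDN would have to produce and store.
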
In addition, a bytecode solution would need to go through security and validation phases before an engine
could accept something prebuilt coming down the wire. A CDN would also need to support fallbacks for cases where
the target cannot interpret such bytecode (e.g. imagine a browser that doesn't support this bytecode accessing the
service: it would need to be served a normal JS bundle as the fallback).

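
The fallback behaviour could be sketched roughly as follows. This assumes the client somehow advertises an exact (engine, version, architecture) target up front; no such mechanism exists today, and `Target`, `PREBUILT` and `select_response` are hypothetical names invented for this sketch.

```python
from typing import NamedTuple, Optional, Tuple


class Target(NamedTuple):
    engine: str
    engine_version: str
    arch: str


# Hypothetical store of prebuilt artifacts, keyed by exact target.
PREBUILT = {
    Target("v8", "N", "x64"): b"<bytecode for v8 N x64>",
    Target("v8", "N", "arm64"): b"<bytecode for v8 N arm64>",
}
PLAIN_JS = b"console.log('hello');"


def select_response(target: Optional[Target]) -> Tuple[str, bytes]:
    """Serve a prebuilt artifact only on an exact match, else plain JS."""
    if target is not None and target in PREBUILT:
        return "bytecode", PREBUILT[target]
    # Unknown engine, older version, or no target advertised at all.
    return "javascript", PLAIN_JS


print(select_response(Target("v8", "N", "x64"))[0])    # bytecode
print(select_response(Target("v8", "N-1", "x64"))[0])  # javascript
print(select_response(None)[0])                        # javascript
```

Matching on the exact target in a single round trip is what avoids the back-and-forth described above: anything the server does not recognise gets the normal bundle straight away.
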
Practically speaking, given how different every JS engine is, we would likely need to craft something higher level
than Ignition in V8 to be able to explore such an idea. I've discussed bytecode precompilation with the V8 team
multiple times this year, and there's a risk that unless the exact V8 Ignition bytecode was shipped we would probably still
have to do a lot of the expensive work we are already doing (which makes the idea of standardizing on an intermediate
representation a little trickier).

The Ignition bytecode isn't a stable binary format, nor is it intended to be (we want to be able to change it with every V8 release in order to allow optimizations for new language features or to pass additional operands in the bytecode to be used for type feedback by the optimizing compilers). We would also need to verify any bytecode which came over the wire to ensure it is valid and doesn't contain security exploits (e.g. accessing the stack out of bounds).

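
To make the verification point concrete, here is a toy verifier for a tiny register-based instruction set. The instruction set is invented for this sketch and is not Ignition's real format; it only illustrates the kind of bounds checks an engine would have to run before trusting bytecode from the network.

```python
# Toy instruction set, invented for this sketch (NOT Ignition's real format):
#   ("load_const", dst_reg, const_index)
#   ("add", dst_reg, src_reg_a, src_reg_b)
#   ("jump", target_instruction_index)
#   ("return", src_reg)


def verify(code, num_registers, num_constants):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []

    def in_range(value, limit):
        return isinstance(value, int) and 0 <= value < limit

    def check_reg(pc, reg):
        if not in_range(reg, num_registers):
            problems.append(f"pc {pc}: register {reg} out of bounds")

    for pc, instr in enumerate(code):
        if not isinstance(instr, tuple) or not instr:
            problems.append(f"pc {pc}: malformed instruction {instr!r}")
            continue
        op, *args = instr
        if op == "load_const" and len(args) == 2:
            check_reg(pc, args[0])
            if not in_range(args[1], num_constants):
                problems.append(f"pc {pc}: constant {args[1]} out of bounds")
        elif op == "add" and len(args) == 3:
            for reg in args:
                check_reg(pc, reg)
        elif op == "jump" and len(args) == 1:
            if not in_range(args[0], len(code)):
                problems.append(f"pc {pc}: jump target {args[0]} out of bounds")
        elif op == "return" and len(args) == 1:
            check_reg(pc, args[0])
        else:
            problems.append(f"pc {pc}: unknown or malformed instruction {instr!r}")
    return problems


good = [("load_const", 0, 0), ("load_const", 1, 1), ("add", 2, 0, 1), ("return", 2)]
bad = [("load_const", 0, 5), ("add", 9, 0, 1), ("jump", 42), ("return", 0)]

print(verify(good, num_registers=3, num_constants=2))  # []
for problem in verify(bad, num_registers=3, num_constants=2):
    print(problem)
```

A real verifier would have to go much further than this (control flow, operand types, consistency with feedback metadata), and it would need to be kept in sync with a bytecode format that changes from release to release.
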
The idea itself has a lot of nuances to it. E.g. imagine if a CDN or server shipped code allowing us to avoid parsing
inner functions: it would need to provide us with all the information for the optimizations we do (e.g. tracking
whether variables are assigned after initialization), which is a moving target. On top of that, newer language
features keep appearing, and an intermediate representation would need to keep up to date with them. There's some
skepticism about how much a bytecode solution would actually save (e.g. I don't think you
would see anything near 50% at the point where we just parse and compile a single function at a time).

None of this is to say that the idea can't or shouldn't be explored; it's just a little more complex a problem space than
it might initially seem. An intermediate bytecode representation a CDN could use would need to be something which is
cross-browser, perhaps structurally similar to WebAssembly bytecode (but with JS-level semantics instead of machine semantics).