@addyosmani
Last active May 28, 2022 22:40
Thoughts on precompiling JS bytecode for delivery through a server/CDN

Some quick thoughts on https://twitter.com/dan_abramov/status/884892244817346560. It's not ignorant at all to ask how browser vendors approach performance. On the V8 side we've discussed bytecode precompilation challenges a few times this year. Here's my recollection of where we stand on the idea:

JavaScript engines like V8 have to work on multiple architectures, and every version of V8 is different. A precompiled bytecode solution would require a system (e.g. the server or a CDN) to generate bytecode builds for every target architecture, every supported version of V8, and every version of the JavaScript libraries or bundles the bytecode is being generated for. This is because we would need to make sure every user accessing a page using that bytecode can still get the final JS executed successfully.
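To make the combinatorics concrete, here's a toy sketch (not a real tool; the architecture, engine-version, and library lists are hypothetical placeholders) of how quickly the artifact matrix grows for even a single library:

```javascript
// Hypothetical illustration: every combination of target architecture,
// engine version and library version needs its own precompiled artifact.
const architectures = ['x64', 'ia32', 'arm', 'arm64', 'mips'];
const v8Versions = ['5.9', '6.0', '6.1']; // every supported engine release
const libraryVersions = ['react@15.6.0', 'react@15.6.1'];

const variants = [];
for (const arch of architectures) {
  for (const engine of v8Versions) {
    for (const lib of libraryVersions) {
      variants.push(`${lib}-v8_${engine}-${arch}`);
    }
  }
}

console.log(variants.length); // 5 * 3 * 2 = 30 artifacts for one library
```

And that is for a single engine; a cross-browser scheme multiplies this again per engine and per engine release.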

Consider that if a cross-browser solution to this problem were desired, the above would need to apply to JavaScriptCore, SpiderMonkey and Chakra as well. The system would need to carefully deliver the right bytecode per target, or risk wasting bandwidth going back and forth between the client and server until a compatible version was found.

In addition, a bytecode solution would need to go through security and validation phases before an engine could accept something prebuilt coming down the wire. A CDN would need to support fallbacks for such bytecode not being interpretable by the target (e.g. imagine a browser that doesn't support this bytecode accessing the service: it would need to be served a normal JS bundle as the fallback).
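As a rough sketch of the fallback logic such a CDN would need (the bytecode MIME type, the request fields, and the artifact naming here are all invented for illustration; nothing like this exists today):

```javascript
// Hypothetical server-side selection: serve a matching precompiled
// artifact when one exists, otherwise fall back to plain JavaScript.
function chooseResponse(request, artifacts) {
  const key = `${request.engine}-${request.engineVersion}-${request.arch}`;
  if (artifacts.has(key)) {
    // Invented MIME type for illustration only.
    return { contentType: 'application/x-v8-bytecode', body: artifacts.get(key) };
  }
  // Unknown engine/version/architecture: serve the ordinary JS bundle.
  return { contentType: 'application/javascript', body: artifacts.get('source') };
}

const artifacts = new Map([
  ['source', 'react.min.js'],
  ['v8-6.0-x64', 'react.v8-6.0-x64.bin'],
]);

console.log(chooseResponse({ engine: 'v8', engineVersion: '6.0', arch: 'x64' }, artifacts).body);
console.log(chooseResponse({ engine: 'jsc', engineVersion: '602', arch: 'arm64' }, artifacts).contentType);
```

The hard part is everything this sketch glosses over: reliably identifying the client's engine, version and architecture, and keeping the artifact store in sync with every engine release.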

Practically speaking, given how different every JS engine is, we would likely need to craft something higher level than Ignition in V8 to be able to explore such an idea. Having discussed bytecode precompilation with the V8 team multiple times this year, there's a risk that unless exact V8 Ignition bytecode was shipped, we would still have to do a lot of the expensive work we are already doing (which makes the idea of standardizing on an intermediate representation a little trickier).

The Ignition bytecode isn't a stable binary format, nor is it intended to be (we want to be able to change it with every V8 release in order to allow optimizations to new language features or passing additional operands in the bytecode to be used for type feedback by the optimizing compilers). We would also need to verify any bytecode which came over the wire to ensure it is valid and doesn't contain security exploits (e.g., accessing the stack out-of-bounds).
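To illustrate the kind of check verification involves, here's a toy checker over an invented instruction format (Ldar/Star are borrowed Ignition opcode names, but everything else is simplified; real bytecode verification is far more involved):

```javascript
// Toy sketch: before executing untrusted bytecode, the engine must at
// minimum confirm every register operand stays within the declared
// register file, or the bytecode could read/write the stack out-of-bounds.
function verify(bytecode, registerCount) {
  for (const [op, operand] of bytecode) {
    if (op === 'Ldar' || op === 'Star') {
      if (operand < 0 || operand >= registerCount) return false;
    }
  }
  return true;
}

console.log(verify([['Ldar', 0], ['Star', 1]], 2)); // true
console.log(verify([['Star', 7]], 2));              // false: out-of-bounds write
```

Parsing JS source gives you this safety for free, because the engine only ever emits bytecode it generated itself; accepting prebuilt bytecode means re-deriving those guarantees from untrusted input.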

The idea itself has a lot of nuances to it. E.g. imagine a CDN or server shipped code allowing us to avoid parsing inner functions: it would need to provide us all the information for the optimisations we do (e.g. tracking whether variables are assigned after initialization), which is a moving target, and newer language features would require the intermediate representation to be kept up to date. There's also some skepticism about how much a bytecode solution would actually save (e.g. I don't think you would see anything near 50%, given we already parse and compile only a single function at a time).

None of this is to say that the idea can't or shouldn't be explored; it's just a more complex problem space than it might initially seem. An intermediate bytecode representation a CDN could use would need to be cross-browser, perhaps structurally similar to WebAssembly bytecode (but with JS-level semantics instead of machine semantics).

@addyosmani
Author

Will two sites that use the same library reuse the same bytecode even when they are on two different URLs? (if it is hit a few times within 72 hours)

The bytecode cache is not currently able to match and reuse bytecode for the same resource from different URLs. It's unclear to me why it would need to do this based on the original proposal - if a CDN is hosting React (let's take Cloudflare):

https://cdnjs.cloudflare.com/ajax/libs/react/15.6.1/react.min.js

You're caching this asset for the cloudflare.com origin. My understanding of the current implementation is if URL A and URL B both reference the above, they would be able to take advantage of the bytecode cache for react.min.js.

If we were able to tag a URL or snippets of code with enough info, the browser should be able to skip downloading and parsing it the second time (even on another URL) and use the bytecode at once.

The best way to do this currently would be referencing a CDN URL that is likely to have been code-cached already (which should be true if a number of sites referencing it have been visited). We can chat about what a tagging/annotation proposal could look like, but simply keying on URLs is a pretty low-friction way to take advantage of the bytecode cache.
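Concretely, the URL-keyed sharing described above looks like this (site-a.example and site-b.example are hypothetical pages):

```html
<!-- https://site-a.example/index.html -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/15.6.1/react.min.js"></script>

<!-- https://site-b.example/index.html -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/15.6.1/react.min.js"></script>
```

Because the cache is keyed on the resource URL, whichever page is visited second can reuse the bytecode cached while loading the first.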

@mbrevda

mbrevda commented Jul 15, 2017

It seems there is still a place for non-CDN hosts to say "I can offer you library x with hash y", and for the browser to reply "I've already got x with hash y cached". This would take caching past unique URLs, and allow code hosted by one domain to be reused by a different domain.
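For what it's worth, Subresource Integrity already lets a page declare "library x with hash y", though today browsers only use the hash to verify the fetched bytes, not to share a cache entry across origins (the hash below is a placeholder):

```html
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/15.6.1/react.min.js"
        integrity="sha384-…"
        crossorigin="anonymous"></script>
```

An engine could in principle key a shared code cache on that same hash, though cross-origin cache sharing raises timing/information-leak questions that would need answering first.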
