@bmeck
Last active November 2, 2017 21:41
New path for Node trying to support ES Modules
// ALL ES Modules will be available via import(), roughly translated as (PSEUDO-CODE)
module.exports = System.loader.import(__filename);
// ANY DEPENDENCY THAT IS NOT ES WILL BE TRANSLATED AS THE FOLLOWING ROUGHLY (PSEUDO-CODE)
// note: w/e polyfill Node produces will have a utility method to convert values to es-wrapped Reflect.Modules
// note: require will not be present in real ES modules
// note: this wrapper has the value of `default` defined at end of the *first* require of a non-es module
// ie. mutating module.exports after stack unwinds for CJS will not be visible
// WRAP **ALL** CJS evaluation to have the following at the bottom of evaluation
export default module.exports;
'use strict';
// we need a polyfill for `Reflect.Module` and `System.loader` from https://whatwg.github.io/loader/
// require('some-loader-polyfill');
module.exports = Promise.resolve(new Reflect.Module(
{
// https://whatwg.github.io/loader/#parse-exports-descriptors
},
(mutator, namespace) => {
// this fires synchronously
//
System.loader.registry.set(`file:///${__filename}`, {
mutator,
namespace,
context: Object.freeze({
url: `file:///${__filename}`,
main: `file://${require.main.filename}`,
})
});
},
function () {
// this fires asynchronously
const {
mutator: $hygenic_mut,
namespace: $hygenic_ns,
context: $hygenic_context,
} = System.loader.registry.get(`file:///${__filename}`);
// transpilers:
// NOTE: cannot perform full init check for global variable TDZ, :shrug:
//
// translate `arguments` => `global.arguments`
// translate `module` => `global.module`
// translate `require` => `global.require`
// translate `exports` => `global.exports`
// translate `__filename` => `global.__filename`
// translate `__dirname` => `global.__dirname`
// translate export assignment => $hygenic_mut[k](v)
// translate export access => $hygenic_ns[k]
// translate import fs from 'fs'; =>
// hoisted `const {default: fs} = await System.loader
// .import(..., $hygenic_context.url);`
// upcoming spec:
//
// import.context => $hygenic_context
//
// translate import(...) => Promise.resolve(void 0).then(() =>
// System.loader.import(..., $hygenic_context.url)
// );
//
// translate top level await => ...
// {{code}}
}
)).then(
// evaluate then return the ns
ns => Reflect.Module.evaluate(ns).then(()=>ns)
);
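As a rough illustration of the mutator / namespace split used in the wrapper above, the sketch below builds a read-only namespace whose properties are live views, plus a matching set of setter functions. `createModuleRecord` and everything inside it are hypothetical names made up for this example; they are not part of `Reflect.Module` or the WHATWG loader draft.

```javascript
// Hypothetical stand-in for the (mutator, namespace) pair handed to the wrapper.
function createModuleRecord(exportNames) {
  const values = new Map();
  const namespace = {};
  const mutator = {};
  for (const name of exportNames) {
    // Export access ($hygenic_ns[k]): a getter, so readers always see the latest value.
    Object.defineProperty(namespace, name, {
      enumerable: true,
      get() {
        if (!values.has(name)) {
          throw new ReferenceError(`${name} is not initialized`);
        }
        return values.get(name);
      },
    });
    // Export assignment ($hygenic_mut[k](v)): the only way to change a binding.
    mutator[name] = (value) => {
      values.set(name, value);
      return value;
    };
  }
  return {
    namespace: Object.freeze(namespace),
    mutator: Object.freeze(mutator),
  };
}

const record = createModuleRecord(['count']);
record.mutator.count(1);
console.log(record.namespace.count); // 1
record.mutator.count(2);
console.log(record.namespace.count); // 2
```

Consumers only ever hold `namespace`, so they can observe updates but cannot assign to an export themselves.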
  • Treating import specifiers as URLs means URL encoding would be decoded automatically.
    • things with : in their first path segment get weird (already weird on windows w/ drive letters)
    • things with % in their path segments would be decoded
    • things with ? or # in their paths would face truncation of pathname
  • Async nature
    • transpilers need to ensure dependencies evaluate prior to dependents, but dependents create their NS prior to dependencies
  • The inability to load ES Modules via require means new code should ideally use import() only.
  • Replacing module.exports after CJS has finished evaluation will not be respected.
  • Any CJS Module (including transpiled from ESM) will only produce default exports.
  • import() means destructuring out the default export... which looks crappy. C'est la vie
| CJS form | ES module form | needs external (TC39/WHATWG/...) spec help | current spec/environment (Aug 2016) would allow workflow | notes |
| --- | --- | --- | --- | --- |
| __filename | import.context.url | Y | N | use new URL(relative, import.context.url) for relative pathing |
| __dirname | import.context.url | Y | N | use new URL(relative, import.context.url) for relative pathing |
| require(...) | import {...} from '...'; / await import(...) | Y | N pending await import | this is asynchronous, not blocking |
| exports | import(import.context.url) | Y | N | asynchronous promise, cannot mutate the result, use normal export syntax instead |
| require.extensions | not available | N/A | N/A | deprecated, gone |
| module | not available | N | N/A | necessary data migrating to import.context or possible Map in a new core module |
| require.resolve(...) | TBD | N | Y | will be asynchronous, presumably in a new core module |
| require.cache | TBD | N | N, need cache that is compat with: HTML, Loader, and CJS cache | https://whatwg.github.io/loader/#registry , presumably a new core module |
| require.main / process.mainModule | TBD | N | Y | TODO: how to determine module is an entry point (yes there can be multiple / orphans) |
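The `new URL(relative, import.context.url)` note for `__filename` / `__dirname` can be sketched as follows. `contextUrl` is a made-up stand-in for `import.context.url`, which is only a proposed name here, and the file paths are invented for illustration.

```javascript
// Stand-in for import.context.url; the value is made up for illustration.
const contextUrl = 'file:///app/src/main.mjs';

// __dirname equivalent: resolve "." against the current module's URL.
const dirUrl = new URL('.', contextUrl);
console.log(dirUrl.href); // file:///app/src/

// Relative pathing: resolve against the module URL instead of joining strings.
const settingsUrl = new URL('../config/settings.json', contextUrl);
console.log(settingsUrl.href); // file:///app/config/settings.json

// Gotcha: a "?" in the specifier starts the query and truncates the pathname.
const truncated = new URL('./what?.js', contextUrl);
console.log(truncated.pathname); // /app/src/what
console.log(truncated.search); // ?.js
```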

| old CJS workflow | new CJS workflow | Needs spec work | notes |
| --- | --- | --- | --- |
| require('x') | require('es') (to deprecate) / import('x') (eventual recommendation) | Y for import | returns a Promise<ModuleNamespace>, does not evaluate immediately ever |

There are 2 paths we can take when attempting to implement modules.

  1. Make things work with existing semantics of Babel/Node
  2. Use new semantics that match what web browsers will be doing.

Broken babel assumptions

These breaks will require future transpilers to implement a different interop to match true ES module semantics. This will cause a major break for any currently transpiled modules seeking to upgrade to semantics compatible with any real ES interop.

With this in mind, I highly recommend a new approach: rather than seeking to keep existing Babel/Node semantics, we make efforts to have Node and browser semantics match as closely as possible. I will make efforts to move proposals forward to regain any functionality lost during the transition, but my new priority is a unified ecosystem in the future, even if the existing ecosystem requires different syntax/semantics/workflows to achieve the same behavior.

  • VM/Spec cannot feasibly implement named imports from non-ES modules (CJS/JSON/C++)
    • same issues that with() has with regard to variable access deopt/guarding
    • problems with value of this when accessing variables, exporting something like a Promise loses its this value
    • problems with transitivity, by creating a "view" of the value being exported we change how people could re-export ( microsoft/TypeScript#4513 , microsoft/TypeScript#9562 )
  • Top level await coming means problems with synchronous execution of ES modules
    • top level await is needed for dynamic paths / conditional loading when initializing a module
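The `this` problem from the list above can be seen with a small sketch. It assumes a CJS module whose `module.exports` is a Promise, and uses plain destructuring as a stand-in for a named import from that module.

```javascript
// Stand-in for a CJS module whose module.exports is a Promise.
const cjsExports = Promise.resolve('ready');

// Roughly what a named import of `then` would hand the importer:
// the function value, detached from the object it lived on.
const { then } = cjsExports;

let detachedError = null;
try {
  then(() => {}); // called without a receiver
} catch (err) {
  detachedError = err;
}
console.log(detachedError instanceof TypeError); // true

// Going through the default export keeps the receiver intact.
cjsExports.then((value) => console.log(value)); // ready
```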

Web compat concerns

Let's use this: import(URI)=>Promise. ANY URI provided will not start evaluation until after the stack unwinds, even if the Module is cached or the URI is a data: URI. I have confirmation via Domenic that given import("data:application/javascript,alert(1)");alert(2), alert(2) would always fire first.

To avoid zalgo of differing timing, we should assert that we must do the same.
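A small sketch of that ordering, assuming a Node version that supports dynamic `import()` of `data:` URLs. It uses the `text/javascript` MIME type (which Node's ES module loader accepts for `data:` imports) instead of the `application/javascript` used in the browser example, and `record` is a hypothetical global added only so the imported module can report back.

```javascript
const order = [];
// Hypothetical global hook so the data: module can report when it evaluates.
globalThis.record = (label) => order.push(label);

// The module body is URL-encoded before being placed in the data: URL.
const source = encodeURIComponent('globalThis.record("module evaluated");');
const pending = import(`data:text/javascript,${source}`);

// This statement runs before the imported module is evaluated.
order.push('statement after import()');

pending
  .then(() => console.log(order))
  .catch((err) => console.error('import failed:', err.message));
```

The statement after `import()` should be recorded first, which is the same ordering as the `alert(2)` case above.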

This becomes very important when talking about linking, as it affects whether require(...)=>ModuleNamespace is possible. If evaluation is not done synchronously, Modules cannot define their list of exports synchronously. This effectively prevents synchronous formation of the result of loading a Module. Even if we get the source text and read it synchronously, transpiled modules create their namespace during evaluation, which we would need to defer, so synchronous parsing does not fix this problem.

Even if we could form the ModuleNamespace synchronously, accessing any export that is still in a TDZ will throw an error. Since let and const are becoming more common (and class declarations now behave the same way), this seems like it would be a common problem.
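To make the TDZ concern concrete, here is a sketch that uses an ordinary function body as a stand-in for a module body. `evaluateModule` and `Widget` are made-up names.

```javascript
// A function body stands in for a module body in this sketch.
function evaluateModule() {
  // The namespace object exists before the body has finished running...
  const namespace = {
    get Widget() {
      return Widget;
    },
  };

  let early;
  try {
    early = namespace.Widget; // Widget is still in its TDZ here
  } catch (err) {
    early = err.name;
  }

  // ...and the binding only becomes readable once this declaration runs.
  class Widget {}

  return { early, late: namespace.Widget === Widget };
}

console.log(evaluateModule()); // { early: 'ReferenceError', late: true }
```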

An advantage to the async initialization is that it opens the possibility of async loading of things like network resources which may be impractical to obtain while blocking. Another advantage is this means it may be possible to parallelize IO for reading/parsing of dependencies.

Non-local paths will not be searched

  • NODE_PATH
  • ~/.node_libraries
  • etc.

How to determine how to initialize path searching

  • import.context.url should represent the current script's absolute URL

  • if $path parses via URL parser

    • let url be the result
  • else if ^[.]?[.]?[/] prefixes $path

    • let url be the result of new URL($path, import.context.url)
    • if url points to a directory
      • let url be the result of searching the directory for package.json or index.*
    • if url does not point to a file
      • for all well known file extensions .mjs .js .json .node
        • let searchUrl be url with extension appended to its pathname
        • if searchUrl points to a file
          • let url be searchUrl
  • else

    • let url be the result of searching node_modules using $path and import.context.url
    • NOTE: checks for escaping modules to node_modules/ via ../ going to be added, can be re-added if needed / does not conflict with browser side
  • load url

  • we should support data: and file: out of the box

  • file: should use file type to determine dependency type (ES Module / JSON / C++ / CJS).

  • data: should interpret javascript MIME as ES Module. ES Modules share the MIME with CJS, so we don't have enough data to differentiate, so just assume ES Module since that is what the browser will assume.
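The branching in the steps above can be sketched without touching the file system. `classifySpecifier` is a hypothetical helper: it only decides which branch a specifier falls into and lists the extension candidates that would be probed, leaving the directory, package.json and node_modules searches out.

```javascript
// Well known extensions, in the order listed in the steps above.
const EXTENSIONS = ['.mjs', '.js', '.json', '.node'];

// Hypothetical helper: classify a specifier against the current module's URL.
function classifySpecifier(specifier, contextUrl) {
  // Branch 1: the specifier already parses as a URL on its own.
  try {
    return { kind: 'url', url: new URL(specifier) };
  } catch {
    // Not an absolute URL; fall through to the other branches.
  }

  // Branch 2: "./", "../" or "/" prefix, resolved against the context URL.
  if (/^[.]?[.]?[/]/.test(specifier)) {
    const url = new URL(specifier, contextUrl);
    // Candidates to probe if url itself does not point to a file.
    const candidates = EXTENSIONS.map((extension) => {
      const searchUrl = new URL(url.href);
      searchUrl.pathname += extension;
      return searchUrl.href;
    });
    return { kind: 'relative', url, candidates };
  }

  // Branch 3: anything else is looked up in node_modules.
  return { kind: 'node_modules', name: specifier };
}

const relative = classifySpecifier('./lib/util', 'file:///app/src/main.mjs');
console.log(relative.url.href); // file:///app/src/lib/util
console.log(relative.candidates[0]); // file:///app/src/lib/util.mjs
console.log(classifySpecifier('lodash', 'file:///app/src/main.mjs').kind); // node_modules
```

Because the URL parse is tried first, a specifier such as a Windows drive-letter path would land in the first branch, which is the ":" gotcha noted earlier.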

Determining mode of the Dependency

  • Parsing a file and evaluating as an ESM will be detected using a .mjs file extension

    • explicit, no parsing complexity on system or programmer

      • browsers will not be using source text content to fall back to different mode
      • no accidental swapping of mode by including/excluding import/export declarations
    • mode is static, not determined by consumer which could assume the wrong mode, and not determined at runtime

    • allows "poly packaging", load new file extension prior to .js, older Nodes can fallback

    • out-of-band but in same file, no searching to find where the mode is defined (in source text or in directory structure)
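A sketch of extension-based mode detection for file: URLs. `modeForUrl` and the mapping table are hypothetical; in particular, treating `.js` as CJS is this example's reading of the fallback described above rather than something the proposal spells out.

```javascript
// Assumed mapping from file extension to dependency type.
const MODE_BY_EXTENSION = new Map([
  ['.mjs', 'ES Module'],
  ['.js', 'CJS'],
  ['.json', 'JSON'],
  ['.node', 'C++'],
]);

// Hypothetical helper: the mode depends only on the URL's pathname,
// never on the source text or on who is importing it.
function modeForUrl(url) {
  const { pathname } = new URL(url);
  const dot = pathname.lastIndexOf('.');
  const slash = pathname.lastIndexOf('/');
  const extension = dot > slash ? pathname.slice(dot) : '';
  return MODE_BY_EXTENSION.get(extension) || null;
}

console.log(modeForUrl('file:///app/src/main.mjs')); // ES Module
console.log(modeForUrl('file:///app/src/legacy.js')); // CJS
console.log(modeForUrl('file:///app/README')); // null
```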

Upcoming spec required to implement all of this

These specs are not even proposed yet, but have been talked about at length in previous TC39 meetings. They should be proposed in the next couple of meetings.

  • dynamic import (@dherman)
  • top level await (?)
  • import.context (@bmeck)
@martinheidegger

I tried to find it in the document but failed, so let me ask: are cycles ( https://nodejs.org/api/modules.html#modules_cycles ) considered?

@bmeck
Author

bmeck commented Aug 25, 2016

@martinheidegger they work and can be explained via the steps above, but any gotchas aren't documented here. Unlike the older proposals there should not be a need to throw when you do strange circular dependencies: because everything moves into a Promise / unwinding stack, you will never end up loading ESM inside of a CJS call stack.

@sindresorhus

require.cache TBD N N, need cache that is compat with: HTML, Loader, and CJS cache

It's a very common pattern to use the require.cache to get the parent module. Seems this will no longer be possible. Knowing the parent module is useful to be able to read its package.json and automatically set some options that would usually need to be manually defined.

@bmeck
Author

bmeck commented Sep 6, 2016

@sindresorhus once there is a stable cache API there will be a way to access the cache, but require.cache won't be the API. As for module.parent, we could introduce something if I knew the requirements more than just a link to code but ESM can have multiple concurrent parents.

@sindresorhus

@bmeck Ok, to be clearer, I need to know the path of the caller module.

┌───────────────┐  ┌───────────────┐
│  User module  │  │ package.json  │
└───────┬───────┘  └───────────────┘
        │
        │
        │
┌───────▼───────┐
│   My module   │
└───────────────┘

Here, "My module" needs to know the package name of "User module". Normally, the user would have to specify that in an option when calling "My module" from "User module", but by "My module" getting it automatically from the "User module" package.json, I can make it easier for the user.

@bmeck
Author

bmeck commented Sep 6, 2016

@sindresorhus I understand but ESM has no guarantee that there is a single caller (in fact caller would be a misnomer here)

┌───────────────┐  ┌───────────────┐
│  A module     │  │ B Module      │
└───────┬───────┘  └───────┬───────┘
        \                  /
         \                /
          ------ ▼ ------
                 |
         ┌───────▼───────┐
         │   My module   │
         └───────────────┘

You can have 2 modules pending on My module and have 1 of 2 fail due to a variety of reasons. The modules both concurrently and in parallel are trying to load My module so there is no singular "parent".

@bmeck
Author

bmeck commented Sep 6, 2016

Would need to think on idempotency of https://tc39.github.io/ecma262/#sec-hostresolveimportedmodule wrt this as well @sindresorhus , think it would be ok to remove it from cache, but old links to module would always stay out of date

@dead-claudia

I have a few related ideas to throw at the wall to fill some of the voids:

  1. Expose an import.loader object which contains the current loader (including the registry and various hooks).
  2. Expose an import.context object which contains the current file's context (e.g. import.context.file = module's filename).
  3. Extend @domenic's import(module) proposal to include import(module, loader = import.loader, referrer = import.context.file), to support custom loaders.
  4. Expose import.resolve and import.load with the same arguments as above to fill the gaps of loader.resolve and loader.load.

Each of these would be syntax as well, matching @domenic's import(module) proposal, but would serve as extensions to the ES spec, not be incorporated to the spec itself (except for specifically import.resolve(module), which is also rather necessary for dynamic loading).


Oh, and wrapping System.loader[Reflect.Loader.translate] is how you would register extensions with the current WHATWG spec. It's just a lot more indirect, and you can't remove or change it afterwards. You could create a wrapper to add require.extensions-like support, though.

@bmeck
Author

bmeck commented Sep 12, 2016

@isiahmeadows indeed, I am trying to work out the loader hooks / context stuff right now so I can discuss things at TC39 outside of the agenda.

  1. / 2. probably going to be solved by a special module specifier (js:context? [feel free to bikeshed]). Ensures linking safety (can't import something that isn't in the context) and isn't a meta-property. import.loader may be something some environments don't want to expose.
  3. Seems sanish to me (will think on this), need a way for the loader to get the current script etc. and that is one way. url instead of file though.
  4. Neutral to this, but feel like it might be better suited to staying just on a Loader object.
