@WebReflection
Last active October 20, 2021 14:54
Solving the "ESM in NodeJS" Odyssey.

After months of discussions in a dedicated group, it's clear to me NodeJS is still stuck trying to find a way to deliver native ESM to its users.

The "usual few" won't hear of anything other than .mjs, but .mjs has been shown not to be a solution either.

Here are a few cases not covered by .mjs:

  • evaluation of any string, via CLI, or on demand, where there is no extension
  • tools that convert their syntax into JS, since it always worked to date (thanks to transpilers, bundlers, and loaders)
  • executable files, with or without #!/usr/bin/env node, inevitably shipped without .mjs extension
  • browser interoperability, since browsers couldn't care less about extensions because they use a known parse goal

What is a parse goal

Using the only environment that has already shipped ESM as a reference, there are differences between a <script> tag and a <script type=module> one.

The former is what we've been using for the last 20 years: it brings JS with all its history, caveats, sloppy mode by default, and its "just run (maybe, please)" ability.

The type=module case, though, will execute the content implicitly as "use strict", and will make static import and export available as the native module system, as defined by the ECMAScript standard.

On browsers, it's never ambiguous

With statically analyzable syntax everything is fine.

However, since ECMAScript is moving forward, and almost every new project is being developed through ESM, a mechanism to bring new modules into legacy code as a migration pattern was needed, and this is where dynamic import('...') shines.

Even inside a regular script, import('./file.js') will load ./file.js as ESM, implicitly switching the parsing goal of that file to ESM/module.

<script>
// import './file.js'; // would throw right away

import('./file.js').then(esm => {
  console.log('default export', esm.default);
  console.log('bindings', Object.keys(esm));
});
</script>

Browsers VS .mjs

A browser doesn't care about any extension, including .mjs, unless it's running without a server. Even then, if type=module has been explicitly set, loading .js as a module would be just fine.

Bear in mind that loading .mjs without a parsing goal known upfront won't produce the expected result, so .mjs, like pretty much any other extension, won't make any difference.

Mime VS .mjs

The MIME type in browsers, provided as a response header, works like parsing a few bytes at the beginning of a file: it tells what kind of file it is regardless of the file extension.

If you rename a .pdf file to .txt and double click it, you'll still most likely view the PDF, even if the icon in the UI shows text instead of PDF.

The takeaway from this paragraph is that the MIME type is superior to the extension and, given the same MIME type, as is the case for browsers, an explicit parsing goal is also superior to the MIME type (i.e. the same MIME type text/javascript, served as type=module, will be parsed and executed as ESM, not as regular JS).
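The precedence described above can be sketched as a tiny function. This is only an illustration: resolveParseGoal and its option names are hypothetical, and the handling of MIME types is deliberately simplified to text/javascript.

```javascript
// Hypothetical helper: none of these names come from a browser or NodeJS API.
// Precedence sketched from the text: explicit goal > MIME type > extension.
function resolveParseGoal({ explicitType, mimeType, extension }) {
  // 1. An explicit goal (e.g. <script type=module>) always wins.
  if (explicitType === 'module') return 'module';
  if (explicitType === 'script') return 'script';

  // 2. Without an explicit goal, the MIME type says what the file is,
  //    but text/javascript alone never selects the module goal.
  //    (Simplification: only text/javascript is treated as JS here.)
  if (mimeType) {
    return mimeType === 'text/javascript' ? 'script' : 'not-javascript';
  }

  // 3. The extension is only a last-resort hint (assumption: .js and .mjs
  //    are both treated as regular JavaScript).
  return extension === '.js' || extension === '.mjs' ? 'script' : 'unknown';
}

// Same MIME type, same extension: only the explicit goal changes the result.
console.log(resolveParseGoal({ explicitType: 'module', mimeType: 'text/javascript', extension: '.js' }));
console.log(resolveParseGoal({ mimeType: 'text/javascript', extension: '.mjs' }));
```

The first call reports the module goal, while the second one falls back to a regular script even though the file is named .mjs.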

The NodeJS Caveat

In NodeJS, the equivalent of the good old 20yo <script> code on the browser is called CommonJS.

Not only is CommonJS not defined by ECMAScript, it also comes with its own MIME type, application/node.

Accordingly, any .js file is historically considered a CommonJS one, with the well known, battle tested, require(...) module system that has served NodeJS well until now (and forever, in case ESM won't ship ...).

There are other files NodeJS can require(...), such as .json or .node, and there could be others in the future, but now that the same .js file could contain a different module system, AKA a different parsing goal, we need a solution to disambiguate what that .js file is.

The Solution Is From The Standards

If you can connect the previous dots, you'll agree with me the solution is already there and it works: it's --type=module or, like SpiderMonkey and JSC did already, the --module or -m flag.

That's it. The 2 years wasted not shipping ESM in NodeJS are solved: whenever the author of a file wants to run .js as modern ESM syntax, they use the -m flag 🎉

Well ... it's not so easy 😭

The NodeJS Issue

The biggest strength of the NodeJS community is the registry of modules available, and freely published, by every sort of contributor around the world: it's called npm and it's awesome 😎

Somebody thinks it's something to regret though, but npm is there, with its package.json, and it does the job pretty damn well.

Since standards didn't want to depend on third-party APIs, ESM is incapable of understanding package.json or any CommonJS mechanism such as module.exports = ..., and this is where the CHAOS began.

Interoperability ?

Those "usual few" developers who are pushing for .mjs, despite all the evidence that it's not a solution, are apparently the only ones convinced that ESM in NodeJS should be able to import CJS and vice-versa, going one more time against standards instead of converging on a unified module system that works everywhere.

Once you decide you want to import stuff from 'cjs-module', or be able to import('esm-or-cjs-module'), you become the author of a system full of ambiguity, one that affects every single import operation, the most common NodeJS feature.

It gets worse ...

In CommonJS you don't need an extension to import a file, or an entire folder, because of the path resolution provided by npm with, or without, a package.json.

Not only is this Ryan Dahl's biggest regret, but .mjs supporters would ironically allow you to import {x} from "./x", so that the extension can be omitted and nobody would know whether that is executed as x.js, x.mjs, or even x/index.js in case it's a folder. On top of that, the resolution algorithm would add an extra step of complexity to something already regrettable.
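To make the ambiguity concrete, here is a minimal sketch that lists what an extensionless specifier could end up meaning. The function name candidatesFor is made up, and the list only mirrors the three possibilities named above; it is not NodeJS' actual resolution algorithm, nor its lookup order.

```javascript
// Illustration only: the candidates mirror the possibilities named in the
// text; this is not NodeJS' real resolution algorithm or its lookup order.
function candidatesFor(specifier) {
  // A specifier that already carries an extension is unambiguous.
  if (/\.[a-z]+$/i.test(specifier)) return [specifier];
  // Otherwise it could be a CJS-era file, an .mjs file, or a folder.
  return [`${specifier}.js`, `${specifier}.mjs`, `${specifier}/index.js`];
}

console.log(candidatesFor('./x'));    // three possible files
console.log(candidatesFor('./x.js')); // exactly one file
```

With an explicit extension there is a single file to load, so the only thing left to decide is the parsing goal.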

Bear in mind that partial path qualifiers are allowed on Web servers too, but only if the mime type is text/javascript and it will always be loaded as ESM, no matter the content.

The Community Wants .js

The current status of the whole npm registry is that 96.4% (using .js + .mjs as U) of the packages published to explicitly target ESM via the module field in their package.json point at a .js file.

[Image: screen shot 2018-07-19 at 21 05 07]

These files are natively usable via static or dynamic web servers, with standalone browsers, and are compatible with pretty much every bundler out there that understands ESM.

Accordingly, since everyone writing and publishing ESM for the last two years uses ESM only and is backed, eventually, by bundlers when it comes to importing CommonJS, how could NodeJS easily please developers and users, and fully drop any possible ambiguity?

Dropping Interoperability !

Since nowadays transpilers, bundlers, and loaders, already cover pretty much all the possible shenanigans, and since these will still be used to migrate or publish production code, whoever is using tools will be just fine, while whoever is writing real ESM won't ever feel an itch.

The summary of my proposal has been described by Rob Palmer as follows:

drop the requirement for import interop, for all specifiers except bare specifiers, which achieve interop via a package.json driven preference for an explicit ESM entrypoint (the module property)

What it means, in a verbose description, is the following:

  • the disambiguation can be provided either by --module flag, -m, or even in a browsish --type=module fashion.
  • once the parsing goal is either provided or understood:
    • what is ESM stays ESM
      • ESM can import.meta.require(...) any CJS module (least surprise)
      • ESM static and dynamic import will always treat both .mjs and .js files as ESM (least surprise)
    • what is CJS stays CJS
      • there is no static import or any import.meta in CJS
      • you can require(...) any CJS module (least surprise)
      • you can dynamically import(...) any .mjs or .js file as ESM
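The rules in this list fit in a small lookup table. The sketch below is only a model of the proposal: import.meta.require is part of the proposal, not an existing NodeJS API, and every name in the code is hypothetical. Combinations the list does not mention are treated as unavailable, which is an assumption.

```javascript
// Model of the proposed rules: "<importer goal>:<mechanism>" => goal that is
// applied to the loaded file. Anything not listed is treated as unavailable.
const RULES = new Map([
  ['esm:static import', 'esm'],
  ['esm:dynamic import', 'esm'],
  ['esm:import.meta.require', 'cjs'], // proposed API, does not exist today
  ['cjs:require', 'cjs'],
  ['cjs:dynamic import', 'esm'],
]);

// Returns the parse goal of the loaded file, or throws when the mechanism
// does not exist for the importing goal.
function goalOfLoadedFile(importerGoal, mechanism) {
  const goal = RULES.get(`${importerGoal}:${mechanism}`);
  if (goal === undefined) {
    throw new Error(`"${mechanism}" is not available in ${importerGoal}`);
  }
  return goal;
}

console.log(goalOfLoadedFile('esm', 'import.meta.require')); // cjs
console.log(goalOfLoadedFile('cjs', 'dynamic import'));      // esm
```

The loaded file's extension never appears as an input: the goal depends only on who is loading and how.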

Rules to understand .json or even .wasm are irrelevant, since these files don't have any ambiguity and are left untouched. All bare imports, as in import mod from 'module', where the identifier is not a path, can still be resolved the good old way, where the package.json can provide the module file for ESM and the main file for CJS.

How Would Authors Define Their Parse Goal

I've just explained that. As module author, you publish your package.json with a module field that points at the ESM entry point. If that's not available, whoever needs your module will need to use import.meta.require and call it a day: happy migration!
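As a rough sketch of that flow, assuming the package.json has already been read and parsed into an object (entryFor is a made-up name, and the fallback to index.js is the usual CommonJS default when main is missing):

```javascript
// Hypothetical helper; `pkg` is an already-parsed package.json object.
function entryFor(pkg, consumerGoal) {
  if (consumerGoal === 'esm') {
    // ESM consumers need an explicit ESM entry point.
    if (typeof pkg.module === 'string') return pkg.module;
    // No "module" field: under the proposal the consumer falls back to
    // import.meta.require (proposed API, not available in NodeJS today).
    throw new Error(`${pkg.name} has no "module" field, use import.meta.require`);
  }
  // CJS consumers keep using "main", defaulting to index.js when absent.
  return typeof pkg.main === 'string' ? pkg.main : 'index.js';
}

const dual = { name: 'dual', main: 'cjs/index.js', module: 'esm/index.js' };
console.log(entryFor(dual, 'esm')); // esm/index.js
console.log(entryFor(dual, 'cjs')); // cjs/index.js
```

A package that never gets updated simply has no module field, so nothing changes for its CJS consumers.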

... but ...but ... the .mjs is ...

Unnecessary, irrelevant, overly complicated, and unwelcome to the majority of the community.

Every single use case can be covered by the proposed flow, as long as developers understand that CJS belongs to CJS, as ESM belongs to ESM. You can load both kinds of modules within each other, as long as the module author allowed you to either require in CJS, or import in ESM, the needed module.

If you write your own code instead, and you have bare specifiers, I am sure you know whether the file you are writing is ESM or CJS, so that there is never ambiguity even in that case.

A Migration For Everyone

Not only does dropping interoperability simplify everything, but authors who want to explicitly publish their modules as ESM would finally have an option, keeping around, or not, the old main field for all current CJS consumers.

Authors who won't maintain or update their packages won't need to do anything, because consumers of their modules will (be forced to) use import.meta.require.

In summary, everything could be beautiful, nothing would break, and NodeJS ESM could land tomorrow fully capable of dual modules.

A Stretch Goal

The only remaining ambiguity to solve, at this point, is entirely in the entry file, i.e. node index.js or #!/usr/bin/env node in an executable. Since an entry point inevitably either requires or imports modules or, in case it's a standalone file, simply executes, it is possible to apply the double-parsing technique, used to understand whether a file is ESM or CJS, only to such an entry point.

The default behavior could still be CJS, but as soon as the static import keyword is encountered, and it's not the dynamic one, the parsing goal can be changed to ESM.
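A very rough sketch of that entry point check follows. It is a naive, regular expression based heuristic and not the double-parsing technique itself: a real implementation would rely on the engine's parser. The name guessEntryGoal is hypothetical, and the sketch follows the text in looking only at static import.

```javascript
// Naive heuristic, not a real parser: block comments are stripped with a
// regular expression, so strings or template literals containing
// import-like text can still fool it.
function guessEntryGoal(source) {
  const code = source.replace(/\/\*[\s\S]*?\*\//g, '');
  // Static form: `import` at the start of a line, followed by a binding,
  // `{`, `*`, or a string. `import(...)` and `import.meta` do not match.
  const staticImport = /^\s*import(?:\s+[\w$]+|\s*[{*'"])/m;
  return staticImport.test(code) ? 'esm' : 'cjs';
}

const esmEntry = "#!/usr/bin/env node\nimport fs from 'fs';\n";
const cjsEntry = "const fs = require('fs');\nimport('./file.js').then(() => {});\n";

console.log(guessEntryGoal(esmEntry)); // esm
console.log(guessEntryGoal(cjsEntry)); // cjs
```

The default stays CJS, so every existing entry file that never uses static import keeps running exactly as before.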

In this way, every issue related to how a NodeJS program should be executed would be solved.

And that, my dear reader, would be the ideal outcome of this ESM in NodeJS Odyssey: bumpy, but finally perfect.


"Madness ... This Is Sparta!"

@zenparsing

@WebReflection Regarding this one:

if there is a main field, and it points at file.mjs, then it's an ESM only module and it will throw via both require('module') and import.meta.require('module')

it seems simpler to leave out the "main+.mjs" rule since ESM devs can simply use "module" instead.

I think the main pushback has to do with deep package imports (e.g. import 'package/x'). Under this solution, how would I provide an alternate entry point into my package?

@WebReflection
Author

WebReflection commented Jul 20, 2018

@zenparsing

it seems simpler to leave out the "main+.mjs" rule since ESM devs can simply use "module" instead.

agreed (and edited).

I think the main pushback has to do with deep package imports (e.g. import 'package/x').
Under this solution, how would I provide an alternate entry point into my package?

That's a module, and as such it needs to resolve package as ESM first, since you are using import which is for ESM only, and then resolve the rest through path.join(module-root, '...rest/of/the/path.m?js').
If that was a require('package/x') nothing would change.


Edit: please note that import 'package/x' in browsers would need a resolution through the mapped namespace proposal, so until that has shipped, I think my proposal would prefer import 'package/x.js', or better, an explicit subpath after the package entry.

That should also work with browsers once the mapped resolution lands.

@demurgos

demurgos commented Jul 20, 2018

path.join(path.dirname(package.module), '...rest/of/the/path.m?js')

It'd be better if deep specifiers were always resolved from the root of the package (directory containing package.json). CommonJS always uses the root of the package, even if main is deep in a directory. Changing it for ESM is surprising (even if the proposed behavior is superior IMO). This leaves having files with different paths/basenames or using the extension.

@WebReflection
Author

WebReflection commented Jul 20, 2018

@demurgos

It'd be better if deep specifiers were always resolved from the root of the package

absolutely, I've oversimplified the path.dirname(...) operation.

Changing it for ESM is surprising

no intent to change anything, it'd be path.join(module-root, '...rest/of/the/path.m?js') or whatever CommonJS is doing now to retrieve the absolute path of the package

This leaves having files with different paths/basenames or using the extension.

not my intention, apologies, it was just a quick reply; you got right what I actually meant.

@WebReflection
Author

WebReflection commented Jul 20, 2018

@demurgos sorry I've edited a few times.

TL;DR the resolution is through the module root.
Whatever is already there would work fine and be the least surprise (nothing new to learn).


Dual Module Example

cjs/index.js
cjs/util/x.js
esm/index.js
esm/util/x.js
package.json

The package.json has {main: "cjs/index.js", module: "esm/index.js"} so that:

// import the default
import pkg from 'package';

// import subpaths
import x from 'package/esm/util/x.js';

The CJS counterpart would be require('package') or require('package/cjs/util/x').

That would guarantee a consistent structure. Although, for dual modules, I kinda wish the package resolution had been relative to the folder of the entry point 😅

(but I guess users will end up just having ./c/ and ./e/ folders)
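Resolution through the package root, as in the dual module layout above, can be sketched like this. The helper names and the /app/node_modules/package root are made up for the example, and paths are joined with a plain "/" to keep the snippet free of imports.

```javascript
// Splits a bare specifier into the package name and the rest of the path.
function splitSpecifier(specifier) {
  const parts = specifier.split('/');
  // Scoped names such as @scope/name span two segments.
  const nameLength = specifier.startsWith('@') ? 2 : 1;
  return {
    name: parts.slice(0, nameLength).join('/'),
    subpath: parts.slice(nameLength).join('/'),
  };
}

// Hypothetical helper: subpaths are always resolved from the package root
// (the folder containing package.json), never from the entry point's folder.
function resolveFromRoot(packageRoot, pkg, specifier, consumerGoal) {
  const { subpath } = splitSpecifier(specifier);
  // No subpath: use the entry point that matches the consumer's goal.
  const relative = subpath || (consumerGoal === 'esm' ? pkg.module : pkg.main);
  return `${packageRoot}/${relative}`;
}

const pkg = { main: 'cjs/index.js', module: 'esm/index.js' };
const root = '/app/node_modules/package'; // made-up location

console.log(resolveFromRoot(root, pkg, 'package', 'esm'));
console.log(resolveFromRoot(root, pkg, 'package/esm/util/x.js', 'esm'));
console.log(resolveFromRoot(root, pkg, 'package/cjs/util/x', 'cjs'));
```

The last call keeps the extensionless path as it is, since completing it is what CommonJS already does today.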

@michael-ciniawsky

michael-ciniawsky commented Jul 20, 2018

Deep package imports would basically work the same way as importing a local file, the only difference is that a local import starts with './ || ../' etc while a deep package import starts with a bare specifier to trigger the node_modules resolution algorithm

import pkg from 'pkg/path/to/file.ext'
import file from './path/to/file.ext'

pkg.module could support being an {Array} to specify deep imports e.g

{
  "module": [
    "path/to/entry.ext", // Main (Entrypoint)
    { "name": "path/to/lib/module.ext" }, // Deep Import
    { "name2": "path/to/lib/module2.ext" }
  ]
}

import pkg from 'pkg' // => path/to/entry.ext

import module from 'pkg/name' // => path/to/lib/module.ext
import module2 from 'pkg/name2' // => path/to/lib/module2.ext

import module3 from 'pkg/lib/module3.ext' // => no pkg.module lookup

  • Explicit
  • Nearer to how package-name-map works
  • No export * from './lib/module.js' needed in the entry for e.g an 'extra' helper
  • Smaller Bundles (Size, Treeshaking, etc)

Still not fully convinced if it would be needed though... Also possible breakage with current pkg.module fields, but unlikely since using a {String} should continue to work for the 'simple case' (e.g a small package) { "module": "path/to/entry.ext" }
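For illustration, a lookup over that array form could look like the sketch below. The array shape of "module" is only the idea floated in this comment, nothing npm or NodeJS understands, and resolveModuleField is a made-up name.

```javascript
// Sketch of the suggested array form of "module". `subpath` is whatever
// follows the package name in the specifier ('' for the package itself).
function resolveModuleField(moduleField, subpath) {
  // Plain string: the 'simple case', only the entry point is mapped.
  if (typeof moduleField === 'string') {
    return subpath ? subpath : moduleField;
  }
  const [entry, ...named] = moduleField;
  if (!subpath) return entry;
  // Look for an alias such as { name: "path/to/lib/module.ext" }.
  for (const mapping of named) {
    if (Object.prototype.hasOwnProperty.call(mapping, subpath)) {
      return mapping[subpath];
    }
  }
  // No alias matched: no pkg.module lookup, the subpath is a plain file path.
  return subpath;
}

const moduleField = [
  'path/to/entry.ext',
  { name: 'path/to/lib/module.ext' },
  { name2: 'path/to/lib/module2.ext' },
];

console.log(resolveModuleField(moduleField, ''));                // path/to/entry.ext
console.log(resolveModuleField(moduleField, 'name'));            // path/to/lib/module.ext
console.log(resolveModuleField(moduleField, 'lib/module3.ext')); // lib/module3.ext
```

A string value goes through the first branch, so current module fields would keep working unchanged.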

@WebReflection
Author

That's great, I just think that maybe, instead of going nearer

Nearer to how package-name-map works

we can wait for standards to better define that and adopt the solution to avoid further conflicts or unnecessary churn

@michael-ciniawsky

Yep

@frank-dspeed

I think the solution is simpler: use two names, esm-moduleName and moduleName. With esm-moduleName we simply get a package with an index.js or an index.mjs, or both: an index.js with an ESM loader and an index.mjs without it.

With moduleName we get a CJS package with package.json and all that stuff.
