@bmeck
Last active February 6, 2018 17:20
Removal of path searching / defining a hook for migration.

Problem

There has been no progress in working towards a single cohesive story for path resolution between Servers and Web. Notable discussion points relevant to this are:

  1. Node has a path searching algorithm.
  2. Web has not been able to gather support for any of the following:
    1. Build tooling as part of UX expectations (lack of interest)
    2. Smarter static web servers (lack of interest). PoC example at https://github.com/bmeck/esm-http-server
    3. A resolve based hook. (interest shown with desire for ~6 months of userland experimentation)
      1. This will be assumed to exist given whatwg/html#2640, URL.createObjectURL, and Service Workers.
      2. Rudimentary PoC (without actual integration) at https://github.com/bmeck/browser-hooking without a ServiceWorker. SW example at https://github.com/bmeck/node-sw-compat-loader-test.
  3. Node now has hooks for import
    1. Node has existing requirements for per-package hooks.
    2. Per-package hooks can be used to give a migration process towards people using features that the web is deficient in.

This proposal would seek to remove searching for index files and file extensions.

This proposal would seek to remove searching for package.json#main when importing resolves to a directory.

This proposal seeks to define a loader hook that can be encouraged for per-package use and that adds back the behaviors this proposal removes.

This proposal does not seek to remove .mjs from being the canonical authoring format for ESM.

Per-package loader hooks

The underpinning assumption of this proposal is strong support for per-package loader hooks. This is a definition of the capabilities and a bikeshed for how to achieve them.

Scope of hooks

Hooks must be confined to a well-defined subsection of the URL space (fs) used by import.

This proposal will define the boundaries of subsections to be:

  • A directory containing a package.json defines a boundary; a hook's scope terminates when resolution crosses out of that directory.

Given the fs of:

/path-searching-hook
/foo
  /package.json
  /bar/example.mjs
/a
  /package.json
  /a.mjs
// /foo/bar/example.mjs
import '../' // does not cross boundary by resolving to `/foo`
import '../..' // does cross boundary by resolving outside of `/foo` to `/`
// /a/a.mjs
import '../foo' // does cross boundary by resolving out of `/a` to `/foo`

Consumer and Author negotiation

  • It must be possible as a consumer to affect the path resolved within another package's scope.
  • It must be possible as an author to affect the path resolved within the author's package scope.

In order to avoid recursive boundary crossing in one step, all paths will be resolved in two phases.

  1. External resolution that is resolved by consumers from a different package scope.
  2. Self resolution that is resolved by the package scope containing the resolved path.
// /a/a.mjs
import('/foo');

// 1. fires /a 's package scope loader hooks, seeing `/a/a.mjs` as source and `/foo` as specifier
// let's assume it resolves to /foo
// 2. fires /foo 's package scope loader hooks, seeing `/foo` as source and `./` as specifier
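
A minimal sketch of how a host might drive these two phases follows; packageScopeOf() and hooksFor() are hypothetical helpers used only for illustration, and the resolve() signature is simplified compared to anything that might actually ship.

// Hypothetical driver for the two-phase resolution described above.
// packageScopeOf() and hooksFor() are assumed helpers, not real Node APIs.
async function resolveImport(source, specifier) {
  // Phase 1: the importing package's scope resolves the specifier.
  const consumerScope = packageScopeOf(source);
  const external = await hooksFor(consumerScope).resolve(specifier, source);

  // Phase 2: the package scope containing the result resolves within itself,
  // seeing the externally resolved URL as the source and './' as the specifier.
  const targetScope = packageScopeOf(external);
  if (targetScope === consumerScope) return external;
  return hooksFor(targetScope).resolve('./', external);
}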

Declaration of hooks

Per package loader hooks can be declared in a package.json file as a specifier to find using the globally defined resolution algorithm. Global hooks may affect this resolution, but package hooks may not. This allows code coverage, instrumentation, etc. to access package hooks.

{
  "name": "foo",
  "hooks": "../path-searching-hooks"
}

This also allows the hooks to exist outside of package boundaries. This file, when loaded as a loader, will be in a separate Module Map space from userland and only has access to the globally defined resolution algorithm.

Types of hooks

  • Only a resolve hook. Use URL.createObjectURL or alternatives like Service Workers if you need to modify source.
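
As an illustration, a per-package resolve hook that re-adds the extension and index searching this proposal removes might look roughly like the sketch below. The constructor/resolve shape mirrors the composition example later in this document; the fileExists() helper and the candidate list are assumptions for illustration.

// Hypothetical per-package path-searching hook. fileExists() is defined
// inline for illustration; the hook shape mirrors the LogImports example
// under "Composition" below.
const { existsSync } = require('fs');
const { fileURLToPath } = require('url');

const fileExists = (url) => existsSync(fileURLToPath(url));

module.exports = class PathSearching {
  constructor(parent) {
    this.parent = parent;
  }
  async resolve(url) {
    const resolved = await this.parent.resolve(url);
    if (fileExists(resolved)) return resolved;
    // Re-add the searching behavior removed from core, within this package only.
    for (const candidate of [`${resolved}.mjs`, `${resolved}/index.mjs`]) {
      if (fileExists(candidate)) return candidate;
    }
    return resolved;
  }
};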

On the nature of static resolution

ESM is able to link statically and there should be a path to allow static / ahead of time usage of per package hooks.

By only having a single resolve hook, paths can be rewritten and observed to do in-source replacement.

This is problematic, however, since URL.createObjectURL lives in memory. Usage of such APIs on platforms without a writable fs, like Heroku, should have a path forward for these hooks.

I recommend a combination of V8's SnapshotCreator when possible, and a flag to allow rewriting URL.createObjectURL reservations to a location on disk.

Problem, multiple boundary crossing

/root
  /package.json
  /entry
    /package.json
  /dep
    /package.json

If entry were to import('../dep'), it would be handled in the typical entry-hooks-then-dep-hooks manner. This does not give root a chance to intercept the imports.

This is seen as a suitable limitation, since root is presumed to have ownership of entry's and dep's source code because they exist within its directory. Edit the entry and dep packages as needed in order to achieve hooking that goes through root's use cases.

Composition

Hooks should have a means by which to achieve composition. This is needed for cases of multiple transformations. A package might seek to call a super of sorts to get the result of a parent loader, and it may seek to do the exact opposite as a guard to ensure expected behavior.

Loaders therefore need to have a concept of a parent loader hooks to defer to, or to ignore.

Changing hook allocation to be done using new and providing the parent as a parameter is sufficient for this:

#! node --loader
module.exports = class LogImports {
  constructor(parent) {
    this.parent = parent;
  }
  async resolve(url) {
    debugger;
    const ret = await this.parent.resolve(url);
    console.log(url, 'became', ret);
    return ret;
  }
}

Example use cases for composition

  • Code Coverage
  • Instrumentation such as APM
  • Mocks/Spies in testing frameworks
  • Logging/Debugging
  • Compilation
  • Linting
  • Isolation (such as with code signing)

Isolation

Hooks that are composed are still isolated by per-package boundaries. Nested packages will not fire the parent loader hooks unless they cross into a package boundary with those hooks.

Passing arbitrary data between instances can be problematic for both isolation and threading. Therefore the only data passed between instances of loaders will be transferables or primitives.

The parent passed to the constructor of a loader will be a limited facade that only exposes whitelisted properties and calls the relevant method on the true parent instance. It will ensure errors are thrown if given an improper number of arguments and/or non-transferable data.
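
A rough sketch of such a facade, assuming resolve is the only whitelisted method and that, for brevity, only string primitives are accepted:

// Hypothetical facade wrapped around the true parent loader instance.
// Only whitelisted methods are exposed, and arguments are validated
// before the call is forwarded.
function makeParentFacade(trueParent) {
  return Object.freeze({
    async resolve(...args) {
      if (args.length !== 1) {
        throw new TypeError('resolve() expects exactly one argument');
      }
      const [url] = args;
      if (typeof url !== 'string') {
        // A fuller implementation would also allow transferables.
        throw new TypeError('resolve() only accepts primitives/transferables');
      }
      return trueParent.resolve(url);
    }
  });
}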

Per-package composition

This can be achieved by manually constructing the chain inside the per-package hook code.
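
For example (the loader class names and file paths here are purely illustrative):

// Hypothetical per-package hooks file composing two loaders manually;
// the parent provided by the host becomes the end of the chain.
const PathSearching = require('./path-searching');
const LogImports = require('./log-imports');

module.exports = class ComposedHooks {
  constructor(parent) {
    this.chain = new LogImports(new PathSearching(parent));
  }
  resolve(url) {
    return this.chain.resolve(url);
  }
};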

Global composition

This can be achieved by providing multiple --loader flags. This allows for better debugging when development loaders need to be added.

npm start
# => node hasErrors.mjs
# aborts
export NODE_OPTIONS='--loader DebugImports'
npm start
# will log imports if HasErrors defers to the parent loader

Ignoring parents

In certain scenarios a package may need to ignore the parent loader. In those situations the hooks will be unable to defer to the default global behavior of the process, which may provide debugging behavior such as logging/code coverage/linting/etc.

For now, escape hatches in this design space are punted to userland, but it is recommended that when using NODE_ENV=development or NODE_ENV=test all loaders defer to the parent loader.

Code signing invariant implications

Mutating the code loaded in a code signed bundle is problematic. Integrity checks of unexpectedly mutated imports should fail. This area needs more research.

Future research

Given the problems of ignoring parents and of code signing being unable to easily defer to parent loaders, more design needs to be done around development workflows. Inspector tooling is the recommended approach. This may mean adding special hooks to inject loader hooks during development via a flag such as --inspector-loader-hooks=LogImport that fires before per-package hooks but ensures the inspector is running.

@medikoo

medikoo commented Jan 31, 2018

It must be possible as an author to affect the path resolved within the author's package scope

What's the use case for that?

Wouldn't it be better to simply provide a way so that the author of the app (which I take to map to consumer) can provide its own resolver, e.g.:

async function resolveUrl(sourcePath, dependencySpecifier) {
  ...
  return url;
}

That's good enough to solve the use case of node.js resolution. Maybe then, as a next step, node.js resolution can be standardized and provided out of the box by the browser as a default if instructed by the app.
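
For instance, a node_modules-style lookup could be expressed in that shape roughly as follows; this is only a sketch, and the walk-up logic is far simpler than Node's actual algorithm.

// Hypothetical resolveUrl() doing a simplified node_modules-style lookup
// in the shape proposed above.
const path = require('path');
const { existsSync } = require('fs');

async function resolveUrl(sourcePath, dependencySpecifier) {
  // Relative specifiers resolve against the importing file.
  if (dependencySpecifier.startsWith('.')) {
    return 'file://' + path.resolve(path.dirname(sourcePath), dependencySpecifier);
  }
  // Bare specifiers walk up the directory tree looking in node_modules.
  let dir = path.dirname(sourcePath);
  while (true) {
    const candidate = path.join(dir, 'node_modules', dependencySpecifier);
    if (existsSync(candidate)) return 'file://' + candidate;
    const parent = path.dirname(dir);
    if (parent === dir) throw new Error(`Cannot resolve ${dependencySpecifier}`);
    dir = parent;
  }
}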

@bmeck

bmeck commented Jan 31, 2018

@medikoo , there are a variety of reasons, from being able to reroute your imports using environment settings like NODE_ENV to being able to mask your paths so that private modules cannot be imported from outside your package (react commonly has this be a problem for upgrades when people use internals). It also lets you be lazy about things like compilation, so that when someone requests a path inside of your package you can fire up the compiler (TypeScript, JSX, etc.) only when needed rather than ahead of time.

@medikoo

medikoo commented Jan 31, 2018

@bmeck maybe we should list all those use cases, and sort them by priority.

Some of them sound to me like a necessity and some just nice to have. Trying to address everything may clutter the work, make it too complex, and as a result we may end up with nothing.

Wouldn't it be better to provide some simple solutions for core needs (as a door to node-like resolution), and discuss further steps along the way?

@bmeck

bmeck commented Jan 31, 2018

@medikoo, they are designed in concert with a lot of things, not purely node itself, since the future has other use cases that we need to be sure any core needs don't interfere with. We can discuss what is strictly needed vs. nice to have, but that gets subjective quite quickly. I'll try to describe the use cases we need in less concrete detail below:

  1. Ability to implement custom path resolution

This requires the dependent module calling import() or similar to be able to rewrite the path to a dependency. This may affect paths within their own package and may also affect finding paths within other packages. A package may wish to ensure that it works reliably, so that import('pkg/foo') always gets routed using a specific algorithm such as the node_modules algorithm, even if the importer does not have that loader set up. It can use hooks that intercept when a specifier resolves inside itself to have ./foo search for ./foo.mjs, ./foo.js, etc. without assuming the consumer will provide that behavior for them.

  2. Ability to support multiple environments

Similar in nature to the needs of browser fields in package.json, a package should be able to ship multiple modes of support for deployment environment, testing, internationalization, etc. Unlike the "main" field, which does not scale to this, a single solution that supports deep linking should be supported so that things like import('lodash/chunk') work.

This requires that a package be able to intercept any incoming requests and route them to the appropriate distribution of a module. These redirects may use NODE_ENV, feature detection, or similar to route to the appropriate distribution. This does not mandate that the package reroute to inside of itself; for example, import('translations') might feature detect the language and reroute to import('english'). A rough sketch of such a routing hook follows after this list.

  3. Ability to support tooling as both lazy and ahead of time

It is important to support ahead-of-time tooling so that bundlers can accurately generate bundles. It is important to support lazy tooling for development workflows that are not using bundles. The design above is made so that a bundler can accurately confirm what should happen when importing occurs across both directions, by having it be a static and well-known lookup point for the loader rather than being determined by a mutable global at runtime or by the method of consumption.
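
Referring back to the second use case above, a routing hook inside a hypothetical translations package might look roughly like this; the environment check, file layout, and the assumption that resolve returns a URL string are all made up for illustration.

// Hypothetical per-package hook for a 'translations' package that routes
// incoming requests to a language-specific distribution; the LANG check
// and file names are illustrative only.
module.exports = class RouteTranslations {
  constructor(parent) {
    this.parent = parent;
  }
  async resolve(url) {
    const resolved = await this.parent.resolve(url);
    const lang = (process.env.LANG || 'en').slice(0, 2);
    // Reroute the package entry point to a per-language module.
    if (resolved.endsWith('/translations/index.mjs')) {
      return resolved.replace(/index\.mjs$/, `${lang}.mjs`);
    }
    return resolved;
  }
};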

@medikoo

medikoo commented Feb 2, 2018

A package may wish to ensure that it works reliably, so that import('pkg/foo') always gets routed using a specific algorithm such as the node_modules algorithm

Why can't it stay as a silent agreement/contract, as it currently is in the node ecosystem? I wouldn't treat this functionality as a must-have, as no module system in JS (and maybe in the world) has had such a thing implemented. I'm fairly certain the community can go without it.

Similar in nature to the needs of browser fields in package.json , a package should be able to ship multiple modes of support for deployment environment

That use case is indeed very valid. Still, I think it can be solved by standardization of some approach, and expecting that the consumer relies on that standard (same as with the node_modules resolution algorithm).

Opening the door so every package may have a different idea about its internal routing will make things way less transparent and maintainable (another contribution to JavaScript fatigue).
Developers like to inspect how their dependencies work, and with that, a developer seeing an import path will never be certain what file it really imports. I wouldn't like to live in such a world, would you? :)

Ability to support tooling as both lazy and ahead of time

I have problems understanding why that requires packages to be able to reroute their own paths. I don't think it's relevant to tooling. Actually, having such a possibility on the table will make tooling even more complex (a set of rules needs to be established for each package).

@bmeck

bmeck commented Feb 2, 2018

Why can't it stay as a silent agreement/contract, as it currently is in the node ecosystem? I wouldn't treat this functionality as a must-have, as no module system in JS (and maybe in the world) has had such a thing implemented. I'm fairly certain the community can go without it.

There is a divergence between the browser and node already. Some packages, such as those created by polymer, are actually hard-coding paths due to this divergence, and it is the reason that NPM has sought to introduce npm assets, diverging the resolution algorithms between node_modules/ and assets/. Read up on this in NPM's repo.

That use case is indeed very valid. Still, I think it can be solved by standardization of some approach, and expecting that the consumer relies on that standard (same as with the node_modules resolution algorithm).

If a specific standard arises that is not limited in the same way as browser main fields, we could certainly standardize.

Developers like to inspect how their dependencies work, and with that, a developer seeing an import path will never be certain what file it really imports. I wouldn't like to live in such a world, would you? :)

I'd be fine in such a world. Often things are compiled using webpack or babel and I have to undo that logic already. This would at least let me keep import('component/lib/foo') as the same specifier between source and dist. You could also run the resolution algorithm statically without spinning up a full application (something like node-resolve component/lib/foo might be useful for quick CLI checks?)

I have problems understanding why that requires packages to be able to reroute their own paths. I don't think it's relevant to tooling. Actually, having such a possibility on the table will make tooling even more complex (a set of rules needs to be established for each package).

This problem is not about rewriting specifiers; this problem is being able to generate the new source text for a module. Webpack etc. could generate text: or blob: URLs through rewriting to generate new ESM lazily, or it could generate new file: URLs if it is trying to cache the ESM to disk or doing it ahead of time. In all of these situations it is the same aspect as above, where you keep the import specifier the same across all these cases.

@medikoo

medikoo commented Feb 2, 2018

Some packages, such as those created by polymer, are actually hard-coding paths due to this divergence, and it is the reason that NPM has sought to introduce npm assets

To me it looked more like polymer expects (and asked) npm to pave the path, which they want to follow (whatever it is, assuming it works well for front-end). And it's npm that also came with the assumption that node-like resolution cannot work for front-end.
It's not clear to me why that was never discussed; maybe they didn't envision that custom path resolution in ESM could be possible in the future (?).

Also, I think the assets approach may feel natural for those who work with setups where JavaScript is purely about front-end and front-end tooling, and that's it.
However, this doesn't seem good when we have a full-stack JavaScript application, with back-end logic also written in JavaScript, and where a significant part of the codebase is shared between the front-end and back-end parts.

If a specific standard arises that is not limited in the same way as browser main fields, we could certainly standardize.

It could be about configuring some alternative endpoints for specific paths in a static manner, and at the standard level it can be pretty agnostic.

The way I was envisioning it:
e.g. let's say we have a foo.js module for which we want to define some alternative versions.
Let's assume that this module was imported and, after applying the regular node resolution logic, foo.js is about to be picked.

Then, in case of having the following entry in the directory's package.json:

"redirects": {
    "foo.js": { "fr": "foo.fr.js" }
}

The resolver will pick foo.fr.js instead of foo.js if route: 'fr' was passed with the options to the resolver. Additionally, multiple routes should be supported (e.g. one may want route: ['fr', 'browser']).
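
A rough sketch of how a resolver could apply such a redirects map once regular resolution has picked a file; readPackageJson() is an assumed helper and the option shape simply mirrors the example above.

// Hypothetical post-resolution step applying the "redirects" map;
// readPackageJson() is an assumed helper that loads the nearest package.json.
const path = require('path');

function applyRedirects(resolvedPath, { route = [] } = {}) {
  const dir = path.dirname(resolvedPath);
  const pkg = readPackageJson(dir) || {};
  const entry = (pkg.redirects || {})[path.basename(resolvedPath)];
  if (!entry) return resolvedPath;
  for (const key of route) {
    if (entry[key]) return path.join(dir, entry[key]);
  }
  return resolvedPath;
}

// e.g. applyRedirects('/pkg/foo.js', { route: ['fr', 'browser'] }) // -> '/pkg/foo.fr.js'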

At least that's the direction in which I would go.

This problem is not about rewriting specifiers, this problem is being able to generate the new source text for a module

Ok, so it's about transpilation. Wouldn't it be better to think of path resolution and transpilation as two separate concerns?

Also, I'm not sure if the approach to transpilation should be standardized in any way (as then "how things work" gets blurred; JavaScript fatigue started when every popular stack started to rely on some form of transpilation).

I think the goal should be that the language on its own is good enough that we do not have to transpile, and if necessary, there's WASM.

@bmeck

bmeck commented Feb 2, 2018

However, this doesn't seem good when we have a full-stack JavaScript application, with back-end logic also written in JavaScript, and where a significant part of the codebase is shared between the front-end and back-end parts

Are you saying assets doesn't seem good, or that per package hooks do not seem good?

The resolver will pick foo.fr.js instead of foo.js if route: 'fr' was passed with the options to the resolver. Additionally multiple routes should be supported (e.g. one may want: route: ['fr', 'browser'])

The JS spec would need to add a way to pass that data in. Also, this does not work for runtime-based redirection such as on process.env or navigator.lang.

Ok, so it's about transpilation. Wouldn't it be better to think of path resolution and transpilation as two separate concerns?

They are inherently linked since any source text generation output needs to be assigned a URL. Having these just be redirects seems simpler than trying to specify all the ways in which code generation may occur.

@medikoo

medikoo commented Feb 4, 2018

Are you saying assets doesn't seem good, or that per package hooks do not seem good?

That using two different buckets for front-end and back-end code does not seem good. Both are different programs, but it's one language, and it's best if path resolution of dependencies follows the same rules for both.

Also, this does not work for runtime-based redirection such as on process.env or navigator.lang.

Why? If those tokens need to be resolved out of the environment, then it can simply be done this way:

resolveUrl(sourcePath, dependencySpecifier, { route: [navigator.lang, 'browser'] });

They are inherently linked since any source text generation output needs to be assigned a URL.

Never in our ecosystem was path resolution linked with transpilation. There was never native support for transpilation hooks. Then why do you suddenly state that both are inherently linked? I don't understand (?)

@bmeck

bmeck commented Feb 4, 2018

Why? If those tokens need to be resolved out of the environment, then it can simply be done this way

This would require that import syntax in JS be able to provide that { route: [navigator.lang, 'browser']}.

Then why do you suddenly state that both are inherently linked? I don't understand (?)

It isn't sudden; all forms of compilation generate a source text that is assigned a URL if they want to be usable from ESM. The means by which these URLs have their body populated (disk, text URL, URL.createObjectURL, etc.) isn't important, but this method of being loadable by ESM is always used; any code generation or instrumentation system does this by putting things either in memory or on disk, generally.

@bmeck

bmeck commented Feb 4, 2018

I think it might be easier to not think of resolution as purely a path-searching thing. It can point to URLs that were generated during the resolution process. That feature cannot be removed unless all ways to reserve URLs that could be resolved are removed from the runtime and left only to the host environment. Even if it is left to the host environment, it would be very hard to prevent all ways to do things like dump a file on disk that points to new URLs using something like export * from ....

@medikoo

medikoo commented Feb 5, 2018

This would require that import syntax in JS be able to provide that { route: [navigator.lang, 'browser']}

In my understanding we were talking about a custom path resolver that plays a role in resolving the paths put into import (a feature that may allow e.g. node-like path resolution), and it doesn't in any way influence what values we put into import.

e.g. for import _ from './locale.js' a custom resolver may resolve ./locale.en.js (on the basis of the { route: [navigator.lang] } option passed to the path resolver function).

It isn't sudden, all forms of compilation generate a source text that is assigned a URL if they want to be usable from ESM

Ok, and here we're talking purely about URL resolution, not about what's in the source text (whether it implies a transpilation step or not, etc.). They're two independent things. How can the format of the source text addressed by a given URL have an impact on the value of that URL?

I think it might be easier to not think of resolution as purely a path searching thing.

For simplicity I believe that's what we should do. Isn't it just about mapping a string token to a complete URL? What about KISS and YAGNI?

That feature cannot be removed unless all ways to reserve URLs that could be resolved are removed from the runtime and left only to the host environment. Even if it is left to the host environment

Sorry, I have problems understanding that statement. Which feature exactly? Why can't it be removed? Can you provide some example, so I understand better?

@bmeck

bmeck commented Feb 5, 2018

e.g. for import _ from './locale.js' a custom resolver may resolve ./locale.en.js (on the basis of the { route: [navigator.lang] } option passed to the path resolver function).

Correct, which is not available at runtime since ESM resolves prior to any evaluation.

Ok, and here we're talking purely about URL resolution, not about what's in the source text (whether it implies a transpilation step or not, etc.). They're two independent things. How can the format of the source text addressed by a given URL have an impact on the value of that URL?

I don't understand the question.

For simplicity I believe that's what we should do. Isn't it just about mapping a string token to a complete URL? What about KISS and YAGNI?

It isn't purely path searching by its very nature.

Isn't it just about mapping a string token to a complete URL?

No, since it also has to do a variety of other things during resolution, like determining how to form the shape of the target Abstract Module Record, which has to be declared with something out of band like a MIME type or a file extension.

Sorry, I have problems understanding that statement. Which feature exactly? Why can't it be removed? Can you provide some example, so I understand better?

We can't remove URL.createObjectURL, which can generate blob: URLs, from the web standards. Other things that do work on the web, like data: URLs, are explicitly blocked in Node due to this idea of generating ESM records being really weird. In node there is also the goal of ensuring that a path lookup never changes over time due to people writing to the file system while path resolution is in progress. Node also has things like vm.Module landing, which can also generate new records at runtime.

We could try to remove some of these, but you can't remove all of them.
