@creationix
Created June 27, 2012 22:19
ES6 module proposal

Modules Proposal

EDIT: There is a second version of this proposal down in the comments. https://gist.github.com/445adc206d51a038de4a#gistcomment-359858

I really like Isaac's module proposal. Having recently designed a module system for the luvit project (node ported to lua), I've thought a bit about this.

I think we can set a few minimal rules and reduce a ton of edge cases. One in particular is cyclic dependencies. I don't think they are worth the pain and should simply be disallowed.

To help understand things, we should make a distinction between dependencies and peers. In a large application or package there are several files containing code. Some files depend on things defined in other files before they can start. These simple dependencies block startup, so they can't be cyclic or you'll have a deadlock at startup. The other type of dependency, a peer, only wants to reference an API or function from another file at runtime.

The basic syntax I propose is like the one Isaac showed (if I understand it correctly). There are two new keywords, import and export. The import keyword must be followed by a string literal that is the identifier for the file you want to import. The export keyword is like return and yield in that it sends a value. It does not affect control-flow, but it does set the current file's export value.

Consider lib/add.js:

// lib/add.js
function add(a, b) {
  return a + b;
}
export add;

And then we have a peer file that depends on this:

// lib/usesAdd.js
var add = import "./add.js";
var result = add(2, 3);

Now suppose we wanted to write a math package that contains add as well as other functions.

// math.js
export {
  add: import "./lib/add.js",
  multiply: import "./lib/multiply.js",
  // ...
};

Here is where this gets tricky. Suppose that we want to implement multiplication using addition and a loop. So our multiplication library will need to depend on add. It can't depend on math.js, because math.js depends on it. So it has to depend on add directly.

// lib/multiply.js
var add = import "./add.js";
function multiply(a, b) {
  var product = 0;
  for (var i = 0; i < b; i++) {
    product = add(product, a);
  }
  return product;
}
export multiply;

But now we learn later on that add needs to access multiply for some reason. This happens all the time in real applications. How would this be required?

I really want dependencies with import and export to be expressible as a DAG (Directed Acyclic Graph), so that there is a clear order for loading and executing the files. There are two reasons for this. First, it makes a lot of nasty edge cases with current module systems simply go away. Second, JavaScript is single-threaded. Even with the exports.foo = bar syntax in node and CommonJS, the dependencies have to be executed in some serial order. This may surprise people when you're allowed to require some module, but its methods aren't populated yet because you were executed before it was. This would be a leaky abstraction.
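
As a rough illustration of the DAG requirement, here is a minimal sketch in today's JavaScript of how a loader could derive a serial execution order from the import graph and reject cycles. The `graph` object and the `loadOrder` function are hypothetical names made up for this example; they are not part of this proposal or of any existing loader.

```javascript
// Hypothetical import graph: each file maps to the files it imports.
var graph = {
  "math.js": ["lib/add.js", "lib/multiply.js"],
  "lib/multiply.js": ["lib/add.js"],
  "lib/add.js": []
};

// Depth-first traversal that returns files in an order where every
// dependency comes before the file that imports it.
// Throws if the graph contains a cycle.
function loadOrder(graph, entry) {
  var order = [];
  var state = {}; // undefined = unvisited, 1 = in progress, 2 = done

  function visit(file, trail) {
    if (state[file] === 2) return;
    if (state[file] === 1) {
      throw new Error("Cyclic import: " + trail.concat(file).join(" -> "));
    }
    state[file] = 1;
    (graph[file] || []).forEach(function (dep) {
      visit(dep, trail.concat(file));
    });
    state[file] = 2;
    order.push(file);
  }

  visit(entry, []);
  return order;
}

console.log(loadOrder(graph, "math.js"));
// -> [ 'lib/add.js', 'lib/multiply.js', 'math.js' ]
```

If lib/add.js were changed to import lib/multiply.js, the same call would throw instead of returning an order, which is the behavior a "no cycles" rule implies.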

So then how would I solve this dilemma using acyclic dependencies? I would handle it at the application level where I control the logic and know what to expect.

I would rewrite math.js as follows:

// math.js
var math = {};
export math;
math.add = (import "./lib/add.js")(math);
math.multiply = (import "./lib/multiply.js")(math);

Then in each module I would export a function that accepts the math namespace.

// lib/add.js
export function (math) {
  return function (a, b) {
    // ... might use math.multiply
  };
};
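
The same wiring can be tried out in today's JavaScript. The sketch below is only an emulation: plain functions stand in for the values the import expressions would produce, and it uses a hypothetical even/odd pair instead of add/multiply because those two are mutually recursive and still terminate.

```javascript
// Stand-ins for the values that `import "./lib/even.js"` and
// `import "./lib/odd.js"` would produce under this pattern (hypothetical files).
var evenModule = function (math) {
  return function even(n) {
    return n === 0 || math.odd(n - 1); // math.odd is looked up at call time
  };
};
var oddModule = function (math) {
  return function odd(n) {
    return n !== 0 && math.even(n - 1); // math.even is looked up at call time
  };
};

// Equivalent of the rewritten math.js: create the namespace first,
// then hand it to each module factory.
var math = {};
math.even = evenModule(math); // math.odd does not exist yet, and that is fine
math.odd = oddModule(math);

console.log(math.odd(5));  // true
console.log(math.even(5)); // false
```

The peers only touch each other through the namespace when they are called, so the load order stays acyclic even though the runtime call graph is not.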

There are several other techniques that come to mind that I could employ for other use cases. The point is that using the basic primitives that would be provided by the language, I can organize my app however I want. My code would use the same style in the browser or in node. The browser could pull my code on demand as it's needed or a server-side compiler could statically analyze the imports and generate a single js file that the browser then reads. All JS code everywhere would be written using the same simple module syntax, but without imposing huge constraints on the runtime or providing leaky abstractions to applications.

@johnjbarton

Let me just point out some social-interaction reality (despite my complete lack of expertise in that area ;-):

The ES module proposal exists and represents a piece of work by the ES team. They are not likely to throw it over for a gist proposal, no matter how thoughtful. Instead, any new proposal has to make contact with and be compared to the ES proposal.

Similarly, CommonJS and RequireJS went through a very large number of iterations and a great number of efforts at unification. So any new proposal has to explain how it differs from these efforts. The most obvious way would be a reliance on first-class parsing; simply a 'better' API is not adequate. This experience also tells us that any proposal that fails to support asynchronous loading will fail. I think Tim's proposal, for example, addresses these issues, but it should call them out and do the comparison.

(Just to be clear: I'm not pointing at Tim and I am trying to be encouraging: this effort is important!).

@creationix
Author

@johnjbarton I'm aware, thanks. This thread contains my 1/3 baked and 2/3 baked ideas. I'm still bouncing ideas. Only once I know what I want do I go through the trouble of trying to convince others and gain consensus. Get it right first, then worry about being right.

Also, while this is just a gist, it's the result of many years of working with this problem.

I'm willing to make a proper proposal and address everyone's concerns through the proper channels if that's desired. I'm just not ready for that yet.

@creationix
Author

@bmeck. Dynamic require can be done in my second proposal. Here is a sample that loads all modules in a folder and stores their exports in a big object.

var fs = require("fs");
var path = require("path");
var controllerDir = __dirname + "/controllers";
var controllers = {};
fs.readdirSync(controllerDir).forEach(function (filename) {
  var fullPath = path.join(controllerDir, filename);
  var contents = fs.readFileSync(fullPath, "utf8");
  // I assume node's Loader.define will use sync I/O and so doesn't need a callback.
  Loader.define(fullPath, contents);
  controllers[filename] = Loader.require(fullPath);
});

This is fully integrated into the system and any requires within the controllers/* files will work as normal.

@creationix
Author

@dherman Here is an example of cyclic dependencies using my second proposal:

Here is even.js:

var odd = require('./odd.js');
return function even(n) {
  return n == 0 || odd(n - 1);
};

And odd.js:

var even = require('./even.js');
return function odd(n) {
  return n != 0 && even(n - 1);
};

And then I kick off the process by compiling this code:

// main.js
var odd = require('./odd.js');
console.log(odd(5));

  • Before compiling we have a clean state. Loader.preload is {} and so is Loader.loaded.
  • The runtime loads main.js and calls Loader.define("/main.js", ..., callback)
  • Inside define, the text is parsed and a dependency on "./odd.js" is discovered.
  • This compiled half-baked function is stored at Loader.preload["/main.js"]
  • Loader.resolve("/main.js", "./odd.js", callback) is called which outputs "/odd.js".
  • Loader.fetch("/odd.js", callback) gets the text for that module
  • Then Loader.define("/odd.js", ..., callback) is called.
  • Inside the text, it finds the "./even.js" dependency.
  • This half-baked function is stored at Loader.preload["/odd.js"]
  • Then "./even.js" is resolved, fetched, and Loader.define("/even.js", ..., callback) is called.
  • When the code is parsed, the only dependency found is "/odd.js", but we already have an entry for that in the preload table, so we're done.
  • The compiled function is stored in the preload table.
  • The async stack unwinds through chained callbacks. At each level it fully bakes the existing functions so they know the resolved paths.
  • Finally the callback for the initial Loader.define("/main.js", ..., callback) is called.
  • The system then executes the initial code snippet.
  • When require("./odd.js") is encountered, a call to Loader.require("/odd.js") is made since this function is now fully baked.
  • This call will execute that module.
  • While executing that module, the require("./even.js") will be evaluated which causes that module to be executed.
  • "odd.js" never finished running and so Loader.loaded["/odd.js"] is still empty, so ....

Dang, I found a bug in my proposal. This could be solved using a Proxy object that passes through to the real export value once it's known, but I'm hoping for a more elegant solution.
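
For what it's worth, the Proxy idea could look roughly like the sketch below. `deferredExport` is a hypothetical helper invented for this example, not part of the proposal's Loader, and it only covers exports that are functions or objects, since a primitive export cannot be stood in for by a Proxy.

```javascript
// Returns a callable placeholder plus a resolve function. Calls and property
// reads on the placeholder are forwarded to the real export once it is set.
function deferredExport(name) {
  var real; // filled in later by resolve()
  function target() {} // a function target is required so the proxy is callable
  function unwrap() {
    if (real === undefined) {
      throw new Error(name + " has not finished loading");
    }
    return real;
  }
  var proxy = new Proxy(target, {
    apply: function (_target, thisArg, args) {
      return Reflect.apply(unwrap(), thisArg, args);
    },
    get: function (_target, key) {
      return unwrap()[key];
    }
  });
  return { proxy: proxy, resolve: function (value) { real = value; } };
}

// odd.js starts running, so a placeholder for its export is handed out.
var pendingOdd = deferredExport("/odd.js");

// even.js runs to completion while odd.js is still in progress.
var odd = pendingOdd.proxy;
var even = function even(n) { return n == 0 || odd(n - 1); };

// odd.js finishes and its real export becomes known.
pendingOdd.resolve(function odd(n) { return n != 0 && even(n - 1); });

console.log(odd(5)); // true
```

Using the placeholder before the real export is known still fails, but it fails loudly with an error instead of silently yielding an empty value.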

@creationix
Author

Though if I required within the function bodies, it would resolve just fine. It's the cyclic dependencies before exporting/returning values that's problematic. And since the require's values are resolved to full identifiers at compile time, there is little overhead in calling require in the loop.

// even.js
return function even(n) {
  var odd = require('./odd.js');
  return n == 0 || odd(n - 1);
}

Simple things made simple and complex things made possible perhaps? Plus this has the added benefit that "odd.js" never gets executed if it's never needed.
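
A stripped-down way to try this out: the sketch below is a toy, synchronous, in-memory stand-in for the Loader, assuming only the `preload` and `loaded` tables described earlier. It skips the resolve and fetch steps by using already-resolved ids such as "/odd.js", and the `sources` table is a hypothetical replacement for files on disk.

```javascript
// In-memory stand-in for files on disk (hypothetical, already-resolved ids).
var sources = {
  "/even.js":
    "return function even(n) {\n" +
    "  var odd = require('/odd.js');\n" +
    "  return n == 0 || odd(n - 1);\n" +
    "};",
  "/odd.js":
    "return function odd(n) {\n" +
    "  var even = require('/even.js');\n" +
    "  return n != 0 && even(n - 1);\n" +
    "};"
};

var Loader = {
  preload: {}, // compiled but not yet executed module functions
  loaded: {},  // export values of modules that have finished executing
  define: function (id, source) {
    // Wrap the module text in a function so a top-level `return` is legal.
    Loader.preload[id] = new Function("require", source);
  },
  require: function (id) {
    if (!(id in Loader.loaded)) {
      Loader.loaded[id] = Loader.preload[id](Loader.require);
    }
    return Loader.loaded[id];
  }
};

Object.keys(sources).forEach(function (id) {
  Loader.define(id, sources[id]);
});

var odd = Loader.require("/odd.js");
// even.js has been defined but not executed, because nothing has asked for it.
var evenLoadedEarly = "/even.js" in Loader.loaded; // false
console.log(odd(5)); // true
```

Because each require call sits inside a function body, both modules finish executing before either one asks for the other, so the cycle never hits a half-loaded module.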

@johnjbarton

@kriskowal said:
"E: Module blocks are an independent issue. My opinion is that they provide very little value. @dherman likes to draw small circles around bits of code in files. I prefer small files. It’s a value-call. Given that "module" is proposed to be a contextual keyword that would not interfere with my "module" variable names, I could happily ignore their existence, but I would be even happier knowing they don’t exist."

Personally I like one-file === one module; I don't like all the extra syntax. However I don't understand how we can create a scope without new syntax. Top-level file scope is already defined in JS: it's global scope.

@johnjbarton

@creationix says:
"Oh, and since the only syntax change is a new require() expression, this can easily be polyfilled in today's JavaScript."

I don't understand how this can work. Simply adding unconstrained "require()" would just be a hack along the lines of Burke's require.js scanner for require. What does a programmer think when they read:
var imgs;
if (retina) {
  imgs = require("daHot.js");
} else {
  imgs = require("elCheapo.js");
}
?
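
To make the concern concrete, here is a toy static scanner. It is an assumption-laden sketch: a naive regex written for this example, not the scanner RequireJS or any other real loader uses. It shows that a scan of string literals cannot tell which branch will run, so it reports both files.

```javascript
// Naive scanner: collects every string literal passed to require(),
// regardless of which branch it sits in.
function scanRequires(source) {
  var pattern = /\brequire\(\s*(["'])([^"']+)\1\s*\)/g;
  var found = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    found.push(match[2]);
  }
  return found;
}

var source = [
  "var imgs;",
  "if (retina) {",
  "  imgs = require(\"daHot.js\");",
  "} else {",
  "  imgs = require(\"elCheapo.js\");",
  "}"
].join("\n");

console.log(scanRequires(source));
// -> [ 'daHot.js', 'elCheapo.js' ]
```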

Beyond the dynamics, the fundamental way we resolve cyclic dependency is to create a two-pass or two-phase solution where one phase builds references filled in by the second one. The first phase can be based on declare-before-use and ordering of statements or declare-before-use and a syntax that implicitly orders (like JS today). How can unconstrained require() provide that "before-use" thing if it is executable?

@kriskowal

I would like to attempt to illustrate how @dherman’s proposal might handle mutual dependencies, in its defense. As @johnjbarton points out, Traceur hasn’t done this yet, so this is my speculation.

Mutually dependent modules would get transpiled into a “working set” with a shared scope. Imports and exports would be hoisted to this shared scope.

even.js

import odd from "odd.js";
export function even(n) {
  return n == 0 || odd(n - 1);
};

odd.js

import even from "even.js";
export function odd(n) {
  return n != 0 && even(n - 1);
};

main.js

import odd from "odd.js";
console.log(odd(5));

transpiled.js

// hoist even.js#even as even in shared scope
// hoist odd.js#odd as odd in shared scope
(function (even, odd) {
    // from even.js
    var evenFactoryCalled;
    function evenFactory() {
        // memoize
        if (evenFactoryCalled) {
            return;
        }
        evenFactoryCalled = true;
        oddFactory(); // call site of import
        even = function even(n) { // call site of export
          return n == 0 || odd(n - 1);
        };
    }
    // from odd.js
    var oddFactoryCalled;
    function oddFactory() {
        if (oddFactoryCalled) {
            return;
        }
        oddFactoryCalled = true;
        evenFactory();
        odd = function odd(n) {
          return n != 0 && even(n - 1);
        };
    }
    // from main.js
    var mainFactoryCalled;
    function mainFactory() {
        if (mainFactoryCalled) {
            return;
        }
        mainFactoryCalled = true;
        oddFactory();
        console.log(odd(5));
    }
    mainFactory();
})();

Consider an alternative with single-value exports/imports. I've removed the names of the exported functions to clarify that the name does not apply to the exported symbol, but if they were retained, they would simply be the name of the function.

even.js

import "odd.js" as odd;
export function (n) {
  return n == 0 || odd(n - 1);
};

odd.js

import "even.js" as even;
export function (n) {
  return n != 0 && even(n - 1);
};

main.js

import "odd.js" as odd;
console.log(odd(5));

The transpiled form would be identical to the previous example.

Note that this cannot be done with let or var, because the declaration must hoist to a different scope.

This alternative impacts destructuring multiple exports.

With @dherman's proposal, each individual exported name populates a variable in shared scope. Formally ignoring nested module name-spaces for a moment, destructuring in an import clause does not occur at the site of the import clause, but at the site of the corresponding export.

mathy.js

export even = function (n) {
    return n == 0 || odd(n - 1);
};
export odd = function (n) {
    return n == 1 || even(n - 1);
};

main.js

import {even, odd} from "mathy.js";
console.log(odd(5));

transpiled.js

// even and odd hoisted from mathy.js
(function (even, odd) {
    function mathyFactory() {
        even = function (n) {
            return n == 0 || odd(n - 1);
        };
        odd = function (n) {
            return n == 1 || even(n - 1);
        };
    }
    var mainFactoryCalled;
    function mainFactory() {
        if (mainFactoryCalled) {
            return;
        }
        mainFactoryCalled = true;
        mathyFactory() // call site of import {even, odd}
        console.log(odd(5));
    }
    mainFactory();
})();

Note that there is no destructuring assignment in the transpiled output. The destructuring occurs when the working set of modules is compiled, and each of the variables in the structure is mapped to a variable in the shared scope.

Consider the equivalent with single-value exports. This example applies to both return or export syntax.

mathy.js

var even = function (n) {
    return n == 0 || odd(n - 1);
};
var odd = function (n) {
    return n == 1 || even(n - 1);
};
return {even, odd};

main.js

import {even, odd} from "mathy.js";
console.log(odd(5));

transpiled.js

// even and odd hoisted from main.js instead of mathy.js
(function (mainEven, mainOdd) {
    function mathyFactory() {
        var mathyEven = function (n) {
            return n == 0 || mathyOdd(n - 1);
        };
        var mathyOdd = function (n) {
            return n == 1 || mathyEven(n - 1);
        };
        // site of return {even, odd}
        // for each corresponding import site:
        ({even: mainEven, odd: mainOdd} = {even: mathyEven, odd: mathyOdd}); // main.js
    }
    var mainFactoryCalled;
    function mainFactory() {
        if (mainFactoryCalled) {
            return;
        }
        mainFactoryCalled = true;
        mathyFactory() // call site of import {even, odd}
        console.log(mainOdd(5));
    }
    mainFactory();
})();

The rules are slightly different. With single-value exports, we must execute destructuring for each corresponding import at the location of the export.

In both cases, the shared values stabilize after all involved factories have finished executing. It is also noteworthy that single-value exports do not need knowledge at compile time of the shape of the exports object. Nor do they need to track at compile time the boundary between static members and dynamic members.

Left as a future exercise, imports that are commuted to exports. These probably do not work so well with single-value-exports, but fine with multi-value-exports.
