Cleod9/es20XX_package_keyword.md

## es20XX_package_keyword.md

      
    ES20XX package Proposal


Preface
What is a package?
What is the point of package?
How could package be used in JS?
Syntax Proposal
Considerations

Preface

Potential future of JS:

This proposal covers a possible implementation of the package keyword for the ECMAScript spec. JavaScript is constantly evolving and is being used more frequently for full-fledged application development. The package keyword has the potential to provide a more traditional application structure to the language, which will be explored below. Such structure has been battle tested for decades across other high-level languages such as Java, C#, ActionScript, etc, and by applying it to a JS environment it could help provide a clear-cut standard for large scale applications.
One thing to note is that the ES2015 (ES6) module specification was seemingly designed with the notion that browsers will eventually add asynchronous support for loading module dependencies dynamically. Conversely, JS developers on the web often find themselves bundling source code as opposed to asynchronously loading every individual module. The package symbol could potentially provide a more bundle-centric way of writing JS that could work on both the client and server side. With this mindset, the package code itself still exists as a bundle of "modules" while in development but with the end goal of being part of some larger, single application.
(Disclaimer: This proposal is in no way geared towards discouraging or replacing traditional module syntax. Its primary goal is to provide an extended module-like syntax via the package keyword that is aimed at large-scale JS application development where the source code for the project will be completely self-contained at runtime)
What is a package?

Packages are commonly used in other languages such as Java and Flash ActionScript 3.0 (and even C#'s namespace keyword to some extent) as the primary mechanism to organize code into relevant groups. Generally they are organized in a fashion that is representative of the location of source code files on the system disk. Packages heavily resemble modules, and the two concepts are often used interchangeably - however they are quite different in nature depending on the language.  Since ActionScript 3.0 (AS3) is the most syntactically similar language to JavaScript that uses the package keyword, it will be referred to throughout upcoming examples.
In AS3, package names are directory paths with folders delimited by the . symbol, most often in a reverse-domain naming convention (e.g. ./com/example becomes com.example). In an AS3 application you begin by defining a list of these directories to be searched recursively for referenced source files prior to compile-time. Generally this is just the path ./ or ./src, but you can have as many as you'd like. You can imagine each of these source paths being "merged" together at compile time, with the first instance of a unique file path always taking priority. Each folder in the hierarchy can be considered a package, and each file within the hierarchy exposes a single public class (see below):
/* ./MyClass.as */
package {
  // Code here belongs to the root package level
  public class MyClass { /* ...*/ }
}
/* ./com/example/MyOtherClass.as */
package com.example {
  // This class belongs to the package "com.example"
  public class MyOtherClass { /* ...*/ }
}
You could picture the hierarchy of the above packages in memory like this:
{
  "MyClass": [Class Definition],
  "com": {
    "example": {
      "MyOtherClass": [Class Definition]
    }
  }
}
But since packages are more or less just strings (a.k.a. "class paths"), you can flatten this structure even further like so:
{
  "MyClass": [Class Definition],
  "com.example.MyOtherClass": [Class Definition]
}
The flattened example is the foundation of how the package keyword would work in JavaScript, which will be detailed below.
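The step from the nested picture to the flat one can be sketched in plain JavaScript. This is only an illustration: flattenPackages is a hypothetical helper, and it assumes that any function value in the tree is a class definition and any other value is a sub-package.

```javascript
// Hypothetical helper: flattens a nested package tree into a flat hash
// keyed by class path (e.g. "com.example.MyOtherClass").
function flattenPackages(tree, prefix = '') {
  const flat = {};
  for (const [name, value] of Object.entries(tree)) {
    const path = prefix ? `${prefix}.${name}` : name;
    if (typeof value === 'function') {
      // Assumption: a function value is a class definition
      flat[path] = value;
    } else {
      // Anything else is treated as a sub-package and flattened recursively
      Object.assign(flat, flattenPackages(value, path));
    }
  }
  return flat;
}

class MyClass {}
class MyOtherClass {}

const flatPackages = flattenPackages({
  MyClass,
  com: { example: { MyOtherClass } }
});

console.log(Object.keys(flatPackages)); // [ 'MyClass', 'com.example.MyOtherClass' ]
```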
What is the point of package?

One big advantage of the package keyword is that it provides a foundation for code structure that encourages the separation of dependency linking and actual scripting. For example, in AS3 you are limited to a very small set of operations when outside the scope of the class keyword (and in Java you have even more restrictions!). As such, in AS3 it is common practice to reserve this space for import statements only:
package {
  import com.example.ClassA;
  import com.example.ClassB;
  import com.example.ClassC;
	
  public class Main { /* ... */ }
}
At first this might look similar to module syntax, however the key difference is that no core application logic exists outside of the class keyword. As a result, the compiler can handle dependency linking during a separate step prior to application initialization. This is a crucial difference from traditional JS modules, which will evaluate the entire source file as soon as the imports are resolved. We can accomplish something similar in JavaScript but with a few additional allowances, which will be described later on.
How could package be used in JS?

With the package symbol, it would be possible to write JS source code with several notable features:

Consistent dependency references (no more relative paths, no need for webpack alias/modulesDirectories)
Simplified bundling process (less complexity)
Improved static analysis (foundation for better code introspection in future JS tools/IDEs)

Consistent dependency references

Let's take a look at what a typical CommonJS module would look like with its dependencies listed at the top:
var ClassA = require('../folderA/ClassA');
var ClassB = require('./folderB/ClassB');
var ClassC = require('../../folderC/ClassC');

module.exports = class MyClass {};
While this code might be pretty straightforward, it happens to mask details about where this particular file lives in the application structure. Imagine if we had an application with 200 different class files each in different directories. Each one of these files would have its own variation on the relative path of ClassA, ClassB, and ClassC. This can make refactoring very frustrating if you lack an IDE/tool with CommonJS-specific introspection. Currently this issue is typically solved at the bundler level, as it is not a feature of the language (e.g. Webpack alias/modulesDirectories).
Let's transform it to ES6 syntax next:
import ClassA from "../folderA/ClassA";
import ClassB from "./folderB/ClassB";
import ClassC from "../../folderC/ClassC";

export default class MyClass {};
Now we have fewer characters, but the same problem exists from before since it still uses relative file paths.
With packages, this could be transformed to:
package com.example.app {
  // Note: Let us assume the package keyword would affect the behavior of "import"
  import com.example.folderA.ClassA;
  import com.example.app.folderB.ClassB;
  import com.example.app.folderC.ClassC as ClassC; // Aliasing
  // import { ClassC } from com.folderC; // Potential long-form version for the above syntax to support aliasing
  import com.folderD.*; // Potential to implicitly import all public symbols

  // Note: "import" always hoisted to the top regardless of exports
  export class MyClass {};
}
The advantage here is that all files can reference a unique dependency with the same path. Each class is able to have a unique fully qualified class name, so importing a dependency at runtime is just a matter of evaluating the package names as if they existed in an implicit global object hash. This makes things like manually refactoring file structure a much simpler process, since you don't have to deal with different relative paths for every dependency in every file. It also opens up the door to the much contested Pythonic wildcard * symbol without conflicting with *'s current implementation for import syntax.
Simplified bundling process

Bundling source code for packages would be simplified since there is a special behavior for import statements to assist with statically analyzing a project's dependencies. It is only once you actually instantiate one of your classes that your application logic needs to execute. As a result, the bundling process becomes a breeze since dependency linking can occur before even a single line of application logic. For example:
// File A (Main)
package {
  // Bonus: This type of syntax could be implicit for classes within the same package
  // import Child;
  // import Parent;
  // import Grandparent;

  export class Main {
    constructor() {
      this.child = new Child();
      this.parent = new Parent();
      this.grandParent = new Grandparent();
    }
  }
}
// File B (Child)
package {
  // import Parent;

  export class Child extends Parent {}
}
// File C (Parent)
package {
  // import Grandparent;
  export class Parent extends Grandparent { }
}
// File D (Grandparent)
package {
  export class Grandparent { }
}
Under normal circumstances, the execution order of the above code would be crucial. Since File B depends on File C, and File C depends on File D, traditional import syntax would necessitate that these files are evaluated in the correct order. But with the package keyword, you could potentially load these files in any order since it is not necessary to evaluate their post-import logic immediately. You can just concatenate these files, evaluate the package names and hoisted import statements, and then provide a mechanism for unpacking and instantiating the entry point. By the time the entry point has started you have already built a complete dependency graph.
The mechanism to accomplish this in a module bundler is described in the next section. Note that for native support of package in the browser, the JS engine would be augmented in a similar fashion to support loading a bundled package.
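The order-independence described above can be illustrated with a small sketch in today's JavaScript. It is not the proposed mechanism itself: definePackageClass and resolveClass are hypothetical names, and each "file" is modeled as a factory that is registered first and only evaluated when something asks for it.

```javascript
// Hypothetical registry: each "file" registers a factory under its class
// path. Nothing is evaluated at registration time.
const factories = {};
const resolved = {};

function definePackageClass(path, factory) {
  factories[path] = factory;
}

// Evaluates a factory on first use and caches the resulting class.
function resolveClass(path) {
  if (!(path in resolved)) {
    resolved[path] = factories[path](resolveClass);
  }
  return resolved[path];
}

// Registered "out of order": Child first, Grandparent last.
definePackageClass('Child', (load) => class Child extends load('Parent') {});
definePackageClass('Parent', (load) => class Parent extends load('Grandparent') {});
definePackageClass('Grandparent', () => class Grandparent {});

// Only now is anything evaluated, starting from the class that is needed.
const Child = resolveClass('Child');
const child = new Child();
console.log(child instanceof resolveClass('Grandparent')); // true
```

Because registration and evaluation are separate steps, the three definePackageClass calls can appear in any order without changing the result.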
Syntax Proposal

Below is the syntax proposal for the package keyword in JS:
// Package files should have their package names parsed first, followed by import resolution
// Acceptable values for package names could be (.)-separated symbolic names or ES6-like destructuring with {}
// (Aside: Package paths should be valid file name characters only)
package com.example {
  // Package gets its own module-like scope

  // Import statements would be hoisted but only in the scope of the package, and not be evaluated until the file is loaded
  // (Also aliasing can come for free via "as" keyword)
  import com.example.ParentClass;
  import com.example.foo.ClassA;
  import com.example.foo.ClassB;
  import { ClassC } from com.example.foo; // Potential ES6-ish style import
  // import com.example.foo.ClassC as ClassCRenamed; // Aliasing example
  // import { ClassC as ClassCRenamed } from com.example.foo; // ES6-ish aliasing example

  // Consider here and below the "module", where normal ES6 module syntax is allowed
  // Note that ParentClass's import will receive special treatment (outlined in the shim) to ensure immediate import
  export class Main extends ParentClass {
    // Normal class syntax here
    constructor() {
      super();
      console.log("Inside entry point");
    }
  }
}
// Note: Re-declaring the same packages will append to the definition already created for the package
package com.example {
  export class ParentClass {};
}
package com.example.foo {
  export class ClassA {};
}
package com.example.foo {
  export class ClassB {};
}
package com.example.foo {
  export class ClassC {};
}
ES5/ES6 Polyfill for this could resemble the below code:
(Note: This has been adapted from the library AS3JS and is a fully-functional proof of concept. To see it in action you can copy the below code right into Traceur or Babel)
// "packaged" source code converted to ES5
var entry = "com.example.Main";
var program = {
  "com.example.Main": function(module, exports) {
    /*** Each package file is treated similarly to a normal module ***/

    // Parent is needed immediately for the sake of "extends"
    // (a static analyzer could detect imported names at the top-level package scope to determine this)
    var ParentClass = module.import('com.example', 'ParentClass');

    // These classes are not immediately needed
    var ClassA, ClassB, ClassC;

    // Special injection function will defer dependency resolution for others
    module.inject = function() {
      ClassA = module.import('com.example.foo', 'ClassA');
      ClassB = module.import('com.example.foo', 'ClassB');
      ClassC = module.import('com.example.foo', 'ClassC');
    };

    // Note: Shimming the class to basic ES5 here
    var Main = function() {
      ParentClass.call(this);
      console.log("Inside entry point");
    };

    Main.prototype = Object.create(ParentClass.prototype);

    module.exports = Main;
  },
  "com.example.ParentClass": function(module, exports) {
    var ParentClass = function ParentClass() {};

    module.exports = ParentClass;
  },
  "com.example.foo.ClassA": function(module, exports) {
    var ClassA = function ClassA() {};

    module.exports = ClassA;
  },
  "com.example.foo.ClassB": function(module, exports) {
    var ClassB = function ClassB() {};

    module.exports = ClassB;
  },
  "com.example.foo.ClassC": function(module, exports) {
    var ClassC = function ClassC() {};

    module.exports = ClassC;
  }
};

// Some temps / helper
var i, j, tmpPkg;
var getPackageInfo = function ( name ) {
  // Splits package path into separate package and class name
  var pkg = name.split('.');
  var className = pkg[pkg.length-1];
  pkg.splice(pkg.length-1, 1);
  var packageName = pkg.join('.');

  return { 
    packageName: packageName,
    className: className
  };
};

// This hash map contains each package, each package contains its classes
var packages = {};

// Converts supplied package hash to packageName.className.{ source: moduleFn }
for (i in program) {
  tmpPkg = getPackageInfo(i);
  packages[tmpPkg.packageName] = packages[tmpPkg.packageName] || {};
  packages[tmpPkg.packageName][tmpPkg.className] = { compiled: false, source: program[i] };
}

// This helper will execute the module source identified by packageName and className and return its exports object
var imports = function ( packageName, className ) {
  // Only run source() if it hasn't been compiled yet
  if (!packages[packageName][className].compiled) {
    packages[packageName][className].compiled = true;
    packages[packageName][className].module = { exports: null, inject: null, import: imports };
    //This next line actually compiles the module
    packages[packageName][className].source(packages[packageName][className].module, packages[packageName][className].module.exports);
  }
  // Returns the compiled module
  return packages[packageName][className].module.exports;
};

// Compiles all packages
for (i in packages) {
  for (j in packages[i]) {
    imports(i, j);
  }
}

// Run inject() functions as the final step (this trivializes circular dependencies)
for (i in packages) {
  // Execute the injection functions
  for (j in packages[i]) {
    if (typeof packages[i][j].module.inject === 'function') {
      packages[i][j].module.inject();
    }
  }
}

// Initializes application
// (Note: This demonstrates the potential for both static and `new`-able entry points)
var entryPkgInfo = getPackageInfo(entry);
var entryPoint = imports(entryPkgInfo.packageName, entryPkgInfo.className);

// Instantiate entry via new
/* return */ new entryPoint();
// /* return */ entryPoint; // Or potentially even return static entry
Considerations regarding the package keyword and this polyfill are described below.
Considerations

New <script> tag attribute

Given the source code for a fully package-based JS application that has been bundled and written to app.js, it could potentially be loaded as follows:
<script type="package/javascript" src="app.js" entry="com.example.Main"></script>
This file would be evaluated in such a way that all of the package names and imports are resolved before any exports are created. This would follow the same logic as the shimmed ES5 code outlined above. Then the entry attribute would be split by the dot symbol, where the right-most text would be the class name, and the remaining left-hand side would be the package name. The browser would then automatically call new on the specified entry point once the previous steps have been completed.
Package names as flat strings VS hierarchy

The dot (.) symbol suggests that packages might be stored in memory in a tree-like fashion; however, storing each package name as a flat string prevents directory-file conflicts.
For example, if packages were represented in memory as a tree-like object, you would have a naming conflict if given the names com.example.ClassA and com.example.ClassA.SubClassA. While the file structure might allow for a file called "ClassA.js" and a folder called "ClassA" within the same directory, the package structure would not. By using a simple string hash for the package name instead, this completely disassociates the package name from the name of the actual file it came from.
It's probably unlikely that this file+folder naming conflict would occur in a real project, but package names being treated this way gives developers one less nuance to worry about.
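The difference between the two representations can be sketched as follows. Both registerInTree and registerFlat are hypothetical helpers written for this illustration, and the tree version assumes that a class and a package cannot share the same slot.

```javascript
// Hypothetical tree registry: each dot-separated segment becomes a nested object.
function registerInTree(tree, classPath, definition) {
  const segments = classPath.split('.');
  const className = segments.pop();
  let node = tree;
  for (const segment of segments) {
    if (typeof node[segment] === 'function') {
      throw new Error(`"${segment}" is already a class, so it cannot also be a package`);
    }
    node = node[segment] = node[segment] || {};
  }
  node[className] = definition;
}

// Hypothetical flat registry: the whole class path is just a key.
function registerFlat(hash, classPath, definition) {
  hash[classPath] = definition;
}

class ClassA {}
class SubClassA {}

const flat = {};
registerFlat(flat, 'com.example.ClassA', ClassA);
registerFlat(flat, 'com.example.ClassA.SubClassA', SubClassA); // no conflict

const tree = {};
registerInTree(tree, 'com.example.ClassA', ClassA);
let treeConflict = null;
try {
  registerInTree(tree, 'com.example.ClassA.SubClassA', SubClassA);
} catch (err) {
  treeConflict = err.message; // the tree cannot hold both names
}

console.log(Object.keys(flat).length, treeConflict !== null); // 2 true
```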
Package class naming collisions

In Java and AS3 it is possible to run into name collisions when importing packages. For example, in AS3 if you had a class called "Foo" that existed in the package "com.example", any other packages that contained a class "Foo" would require you to import via the fully-qualified class name in order to avoid multiple "Foo" classes in the same package hash. The package keyword for ES gets around this through standard import syntax, which is as simple as this:
import { MyClass as MyAlias } from com.example
This would be one advantage of JS packages when compared to other languages that don't allow aliasing in this way.
Allowed exports

Only classes and functions would be allowed exports for packages (i.e. only "new-ables" exportable). Enforcing this limitation would encourage developers to use the class keyword when possible, but still allow for package-level functions. The default keyword for exports would be prohibited.
Multiple Declarations of the Package Name

In the above examples, each individual export would be wrapped in its own package block, even though they are technically the same package:
package com.example.foo {
  export class ClassA {};
}
package com.example.foo {
  export class ClassB {};
}
package com.example.foo {
  export class ClassC {};
}
Each package name is essentially a hash value to a list of definitions, so when the interpreter encounters a package name it should first see if an existing hash exists with the same name. If so, the class name should be stored in that package hash. Otherwise a new package hash should be created.
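A minimal sketch of that lookup-or-create behavior, in plain JavaScript. declarePackage and packageHashes are hypothetical names, and treating a duplicate class name within one package as an error is an assumption of this sketch rather than part of the proposal.

```javascript
// Hypothetical registry keyed by package name.
const packageHashes = {};

// Reuses the existing hash for a package name, or creates one on first sight.
function declarePackage(packageName, definitions) {
  const hash = packageHashes[packageName] || (packageHashes[packageName] = {});
  for (const [className, definition] of Object.entries(definitions)) {
    if (Object.prototype.hasOwnProperty.call(hash, className)) {
      // Assumption: redefining a class within the same package is an error
      throw new Error(`${packageName}.${className} is already defined`);
    }
    hash[className] = definition;
  }
  return hash;
}

// Three separate blocks that all name the same package
declarePackage('com.example.foo', { ClassA: class ClassA {} });
declarePackage('com.example.foo', { ClassB: class ClassB {} });
declarePackage('com.example.foo', { ClassC: class ClassC {} });

console.log(Object.keys(packageHashes['com.example.foo'])); // [ 'ClassA', 'ClassB', 'ClassC' ]
```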
Potential for Implicit Imports for same-package values

For developers coming from other languages, it might be helpful to have implicit imports of classes that live in the same package hash. For example:
package com.example {
  export class A {};
}
package com.example {
  export class B {
    constructor() {
      // Implicit 'import com.example.A' has occurred
      console.log('I know about A: ', A);
    }
  };
}
This is not necessarily a must-have feature, but should be something to consider as it would significantly reduce the amount of imports at the top of a package.
Wildcard Import for same-package values

Similar to the previous consideration, there could also be a new wildcard symbol * for bulk importing other package contents. For example:
package com.foo {
  import com.foo.*; // '*' will be evaluated to "{ FooA, FooB, FooC }" etc.
  // import * from com.foo; // Long form of the above
  export class A {};
}
These types of imports are typically discouraged in languages like Python, however they can be useful in Java or AS3 when there is a massive amount of imports coming from a single package.
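How a wildcard might expand against a flat package hash can be sketched like this. importWildcard is a hypothetical function, and the sketch assumes the wildcard matches only direct members of the package and not its sub-packages, as is the case in Java and AS3.

```javascript
// Hypothetical flat registry of class paths
const registry = {
  'com.foo.FooA': class FooA {},
  'com.foo.FooB': class FooB {},
  'com.foo.bar.Baz': class Baz {}
};

// Expands "com.foo.*" into { FooA, FooB }.
// Assumption: sub-packages such as com.foo.bar are not included.
function importWildcard(registry, pattern) {
  const packageName = pattern.replace(/\.?\*$/, '');
  const result = {};
  for (const classPath of Object.keys(registry)) {
    const lastDot = classPath.lastIndexOf('.');
    // A class path without a dot belongs to the root package ('')
    const owner = lastDot === -1 ? '' : classPath.slice(0, lastDot);
    if (owner === packageName) {
      result[classPath.slice(lastDot + 1)] = registry[classPath];
    }
  }
  return result;
}

const { FooA, FooB } = importWildcard(registry, 'com.foo.*');
console.log(FooA.name, FooB.name); // FooA FooB
```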
Asynchronously Loading Regular Modules From Within A Package

It is possible to allow the asynchronous loading of external ES6 modules. When the interpreter encounters an import statement with a string instead of symbolic path, the code could be fetched independently before marking the package as "loaded". For example:
package {
  import $ from "./jquery.js";
  /* .... */
}
The interpreter should detect "./jquery.js" as a URL, and has the opportunity to defer the execution of an application's entry point until that module is loaded.
Asynchronously Loading Other Packages From Within A Package

Theoretically packages could perform similar logic as the previous consideration, where the dot (.) in the package path is replaced with slashes, and the import falls back to an asynchronous load relative to the application directory. But the point of packages is to appeal to the idea of "bundled" code when there is no anticipation of loading separate parts of the application independently, so it may be best to restrict this type of loading to non-packaged code. If the need arose to load another package asynchronously, it would likely be in the context of another independent application that could be set up via injecting a new <script> tag in the browser, or performing a require() in Node.js.
Dealing With Multiple Package Bundles in the Same Page/Session

A decision would need to be made regarding whether or not a loaded package bundle should share package name access with other packages. Sharing the space has implications that packages should be able to overwrite the definitions of other packages, so it would probably make the most sense to give individual packages separate storage and a unique scope. In a Node.js environment this would ensure requiring a module from the node_modules folder would not collide with the packages of the parent project, and in the browser packages could be loaded without worry of tampering.
Considerations for Node.js

So far most of the information outlined on this page has been regarding JS in the browser. For environments like Node.js, much of the functionality would likely need to be taken care of at the configuration level. For a package-based project, it would probably make the most sense to define the entry point and source paths as a property in the project's package.json file:
{
  "package": {
    "main": "com.example.Main",
    "paths": [ "./src" ]
  }
}
It wouldn't have to be structured exactly like this, but this is just to demonstrate how you might inform Node that a project is to be loaded as a series of packages instead of regular modules. You would simply assign an entry point and a list of base directories. Since Node.js isn't concerned about bandwidth, the dot separator in the package name could be treated like a slash, and packages could be located dynamically by traversing relative to each of the source paths provided.
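The dot-to-slash lookup could be sketched roughly as below. resolvePackageFile is a hypothetical function, the ".js" extension and first-match-wins ordering are assumptions, and the file-existence check is injected so the sketch does not touch the real file system.

```javascript
// Hypothetical resolver: maps a package class path to a file under one of
// the configured source paths, returning the first candidate that exists.
function resolvePackageFile(classPath, sourcePaths, fileExists) {
  // Assumption: one class per file, named <ClassName>.js
  const relative = classPath.split('.').join('/') + '.js';
  for (const sourcePath of sourcePaths) {
    const candidate = sourcePath.replace(/\/+$/, '') + '/' + relative;
    if (fileExists(candidate)) {
      return candidate;
    }
  }
  return null; // not found under any source path
}

// Pretend these files exist on disk.
const knownFiles = new Set(['./src/com/example/Main.js', './lib/com/example/Util.js']);
const exists = (file) => knownFiles.has(file);

console.log(resolvePackageFile('com.example.Main', ['./src', './lib'], exists));
// ./src/com/example/Main.js
console.log(resolvePackageFile('com.example.Util', ['./src', './lib'], exists));
// ./lib/com/example/Util.js
```

In a real Node.js setup the injected check could be backed by fs.existsSync, with the source paths read from the "paths" array shown above.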
Additionally when require() is used to pull in another package-based module dependency from node_modules/ (or if ES6-style package imports are used), the received value would be the result of new OtherProject() determined by its entry point specified by main.
Differences From The TypeScript "module" Keyword

The closest implementation to the package keyword currently in the JS world is the module keyword from TypeScript, which can act as a mechanism to split a module's exports across several files. This behavior is extremely specific to TypeScript, and can appear unclear to newcomers to the language since it extends the idea of what a module is. The package keyword is quite different since it presents a more concrete terminology for how such a structure could be applied in practice, which could alleviate some of the confusion when figuring out how to import a package's dependencies. Since the package keyword is geared specifically towards building bundled code, there don't have to be any significant accommodations for importing modules. As long as all of the packages are loaded, the dependencies can be reconciled at runtime with little effort.
Differences From a "namespace" or Otherwise Typical Module (Circular Dependencies)

The package keyword is not to be confused with regular modules, and especially not the namespace keyword (see TypeScript's). While the concepts are similar, the package keyword differs greatly in that it prioritizes dependency resolution before actual code execution. It is not merely a wrapper for namespacing code. This is evident in the usage of the module.inject() concept that is used in the shimmed example from above. A transpiler should examine the top level scope of a module for symbols used from an import statement, and if and only if they are found will they be imported immediately. Otherwise the import will be deferred with help from module.inject() until the final step right before new EntryPoint() is called. This results in the ability to handle circular dependencies in a way similar to AS3 or Java, where as long as the imported symbols aren't used in the package scope they can safely resolve themselves. See the below example using CommonJS with ES6 class syntax (this would apply to ES6 and TypeScript's import as well):
// File A.js
var B = require("./B");
module.exports = class A {
  constructor() {
    console.log(B);
  }
};
// File B.js
var A = require("./A");
module.exports = class B {
  constructor() {
    console.log(A);
  }
};
This example is an obvious circular dependency that even changes depending on load order. But statically we can see these files don't necessarily need one another until someone instantiates one of the classes. With the package keyword, this would be allowed to pass:
// File A.js
package {
  export class A {
    constructor() {
      console.log(B);
    }
  }
}
// File B.js
package {
  export class B {
    constructor() {
      console.log(A);
    }
  }
}
The reason this could work is that the module.inject() concept allows the package keyword to defer any post-import execution until all of the dependencies have been resolved. As long as the imported symbols are not used in the top-level scope, it is safe to defer their import until all other dependencies have been resolved.
(Note: This is certainly not to promote circular dependencies which are discouraged in practice, but rather demonstrate how they could in fact work with the right coding patterns)
Future of JS w/ Types

If the spec ever achieves true type annotations, some considerations would need to be made regarding imports for types. For example, TypeScript will exclude any imports in its output code that were only used for types. See the following:
import B from "./B";

var foo = function (b:B) {
  console.log(b);
};
In the above example, the output JS from TypeScript would exclude the import for symbol B since it doesn't exist after the types are stripped. This is actually desired behavior. To ensure an optimized output file and the best IDE support with the package keyword, imports that are only used to define types should not ever be transpiled. Otherwise they could introduce unnecessary circular dependencies or other dangerous side-effects outside of the developer's intentions.
Also note that with the addition of a new wildcard * symbol for importing packages, bulk importing types without separate (or global) type definitions could be done in far fewer import statements.
Identifying Packages As Strings VS Symbols

This proposal uses symbols instead of strings for identifying package names in order to better distinguish it from existing import syntax. See below:
// Symbols (proposed)
package com.example {
  import { SomeClass } from com.foo;
}

// VS Strings (alternative)
package "com.example" {
  import "com.foo.SomeClass";
  // or???
  import { SomeClass } from "com.foo";
}
The above code appears to clash with existing import syntax due to the change in behavior of how the module path strings are evaluated. By using symbols instead it helps isolate iterations on the package keyword's implementation over time.