Why declaring globals is better than exporting

The module pattern

Over the past several years, the way JavaScript dependencies are managed has evolved, bringing some advanced solutions. One of the concepts that has become very popular today is the module pattern. The beginning of this article explains the idea pretty well. This concept was then reused in many modern dependency-management solutions, and was finally proposed as the AMD API specification, whose best-known implementation is probably RequireJS.

The exporting feature, whose problems are discussed in this text, is part of the module pattern. Strictly speaking, the module pattern itself has no relation to dependency resolution; it is rather designed for managing and storing data in a particular way. But it naturally serves as a base for dependency-management libraries.

The module pattern is based upon the concept of a function that takes arguments and returns a value, where the arguments stand for the module dependencies and the returned value is the object provided by the module. The core part of the module pattern is a function expression which may look like this:

// objects created by dependencies are provided as arguments
function( dep1, dep2, dep3 ) {
    // perform the needed actions to build up some new library objects
    var routines = ...

    // export the created objects
    return routines;
}

This function will be called by a dependency-management library as soon as the module dependencies are ready, and the objects created by the dependencies will be provided as arguments. Inside the function body, the module builds up its routines and returns the created object. This object will then be handled by the dependency-management library, and will later be provided in a similar way to other modules that require this module as a dependency.
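As a rough illustration of that flow, here is a minimal sketch of what a loader might do once the dependencies are resolved. The registry, loadModule() and the module names here are assumptions made up for this example, not any particular loader's API:

// a tiny in-memory registry of already-built module objects
var registry = {};

// assumed loader routine: call the module factory with the objects
// produced by its dependencies, and store whatever it returns
function loadModule(name, depNames, factory) {
    var depObjects = depNames.map(function (depName) {
        return registry[depName]; // dependencies are assumed to be ready
    });
    registry[name] = factory.apply(null, depObjects);
}

// usage: each factory receives the dependency objects and exports its own
loadModule('logger', [], function () {
    return { log: function (msg) { console.log(msg); } };
});
loadModule('app', ['logger'], function (logger) {
    logger.log('app module built');
    return {};
});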

The main point of exporting is that the exported objects never get outside: they do not mess up the global namespace, and are provided only to the modules that request them.

In the module pattern described above, exporting stands for a data transfer from one module to another, explicitly specifying for each object the module which should provide it. A bonus feature is the opportunity to use the local scope of the function to keep some private data.

A similar approach of transferring objects between modules is used in the CommonJS specification implemented in Node.js. There is no factory function for each module, but the logic of exporting is the same: the module providing the needed object must be explicitly specified.
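For reference, a minimal CommonJS sketch (the file names and the greet routine are made up for this example):

// greeter.js: the module explicitly exports the object it provides
module.exports = {
    greet: function (name) {
        return 'Hello, ' + name;
    }
};

// main.js: the consumer explicitly names the module providing the object
var greeter = require('./greeter');
console.log(greeter.greet('world'));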

This idea applied to a module dependency-management system gives a nice picture: each module is stored in a separate file, the needed objects are provided directly by its dependencies, and the module itself defines which objects it will provide. It all looks pretty good until it is put into real-life conditions.

First of all, it appears that such an approach is not very scalable: it takes an effort to split a module which has grown too big into several pieces. If a part of the module logic goes into a new module, all dependent modules have to be updated to import the detached routines from the new module, and therefore the links between the exported and imported objects have to be set up again.
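A hypothetical Node.js-flavoured sketch of that cost (the feature/helpers module names and routines are made up for this illustration):

// before the split, a dependent module only knew about feature.js:
//   var feature = require('./feature');
//   feature.use(feature.format('hello'));

// after format() is detached into a new helpers.js, every such call site
// has to be re-wired to point at the new module:
var feature = require('./feature');   // still provides the remaining routines
var helpers = require('./helpers');   // now provides the detached format()

feature.use(helpers.format('hello'));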

A similar problem appears when we need to make a common module that loads several other modules which are often used at once. Because of exporting, such a common module must first import the objects from those modules, put them into a common object, and export this object further. For instance, in Node.js such a common module could look like this:

var common = {
  dep1: require('dep1'),
  dep2: require('dep2'),
  dep3: require('dep3')
};

module.exports = common;

And it would be fine if we just needed to include this common module instead of the three original dependencies, but the usage of the imported objects now has to be updated as well. So if previously a dependency was used like this:

var dep1 = require('dep1');
dep1.doSomething();

now it has to be used in a new way:

var common = require('common');
common.dep1.doSomething();

And this has to be updated for every use-case of the imported object.

Here we can also point out another inconvenience brought by exporting: the API of a library depends on how the library is organised (because the exported objects are tightly linked to the module structure). This complicates refactoring: if you wish to rework the module structure, you will also have to make an effort to keep the library API intact. In fact it is not too much work, but as a result people often implement a library as a single huge module instead. This still works for resolving the dependencies, but the idea of splitting big code into smaller parts is already ruined at this point.

Another problem is that with this approach one has to write a lot of subsidiary stuff for each module. In addition to the factory function listed in the module pattern example above, a dependency-management system also needs some way to identify the modules which provide the objects substituted as arguments (list their paths or some kind of module ids). Additionally, we need to define the module instance itself in a special way so that it can be recognized by the dependency-management system and later reused by modules which need it as a dependency.

This can be illustrated by how dependencies are specified in RequireJS. A declaration of a module with dependencies could look like this:

define(
    ['dep1', 'dep2', 'dep3'],
    function( dep1, dep2, dep3 ){
        ...
    }
);

The first argument of the define() function is a list of module identifiers, and the objects exported by those modules are mapped to the arguments of the factory. The code obviously becomes more complicated when there are more dependencies:

define(
    [       'dep1', 'dep2', 'dep3', 'dep4', 'dep5', 'dep6', 'dep7', 'dep8'],
    function(dep1,   dep2,   dep3,   dep4,   dep5,   dep6,   dep7,   dep8){
        ...
    }
);

Now there is a much greater chance to make a mistake. To solve this, the creators of RequireJS invented another way of listing dependencies and mapping them to the exported objects (this solution is called the 'simplified CommonJS' wrapper):

define(function (require) {
    var dep1 = require('dep1'),
        dep2 = require('dep2'),
        dep3 = require('dep3'),
        dep4 = require('dep4'),
        dep5 = require('dep5'),
        dep6 = require('dep6'),
        dep7 = require('dep7'),
        dep8 = require('dep8');

    ...
});

This way of specifying the dependencies is easier to read and more convenient to use. But now we have a second way to do the same thing, and the amount and structure of the code needed to set up the dependencies is still considerable. Why can't we simply list the dependencies? The only reason is that we also need to specify a correspondence between each dependency and the object it exports.

These complications are brought only by the exporting feature, particularly by the fact that the exported objects are always linked to the modules which export them. Other aspects of the module pattern pose no problem: putting the code into a function still provides a convenient way of managing private data not intended to be exported (by using function-local variables), and this function can also be used by a dependency-management solution, called at the appropriate time (when all the dependencies are ready).

Therefore exporting forces programmers to pass the data through each module, and this has to be done for each exported object. This rule is actually a limitation, because it implies that a module always results in an object.

Solution

The modular structure of a library is a matter of internals, and it should be arranged according to how the library is built. On the other hand, the API of a library should be designed according to how the library is used. With the export approach these two things are linked together, and therefore the internal structure of a library influences the API upon each refactoring, as shown above. If we break this link, the issue will be solved. How can this be achieved?

In the existing export-based solutions, a module (along with the exported object) is identified by its name (a file path, or some kind of module id, which is then resolved by a module-loading system). This identifier is a string which refers to a module in some kind of globally accessible registry (the filesystem, or an external config defining the module ids).

We could create another, similar but independent registry for storing the library objects, and let the modules themselves decide when and what they wish to create on that registry. Such an approach would mean switching the modules' behaviour from 'producing objects' to 'performing actions' (while an action could also mean producing an object, but not necessarily).

Now, to get a library object, we ask the objects registry. The dependency declaration code is simplified: it should be enough to simply list the needed dependencies in the module head, and all those workarounds for making up the correspondence between the modules and the objects are not needed anymore.

Let us try to figure out what kind of registry it should be. A module should be able to create an object on that registry, and this object should be accessible by its identifier from any part of the code. I guess you have already figured out what I am implying: we already have this kind of registry. It is the global scope. But everyone knows that using the global scope is a bad practice, isn't it?

Globals

Well, not exactly. Globals are bad when used without any control, by creating a global whenever a variable is needed. But if there were a single globally accessible registry, conventionally named something like LIB, containing the library objects (one per library, each named the same as the library), that would be similar to referring to an exported object by the module path or identifier (which are also global, as explained above).
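A minimal sketch of that convention, assuming a hypothetical library called mylib (the LIB name and the routines are illustrative, not a prescribed API):

// somewhere at startup: the single conventional registry of library objects
var LIB = LIB || {};

// mylib module: instead of exporting, it creates its object on the registry
LIB.mylib = {
    doSomething: function () {
        console.log('mylib did something');
    }
};

// any other module simply refers to the library by its global name
LIB.mylib.doSomething();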

In the export approach there is a convention according to which a module provides an object for export. Following this convention is what makes the exporting approach work. But if a module for some reason does not export an object, the import will simply yield nothing. Storing the library object in that kind of registry under the name of the library is a similar kind of convention.

Dark Future

The upcoming ES6 standard includes a native module concept which also provides the exporting feature. It is a bit more advanced (compared to the simple exporting reviewed in this text), in the sense that it allows a module to export several objects at once and then lets a consumer specify the particular object to be imported from a module. Nevertheless, the library objects are still linked to the module and identified through it, which means that the discussed issues also apply to that approach.
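For reference, a minimal ES6-style sketch of those named exports and imports (file names and routines are made up for the example):

// math.js: a module may export several named objects at once
export function add(a, b) {
    return a + b;
}
export function mul(a, b) {
    return a * b;
}

// main.js: the consumer picks particular objects, but still identifies
// them through the module that exports them
import { add, mul } from './math.js';
console.log(add(2, mul(3, 4)));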

Moreover, instead of simplifying the task of refactoring a library and splitting it into smaller modules, the feature of importing a particular object provided by a module implies that the whole library is located inside a single module. In fact this simply legalizes huge single-module libraries!

Prohibiting implicit globals is just great, but should we really treat this new kind of exporting as 'good practice', or rather as yet another anti-pattern which in fact only consumes the developer's effort to support itself?

Afterword

This text is an attempt to explain the decision on the module format of the Helios Kernel loader. After the library release, I sometimes received feedback complaining that its module format does not allow exporting the objects created by modules. In fact, this is not the case: it is still possible to implement any approach for managing the created objects on top of Helios Kernel (just like the existing solutions are implemented on top of lower-level browser APIs). But this is not necessary: instead, it is suggested to follow a more flexible approach and treat modules not as 'producing objects' but rather as 'performing actions' (so that modules can do, not only make), as described in this text.

Comments and suggestions are welcome

--

You can find me on twitter: https://twitter.com/asvd0

Also check out some of my projects on github (ordered by my impression of their significance):

Helios Kernel: isomorphic javascript module loader

Jailed: a library for sandboxed execution of untrusted code

Lighttest: isomorphic unit-testing library

@andreypopp

But now we have the second way to do the same thing, and the amount and structure of the code needed to set-up the dependencies is just outstanding. Why can't we simply list the dependencies once?

We can:

define(function (require) {
    var dep1 = require('dep1'),
        dep2 = require('dep2'),
        dep3 = require('dep3'),
        dep4 = require('dep4'),
        dep5 = require('dep5'),
        dep6 = require('dep6'),
        dep7 = require('dep7'),
        dep8 = require('dep8');

    ...
});

That's what the simplified CommonJS wrapper is about.

@andreypopp

In the export approach there is a convention according to which the module provides an object for export.

There's no such convention; there are a lot of CommonJS modules which export just a single function, for example.

But if you mean a module that doesn't export anything, then such a module should execute some code for its useful side-effects, and I believe this is actually an anti-pattern, except for a single module which wires all other modules together and starts an application.

@asvd
Author

asvd commented Nov 24, 2013

That's what simplified CommonJS wrapper is about.

Thanks, I misunderstood the idea. The point was to illustrate the demand for making this kind of workaround. But in this case I personally dislike that the dependencies can now be spread throughout the whole module.

There's no such convention, there are a lot of CommonJS modules which exports just a single function, for example.

Here I implied that a function is also a kind of object.

But If you mean that module doesn't export anything then this module should execute some code for some useful side-effects, but I believe this is actually an anti-pattern except for a single module which wires all other modules and starts an application.

An example of a "useful" module mentioned in the text which does not explicitly produce an object could be the "common" module loading several others at once (not for the whole application, but maybe for a single logical part of a library, for instance). Otherwise splitting the code should also be considered an anti-pattern.

But I did not mean that modules should now do whatever they wish; the approach described could be more convenient even for managing a set of modules each producing a single object: it would be more flexible to obtain the object separately from the dependency declaration.

@andreypopp

An example of a "useful" module mentioned in the text which does not explicitly produce an object could be the "common" module loading several others at once (not for the whole application, but maybe for a single logical part of a library, for instance). Otherwise splitting the code should also be considered an anti-pattern.

This type of module would look like this:

module.exports = {
  submodule1: require('./submodule1'),
  submodule2: require('./submodule2'),
  ...
}

@ricardobeat

No. Globals are still terrible (naming conflicts, pollution, implicit coupling, messy testing). Requiring modules for their side-effects is an anti-pattern.

This whole article seems to stem from the fact that you don't like having a lot of require calls at the top of your code. Feel free to do something like this:

function requireAll () {
    Array.prototype.forEach.call(arguments, function (name) {
        global[name] = require(name)
    })
}

requireAll('dep1', 'dep2', 'dep3', 'dep4')

And enjoy the consequences of it. There is plenty of reasoning behind CommonJS and the ES6 modules approach.

@asvd
Author

asvd commented Nov 25, 2013

Requiring modules for their side-effects is an anti-pattern.

The following is proposed: a library declares a single global object named the same as the library, and this object, along with its name, is part of the library API; any other global-scope pollution remains illegal, of course (same as when a CommonJS module declares globals in addition to exporting the data). The points I am curious about in this case are:

  1. What is the technical difference between referring to a library by a module name (the 'require' case) and referring to it by a global name? Both are strings resolved in a global namespace (see the sketch below).
  2. If there is a difference, is it so valuable that we have to deal with this 'feature' upon each refactoring of the module structure?
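A minimal sketch of the comparison in point 1 (the names mylib and LIB are illustrative, not any existing package or API):

// referring to a library by a module name: a string resolved against a
// global registry of modules (the filesystem or a module-id config)
var mylib = require('mylib');
mylib.doSomething();

// referring to it by a global name: a string (property name) resolved
// against a single conventional global object
LIB.mylib.doSomething();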
