@ritch
Last active August 29, 2015 14:01

Lessons Learned from Stateful Modules

Background

We forked the jugglingdb module to create LoopBack's ORM. Jugglingdb allows you to define models and the relations between them. One of the fundamental mechanisms it uses to do this is the type registry.

You can see this implemented in the original jugglingdb project:

exports.Schema = Schema;

// ...

Schema.types = {};
Schema.registerType = function (type) {
    this.types[type.name] = type;
};

What's wrong with this?

At first glance this approach seems fine. Types are registered against the Schema constructor. What's the big deal? The issue is that every module that contributes a type to the registry must hold a reference to the same Schema.types object, and Node doesn't guarantee that every module will get the same instance of the jugglingdb module.
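The failure mode can be reproduced without npm at all. The sketch below is a minimal simulation, not jugglingdb's code: loadSchemaModule is a hypothetical helper that stands in for Node evaluating schema.js once per installed copy, and GeoPoint is a made-up type.

```javascript
// Hypothetical helper: each call simulates Node evaluating schema.js
// for one installed copy of the module.
function loadSchemaModule() {
  function Schema() {}
  Schema.types = {};
  Schema.registerType = function (type) {
    this.types[type.name] = type;
  };
  return { Schema: Schema };
}

// Two copies, as if loaded from two different node_modules directories.
var copyA = loadSchemaModule();
var copyB = loadSchemaModule();

// A made-up type, registered against the first copy only.
function GeoPoint() {}
copyA.Schema.registerType(GeoPoint);

var seenByA = 'GeoPoint' in copyA.Schema.types;
var seenByB = 'GeoPoint' in copyB.Schema.types;
console.log(seenByA, seenByB); // true false
```

The type registered through one copy is invisible to code that holds the other copy, which is the situation a contributed type ends up in when the registry lives on the constructor.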

Node caches the exports object of the schema.js module. The code in schema.js runs once each time Node creates a new cache entry for it. The cached exports object is reachable through both require.cache and require().

require('jugglingdb').Schema
console.log(require.cache);

output:

...
'/tmp/test/node_modules/jugglingdb/lib/schema.js':
   { id: '/tmp/test/node_modules/jugglingdb/lib/schema.js',
     exports: { Schema: [Object] }
 ...

The important thing to remember is that this does not mean you will get the same Schema object every time you require('jugglingdb'). When you call require('module'), Node resolves the module's absolute file path. This path is used as the id of the module's entry in the require.cache.

A good rule of thumb: if two require() calls resolve to the same file path, they return the same exports object. As long as you can be certain which file path a given require() resolves to, you can be sure you will get the object you expect.
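The rule of thumb can be checked directly. This is a small sketch that writes the same source to two throwaway files; the file names first.js and second.js and the temporary directory prefix are made up for illustration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write identical module source to two different paths (hypothetical names).
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
var source = 'exports.types = {};';
var first = path.join(dir, 'first.js');
var second = path.join(dir, 'second.js');
fs.writeFileSync(first, source);
fs.writeFileSync(second, source);

// Same resolved path: the cached exports object is returned again.
var samePath = require(first) === require(first);

// Different resolved paths: two cache entries, two exports objects,
// even though the source text is identical.
var differentPath = require(first) === require(second);

// require.cache is keyed by the resolved file name.
var cachedUnderPath =
  require.cache[require.resolve(first)].exports === require(first);

console.log(samePath, differentPath, cachedUnderPath); // true false true
```

Identical source is not enough for Node to share a module. Only the resolved path decides which cache entry a require() call hits.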

Consider the following setup using npm to install two modules. If your application depends on module a and on module b, and module b also depends on module a, you can end up with 2 instances of module a in the require.cache, even if both copies are the same version. This is because module a exists in two different node_modules directories.

The npm dedup command simplifies your dependency tree so that a dependency is not loaded into the cache more than once. It is only a workaround for the jugglingdb issue, though, since deduping drastically changes the structure of the require.cache.

Consider the following contrived application:

// module-b
var jugglingdb = require('jugglingdb');
var moduleA = require('module-a'); // depends on jugglingdb

Suppose both module-a and module-b depend on jugglingdb via package.json. After npm install, the require.cache would include two instances of the Schema.types object. After npm dedup, it would include only one, shared by both modules.
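The difference between the nested and the deduped layout can be sketched by building both trees by hand in a temporary directory. Everything here is an assumption made for illustration: registry is a stand-in package for jugglingdb, module-a is a stub that re-exports it, and the layouts are written directly rather than produced by npm.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a file, creating parent directories as needed.
function write(file, source) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, source);
}

// Build a throwaway node_modules tree. "registry" stands in for jugglingdb.
// With nested === true, module-a carries its own private copy of registry.
function buildProject(nested) {
  var root = fs.mkdtempSync(path.join(os.tmpdir(), 'dedupe-demo-'));
  var modules = path.join(root, 'node_modules');
  write(path.join(modules, 'registry', 'index.js'), 'exports.types = {};');
  write(path.join(modules, 'module-a', 'index.js'),
    'exports.registry = require("registry");');
  if (nested) {
    write(path.join(modules, 'module-a', 'node_modules', 'registry', 'index.js'),
      'exports.types = {};');
  }
  return modules;
}

// Does the top-level copy match the one module-a resolved for itself?
function sharesRegistry(modules) {
  var top = require(path.join(modules, 'registry'));
  var viaModuleA = require(path.join(modules, 'module-a')).registry;
  return top === viaModuleA;
}

var nestedShares = sharesRegistry(buildProject(true));
var flatShares = sharesRegistry(buildProject(false));
console.log(nestedShares, flatShares); // false true
```

In the nested tree, module-a finds the copy in its own node_modules directory first, so the two require() calls resolve to different paths. In the flat tree, both resolve to the single top-level copy.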

A better approach

As a module developer, you should assume that your module may exist multiple times in the require.cache, and you should avoid requiring data to be shared across those module instances. Otherwise, you have to tell your users to run npm dedup when using your module. Keep in mind that anyone who depends on your module will have to tell their users to do this as well. If your module appears anywhere in a user's dependency tree, that user has to run dedup to be sure there is only one instance of it in the require.cache.

Instead of using the constructor, you should prefer storing the state on the instance itself.

function Schema() {
	// ...
	this.types = {};
	// ...
}

Schema.prototype.registerType = function (type) {
    this.types[type.name] = type;
};
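A usage sketch of the instance-based registry, under the assumption that code contributing types is handed the Schema instance it should register on. registerGeoTypes and GeoPoint are hypothetical names, not part of jugglingdb.

```javascript
function Schema() {
  // Each instance owns its registry.
  this.types = {};
}

Schema.prototype.registerType = function (type) {
  this.types[type.name] = type;
};

// Hypothetical plugin: it registers on whatever instance it is given,
// so it never needs to require() the Schema module itself.
function registerGeoTypes(schema) {
  function GeoPoint() {}
  schema.registerType(GeoPoint);
}

var schema = new Schema();
var other = new Schema();
registerGeoTypes(schema);

var registered = Object.keys(schema.types);
var otherRegistered = Object.keys(other.types);
console.log(registered, otherRegistered); // [ 'GeoPoint' ] []
```

Because the plugin only touches the object it receives, it does not matter which copy of the module created that instance.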

Now there is no dependency on the actual Schema constructor's state. The downside is that this doesn't support the same behavior as before. With the previous implementation, you could register a type without a Schema instance.

require('jugglingdb').Schema.registerType(...);

This can be preserved by making the Schema class a true singleton or using a global.

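// singleton
if (global.Schema) {
  // Another copy of this module already ran: reuse its constructor.
  // (A top-level return is allowed inside a CommonJS module.)
  exports.Schema = global.Schema;
  return;
}
// ...
exports.Schema = global.Schema = function Schema(/*...*/) {
  // ...
};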

// ...or global
Schema.types = global.__schemaTypes || (global.__schemaTypes = {});
Schema.registerType = function (type) {
  this.types[type.name] = type;
};
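The global-backed registry can be checked with the same kind of simulation. This is a sketch only: loadSchemaModule is a hypothetical helper standing in for Node evaluating the module once per installed copy, and GeoPoint is a made-up type.

```javascript
// Hypothetical helper: each call simulates one installed copy of the
// module being evaluated, with the registry stored on the global object.
function loadSchemaModule() {
  function Schema() {}
  Schema.types = global.__schemaTypes || (global.__schemaTypes = {});
  Schema.registerType = function (type) {
    this.types[type.name] = type;
  };
  return { Schema: Schema };
}

var copyA = loadSchemaModule();
var copyB = loadSchemaModule();

// A made-up type, registered through the first copy only.
function GeoPoint() {}
copyA.Schema.registerType(GeoPoint);

var sameRegistry = copyA.Schema.types === copyB.Schema.types;
var seenByB = copyB.Schema.types.GeoPoint === GeoPoint;
var sameConstructor = copyA.Schema === copyB.Schema;
console.log(sameRegistry, seenByB, sameConstructor); // true true false
```

With this variant the registry is shared, but each copy still has its own Schema constructor, so instanceof checks across copies will not match. The singleton variant shares the constructor as well.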