@piscisaureus
Created February 10, 2012 02:53
domains

Preface

Over the past couple of months I have been thinking about domains, but in the end I got stuck. This is basically a summary of my ideas. There are many conceptual problems with them, and I'd like your feedback. Maybe together we can work out something nice.

The problem

Node has this awkward problem that if an error event goes unhandled or an exception is not caught, node exits. This is undesirable. In many cases it would be fine to just stop some "operations", report the failure somewhere, and continue all unrelated work. Take a simple website: if an error occurs in template rendering, we should close the incoming http connection, and possibly also any open file handles and database queries. Node should be able to do this automatically; it is only needed when the user doesn't handle a particular error, so we cannot rely on the user to configure domains properly.

The general idea

The general idea is to associate i/o, timer and nextTick related callbacks with a particular domain. Each of these generates a transition from the event loop into javascript code, and on every such transition we set the active domain. When an uncaught exception happens within a domain, all of its i/o handles are closed and all of its nextTick and timer callbacks are cancelled. By default, callbacks are attached to the currently active domain, so:

// We run this in domain A
setTimeout(function() {
  // This also happens in domain A
}, 100);

This means that we can track most of the stuff that needs to be cleaned up when a domain "dies".

// Inside domain A
setTimeout(function() {
  throw "ohno";
}, 100);

setTimeout(function() {
  // This never runs because it is cancelled on throw.
}, 200);
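
To make the bookkeeping concrete, here is a minimal sketch of the idea for timers only. `ToyDomain` and its `run`, `setTimeout` and `kill` methods are hypothetical names invented for this illustration; they are not the proposed node API, and a real implementation would hook into the event loop rather than wrap `setTimeout`.

```javascript
const EventEmitter = require('events');

// ToyDomain is a hypothetical illustration, not the proposed node API.
class ToyDomain extends EventEmitter {
  constructor() {
    super();
    this.timers = new Set(); // timers that are still pending in this domain
    this.dead = false;
  }

  // Run fn "inside" the domain: a throw kills the domain, not the process.
  run(fn) {
    if (this.dead) return;
    try {
      fn();
    } catch (err) {
      this.kill(err);
    }
  }

  // Like the global setTimeout, but the timer is owned by this domain.
  setTimeout(fn, ms) {
    const timer = setTimeout(() => {
      this.timers.delete(timer);
      this.run(fn);
    }, ms);
    this.timers.add(timer);
    return timer;
  }

  // Cancel everything the domain still owns, then report the error.
  kill(err) {
    this.dead = true;
    for (const timer of this.timers) clearTimeout(timer);
    this.timers.clear();
    this.emit('error', err);
  }
}

const log = [];
const a = new ToyDomain();
a.on('error', (err) => log.push('domain A died: ' + err.message));
a.setTimeout(() => { throw new Error('ohno'); }, 10);
a.setTimeout(() => log.push('never runs'), 20); // cancelled when the first timer throws
```

The second timer never fires because `kill` clears it when the first one throws. The same pattern would have to cover nextTick callbacks and i/o handles, which is where most of the real work lies.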

Unfortunately we cannot clean up any (global) state that was created by a domain. We think that is fine - global state is typically rare and people should just attach a listener that cleans up global state when a domain dies.

// We run this in domain A
global.foo = 'bar';
setTimeout(function() {
  // This also happens in domain A
  throw "failure";
  delete global.foo; // This never happens, so we leak global.foo
});

Creating a domain - simple case

Creating a domain can be as simple as running something inside a closure.

var net = require('net'),
    fs = require('fs');

var domain = require('domain').create(function() {
   var conn = net.createConnection(80, 'www.google.com');
   var file = fs.createReadStream('somefile');
   file.pipe(conn);
});
domain.on('error', function() {
  // This doesn't run in the domain itself, but rather in the 'parent' domain,
  // which is the domain that was active when `domain.create` was run.
  // Clean up remaining global state here.
});
domain.on('end', function() {
  // This is always called - exactly once, when a domain ends.
  console.log("we're done!");
});

Cross-domain access

It gets hairy when we consider cases where people might need to access an i/o object (e.g. a Stream or Server object) from a domain other than the one it was created in. We could choose to disallow that, but that would probably make a lot of things impossible.

Incoming connections

A typical example would be an incoming connection. There is no way to create the connection inside a domain - it is created by net.Server and thus will inherit its domain from the server object.

// In domain A
http.createServer(function(req, res) {
  // We're still in domain A. Now we want to do some stuff with the response object.
  var d = domain.create(function() {
    // We're now in domain B.
    res.write("HELLO", function writecb() {
      // There is no way to tell that `writecb` was created in domain B.
      // The only thing we know is that we are in a callback that comes from i/o
      // handle `res`, which is in domain A. So this will kill domain A.
      throw "omg";
    });
  });
  d.on('error', function() {
    // It would be nice if we could do this:
    res.writeHead(500);
  });
});

As you can see, it is probably undesirable to allow accessing an i/o object from a foreign domain. If you do something that takes a callback, the callback would - unexpectedly - run in the domain that created the i/o object, instead of the domain that created the callback.

Shared connection pool

Suppose we create a database library that lazily creates connections to the database. Let's say we do two queries, one from domain A and another from domain B. This must end badly! If we allow people to access objects that are not in their own domain, weird stuff happens. If the query succeeds, the result callback will be run in domain A - even if the query was posted from domain B! On the other hand, if we don't allow cross-domain access then query B will just throw, because it's trying to access the shared connection from the "wrong" domain.

var db = require('driver'),
    domain = require('domain');

domain.create(function() {
  // In domain A.
  db.query("SELECT foo FROM bar", function(result) {
    // In domain A
  });
});

domain.create(function() {
  // In domain B.
  db.query("SELECT foo FROM baz", function(result) {
    // In domain A(!!!)
  });
});
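
The effect can be reproduced without any real i/o. In the sketch below everything is a hypothetical stand-in: `current` plays the role of the active domain, `enter` plays the role of `domain.create(fn)`, and `query` is a fake driver that lazily opens one shared connection. As in the scenario above, the connection belongs to whichever domain was active when it was created, and every callback it delivers runs in that domain.

```javascript
// Hypothetical stand-ins, for illustration only.
let current = 'root'; // name of the domain that is active right now

// Run fn with `name` as the active domain, then restore the previous one.
function enter(name, fn) {
  const previous = current;
  current = name;
  try {
    fn();
  } finally {
    current = previous;
  }
}

// A fake driver with one lazily created, shared connection.
let shared = null;
function query(sql, cb) {
  if (shared === null) {
    // The connection is owned by the domain that happened to create it.
    shared = { owner: current, pending: [] };
  }
  shared.pending.push({ sql: sql, cb: cb });
}

// Simulate the event loop delivering results: callbacks run in the
// domain of the i/o object, not in the domain that issued the query.
function flush() {
  const jobs = shared.pending.splice(0);
  for (const job of jobs) enter(shared.owner, job.cb);
}

const seen = [];
enter('A', () => {
  query('SELECT foo FROM bar', () => seen.push('first query ran in ' + current));
});
enter('B', () => {
  query('SELECT foo FROM baz', () => seen.push('second query ran in ' + current));
});
flush();
console.log(seen); // [ 'first query ran in A', 'second query ran in A' ]
```

Which domain ends up owning the connection depends only on which query happens to run first, so the outcome is also order-dependent.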

Strawman proposals

Here are some proposals that are worth considering. Please help identify drawbacks and edge cases that need to be addressed.

Strawman: only allow messaging

Don't allow access to i/o objects from foreign domains. Allow only messaging between domains.

// Domain A
http.createServer(function(req, res) {
  var d = domain.create(function() {
    // Domain B. 
    d.onReceive('input', function(data) {
      // Also in domain B.
      process.nextTick(function() {
        // Also in domain B
        mumbojumbo(data, function(result) {
          // Also in domain B
          d.emit('output', result);
        });
      });
    });
  });
  req.on('data', function(data) {  
    // In domain A
    d.send('input', data);
  });
  d.on('output', function(data) {
    // Also in domain A
    res.write(data);
  });
  d.on('error', function() {
    // Error in the domain. Too bad.
    res.writeHead(500);
  });
  res.on('error', function() {
    // We need to handle this. `res` is associated with domain A, so an
    // unhandled error could bring everything down.
  });
});
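
A toy version of the messaging boundary might look like the sketch below. `createMessagingDomain` is a hypothetical helper, and unlike the strawman it hands the code inside the domain a separate `inside` object instead of reusing `d`. Delivery is synchronous only to keep the sketch short, which means a throw in the parent's 'output' handler would be blamed on the child domain; a real implementation would have to cross the boundary through the event loop.

```javascript
const EventEmitter = require('events');

// Hypothetical helper; the names send/onReceive/emit mirror the strawman.
function createMessagingDomain(body) {
  const handle = new EventEmitter(); // the parent's view of the domain
  const inbox = new EventEmitter();  // only code inside the domain listens here
  let dead = false;

  // Run fn inside the domain; a throw ends the domain and is reported outside.
  function guarded(fn) {
    if (dead) return;
    try {
      fn();
    } catch (err) {
      dead = true;
      inbox.removeAllListeners();
      handle.emit('error', err);
    }
  }

  // Parent -> domain. Messages sent to a dead domain are dropped.
  handle.send = (name, data) => guarded(() => inbox.emit(name, data));

  const inside = {
    onReceive: (name, listener) => inbox.on(name, listener),
    // Domain -> parent.
    emit: (name, data) => handle.emit(name, data),
  };

  guarded(() => body(inside));
  return handle;
}

const results = [];
const d = createMessagingDomain((inside) => {
  inside.onReceive('input', (data) => {
    if (data === 'bad') throw new Error('cannot handle ' + data);
    inside.emit('output', data.toUpperCase());
  });
});
d.on('output', (data) => results.push(data));
d.on('error', (err) => results.push('error: ' + err.message));

d.send('input', 'hello');
d.send('input', 'bad');     // kills the domain
d.send('input', 'ignored'); // dropped, the domain is dead
console.log(results); // [ 'HELLO', 'error: cannot handle bad' ]
```

Because the parent only ever holds `handle`, it never touches an i/o object owned by the child, which is the property this strawman is after.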

Strawman: callback tracking

TBD (not very good)

Your proposal here?

@isaacs

isaacs commented Feb 10, 2012

Typo: fifth code block, line // We're now in domain A. should be // We're now in domain B.

What about explicit callback wrapping and object ownership? This is actually not so bad, I think.

// in domain A
http.createServer(function (req, res) {
  var b = domain.create(function () {
    res.write("HELLO", b.bind(function writecb() {
      throw "this kills domain B, not A"
    }))
  })
  b.on("error", function (er) {
    console.error("Error encountered", er)
    // res.writeHead(500), res.destroy, etc.
    cleanup(res)
  })
  // treat these EE objects as if they are a part of the b domain
  // so, an "error" event on them propagates to the domain, rather
  // than being thrown.
  b.add(req)
  b.add(res)
})

I think that something like this, if it could be wise about actually shutting down connections, killing outstanding libuv threads, etc., would be very useful. Kind of like substack's node-toss, but with the proper low-level hooks.

@isaacs

isaacs commented Feb 11, 2012

Or, even, what about something like this?

// in domain A
http.createServer(function (req, res) {
  var b = domain.create();
  res.write("HELLO", b.bind(function writecb() {
    throw "this kills domain B, not A"
  }))
  b.on("error", function (er) {
    console.error("Error encountered", er)
    // res.writeHead(500), res.destroy, etc.
    cleanup(res)
  })
  // treat these EE objects as if they are a part of the b domain
  // so, an "error" event on them propagates to the domain, rather
  // than being thrown.
  b.bind(req)
  b.bind(res)
})

So, the b.bind function wraps either a cb function, or an event emitter. Error events on a domain-bound EE, or throws from a domain-bound cb, cause the domain to error. Maybe event handlers on a domain-bound EE are like domain-bound callbacks.

An interesting edge case would be if an event handler is already bound to a different domain than the EE.
