Skip to content

Instantly share code, notes, and snippets.

@benjamingr
Created February 11, 2016 18:58
Show Gist options
  • Save benjamingr/3fb97eda723ff2a030d1 to your computer and use it in GitHub Desktop.
Save benjamingr/3fb97eda723ff2a030d1 to your computer and use it in GitHub Desktop.

Post Mortem Debugging in NodeJS in the Light of Promises

Context:

we're discussing how we can safely abort on unhandled rejections and obtain meaningful debugging information. Related reading: https://gist.github.com/misterdjules/2969aa1b5e6440a7e401#file-post-mortem-debugging-with-promises-md.

In particular, we're discussing https://gist.github.com/misterdjules/2969aa1b5e6440a7e401#removing-implicit-trycatch-blocks-from-v8s-promises-implementation

Othe recommended reading: On unhandledRejection https://gist.github.com/benjamingr/0237932cee84712951a2

###Idea

We add a --abort-on-sync-unhandled-rejection flag to Node. That flag's goal is to give meaningful core dumps in Node while letting the vast majority of promise users remain unaffected. This is related to the issue of promisifying core-apis but it is aimed at also solving the problem for other users.

That flag aborts the process and generates a core dump whenever a promise is rejected and that rejection is not recovered from - similarly to --abort-on-uncaught-exceptions.

When is a promise rejection not handled.

The problem is that we never "know for sure" when a promise has rejected and that rejection cannot be recovered form. We need to make a reasonable compromise on when a promise has rejected and no catch handlers will be attached to it.

  • Using the unhandledRejection event is unfit, like @misterdjules and @chrisdickinson have said - the dump is much less meaningful at this point. It triggers after the microtask queue has been depleted - this eliminates some false positives, but not all.
  • Assuming promise errors will not be recovered from is also unfit - since in practice promise code throws liberally for operational errors.

A compromise can be reached by dumping the core if no rejection handlers have been attached up until the point of the error thrown. Since code in then blocks always runs asynchronously this means the vast majority of use cases are covered. We also have a meaningful core at this point to dump since we're actually during running the callbacks.

Promise.resolve().then(() => {
   throw new Error(); 
}).catch(e => { // since the catch handler is added before the `then` executes, this is fine.
   // ...
});
var p = Promise.resolve().then(() => {
   throw new Error(); 
});

p.catch(e => { // since the catch handler is added before the `then` executes, this is fine. No false positive
   // ...
});
var p;
Promise.resolve().then(() => {
   throw new Error(); 
}).

setTimeout(() => p.catch(e => { // this aborts, like Chris said, code can always be written to avoid this case.
   // ...
}));
var p;
Promise.resolve().then(() => {
   throw new Error(); 
}).

process.nextTick(() => p.catch(e => { // this aborts, like Chris said, the "unhandledRejection" event avoids this problem
                                     
   // ...
}));

While personally in my code I don't use unhandledRejection for anything except actual errors which means I can throw in it (causing an abort) - this could definitely break some peoples' flow.

The flag terminates the process with a heap dump on any rejection on a promise if the rejection handler is not added synchronously. This effectively means "has not been already added" since then callbacks run after the catch handlers have been added if those are added synchronously. This requires a little stricter guarantee than the slack we give in unhandledRejection, in practice I just checked in my code and my code and some packages I use behave well.

There is a flow described in the original post here: groundwater/nodejs-symposiums#5 (comment) that also shows how it can be used to provide post-mortem with responsible aborting (nodejs/post-mortem#18)

Advantages:

  • Terminating it synchronously means we get a decent heap dump, post-mortem people are happy.
  • Promise code isn't changed, promises users are happy.
  • Provides a way to write highly available NodeJS code, fixing the issue described in nodejs/post-mortem#18 .
  • Very small API surface, a single flag is added.

Disadvantages:

  • "Misbehaving" promise code can't be used, this flag requires people to write their promise code in a certain way if they want post-mortem debugging.
    • It might be possible to mitigate that through starting to warn on unhandledRejection although that won't always fix it.
  • Doesn't give us post-mortem debugging for every request but rather only the first, I think this is the way forward since otherwise you're exposed to a DOS attack.
  • The promise constructor would have to be wrapped and throws there will be converted to aborts rather than rejections with the flag on. In practice throwing synchronously in the promise constructor isn't common and is typically the sort of programmer error that post-mortem analysis wants to catch - so this is acceptable.

The idea is based on work by @misterdjules @chrisdickinson and @groundwater as well as previous work by numerous people.

@benjamingr
Copy link
Author

I'd really like to see "promise people" like Domenic participate here by the way. I think alienating them and shaming them publicly was a very unfortunate choice and made the project (Node) look worse than I believe we are. I'd like to see how we can move past that and reconnect with those people. Their input here is vital, this is a problem domain not that many people understand.

@misterdjules
Copy link

Thanks for sharing that! It seems -- apart from the name of the command line flag -- what I implemented in misterdjules/node-1@d9f9b0e meets all the requirements described in this document.

One concern I described about my implementation is exactly what you describe in:

The promise constructor would have to be wrapped and throws there will be converted to aborts rather than rejections with the flag on. In practice throwing synchronously in the promise constructor isn't common and is typically the sort of programmer error that post-mortem analysis wants to catch - so this is acceptable.

I'm still torn about this, mainly because I'm not a user of promises (yet), and so I can't get a feeling for how big an issue that would be.

@chrisdickinson
Copy link

Going to bring this back to the PR thread since Gist notifications are a little nonexistent :)

@stefanpenner
Copy link

The promise constructor would have to be wrapped and throws there will be converted to aborts rather than rejections with the flag on. In practice throwing synchronously in the promise constructor isn't common and is typically the sort of programmer error that post-mortem analysis wants to catch - so this is acceptable.

I'm not on-board with this approach, but I believe ^^ actually intends to be:

Throw's in Promise constructors who have no followers (or no possible catch handler at the time of the throw) are to trigger these new `aborts'. Since if they have followers, they can handle the errors.

These changes smell more and more of the old jQuery promises, which caused havoc in failure scenarios. Essentially rendering them useless.

@stefanpenner
Copy link

It is worth drawing attention to related research/work ideas to post-mortem view of (single or multi actor) asynchronous systems:

I suspect with some effort, a more comprehensive (and non-compromising) approach is possible. Obviously, that comes at the cost of time + effort. But I wouldn't rule it out.

@benjamingr
Copy link
Author

@stefanpenner note that this is only the promise constructor and not inside chains. The reason is that the promise constructor executes synchronously.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment