misterdjules/post-mortem-debugging-with-promises.md

## post-mortem-debugging-with-promises.md

      
    Raw
  

              post-mortem-debugging-with-promises.md
            
          
    Challenges in using postmortem debugging with promises

Introduction

This document seeks to summarize recent discussions about postmortem
debugging, promises, and using them together. It starts by presenting
what post-mortem debugging use cases are impacted negatively by the usage of
promises. Then it describes some potential solutions that have been tried to
fix these issues and their shortcomings.
The problem

A quick overview of post-mortem debugging

Post-mortem debugging users rely on the convention that uncaught exceptions in
a Node.js program terminates the process, and generate a core file that
preserves the state of the process at the time the uncaught error was
thrown.
The following Node.js program:
function baz() {
  throw new Error('boom');  
}

function bar() {
  baz();  
}

function foo() {
  bar();
}

foo();

will make node abort and generate a core file when run with the --abort-on-uncaught-exception command line option:
$ ulimit -c unlimited # allow core files of unlimited size to be generated
$ node --abort-on-uncaught-exception /tmp/boom.js
Uncaught Error: boom

FROM
baz (/private/tmp/boom.js:2:4)
bar (/private/tmp/boom.js:6:4)
foo (/private/tmp/boom.js:10:4)
Object.<anonymous> (/private/tmp/boom.js:13:1)
Module._compile (module.js:435:26)
Object.Module._extensions..js (module.js:442:10)
Module.load (module.js:356:32)
Function.Module._load (module.js:311:12)
Function.Module.runMain (module.js:467:10)
startup (node.js:134:18)
node.js:961:3
[1]    8291 illegal hardware instruction  node --abort-on-uncaught-exception /tmp/boom.js
$ ls /cores
core.8291 # a core file was just generated for the process with pid 8291
$ 

When examining this core file, it is critical that it represents the state
of the process when the error was thrown. For instance, the functions foo, bar
and baz _must be on the call stack.
Note however that knowing the call stack at the time the error threw is not
sufficient to use the post-mortem debugging methodology. The whole state of
the process, including the state of the heap and internal data structures,
among other things, are required.
It is also worth noting that this behavior is similar to what happens when
not passing --abort-on-uncaught-exception on the command line:
$ node /tmp/boom.js
/private/tmp/boom.js:2
    throw new Error('boom');  
    ^

Error: boom
    at baz (/private/tmp/boom.js:2:10)
    at bar (/private/tmp/boom.js:6:4)
    at foo (/private/tmp/boom.js:10:4)
    at Object.<anonymous> (/private/tmp/boom.js:13:1)
    at Module._compile (module.js:435:26)
    at Object.Module._extensions..js (module.js:442:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:311:12)
    at Function.Module.runMain (module.js:467:10)
    at startup (node.js:134:18)
$ echo $?
1
$ 

The process exited with a non-zero exit code. In other words, post-mortem
debugging only instruments an existing behavior and convention, which is:
any uncaught exception causes a node program to exit. The instrumentation
provided by --abort-on-uncaught-exception is that, in addition to that, it
generates a core file that can be inspected at any time to do post-mortem
debugging.
It is also worth mentioning that uncaught errors thrown synchronously or
asynchronously trigger the same behavior, the only difference being, in the
case of an uncaught error thrown asynchronously, that the call stack doesn't
include the functions that scheduled the callback which eventually threw the
uncaught error. The functions that triggered the uncaught errors are still
present on the call stack though, which is what matters.
For instance, let's consider this sample code:
$ cat /tmp/boom-async.js
function baz() {
  throw new Error('boom');
}

function bar() {
  baz();
}

function foo() {
  bar();
}

function scheduleBoom() {
  setImmediate(function triggerBoom() {
    foo();
  });
}

scheduleBoom();
$ node --abort-on-uncaught-exception /tmp/boom-async.js 
Uncaught Error: boom

FROM
baz (/private/tmp/boom-async.js:2:3)
bar (/private/tmp/boom-async.js:6:3)
foo (/private/tmp/boom-async.js:10:3)
Immediate.triggerBoom [as _onImmediate] (/private/tmp/boom-async.js:15:5)
processImmediate [as _immediateCallback] (timers.js:368:17)
[1]    8441 illegal hardware instruction (core dumped)  node --abort-on-uncaught-exception /tmp/boom-async.js
$ echo $?
132
$ ls /cores 
core.8291 core.8441
$

We can see that the core file has triggerBoom, foo, bar and baz on the
call stack, but not scheduleBoom. This is expected and doesn't make the
generated core files any less useful for post-mortem debugging users.
Fundamental requirements of post-mortem debugging

From this quick overview of post-mortem debugging, we can deduce the following
fundamental properties of Node.js that make post-mortem debugging tools and
processes usable:


Node processes terminate when an uncaught exception is thrown


When node processes terminate due to an uncaught exception, a core file can be
generated by passing a specific command line argument to the node program


A core file generated due to an uncaught exception must preserve the state of
the whole process at the time the error was thrown. This includes preserving
the call stack, but also the heap, and any resource that is associated to a
process (file descriptors table, etc.).


Using post-mortem debugging with promises

Now let's consider some same sample code that throws an uncaught error but
this time from a promise.
function baz() {
  throw new Error('boom');
}

function bar() {
  baz();
}

function foo() {
  bar();
}

var boomPromise = new Promise(function (resolve, reject) {
  foo();
});

Promises can be used in many different ways, each having an impact on the
problems and solutions discussed in this document. This example may seem
contrived, but it's still a valid use case and most importantly it's the
simplest use  case to consider for discussing the problems and potential
solutions. As such, it seems that it's a good starting point to explore the
problem space.
Now let's run this program without --abort-on-uncaught-exception:
$ cat /tmp/boom-promises.js
function baz() {
  throw new Error('boom');
}

function bar() {
  baz();
}

function foo() {
  bar();
}

var boomPromise = new Promise(function (resolve, reject) {
  foo();
});
$ node /tmp/boom-promises.js
$ echo $?
0
$

We can see that, even though the program throws an uncaught error, the process
exits gracefully with the exit code 0. Running the same program with --abort- on-uncaught-exception on the command line produces the same results:
$ node --abort-on-uncaught-exception /tmp/boom-promises.js
$ echo $?
0
$

This behavior is documented by the EcmaScript 2015 standard, and the program
behaves as intended. The problem is that this behavior diverges from the
behavior we've seen so far with regards to uncaught exceptions: instead of
making the process exit, uncaught exceptions thrown from within promises keep
the process running.
In other words, the fundamental post-mortem debugging requirements mentioned
above are not met.
The impact of adding a promises-based API to node's core APIs

The fact that the promises' error handling model has a negative impact on some post-mortem debugging use cases is not a problem in itself. Indeed, post-mortem debugging users could choose to:

Not use promises in their code.
Avoid using modules that use promises.

However, adding an API based on promises to Node.js' core makes it much more
difficult for post-mortem debugging users, and for Node.js users who rely on
the current Node.js' error handling model, to avoid using them or to avoid
using third-party code that uses them.
Potential solutions

Several solutions have already been suggested and tested without sucess. This
section describes the ones that seemed to be the most promising but failed in
a way that lead to valuable feedback in order to come up with other potential
solutions and eventually with a better understanding of the problem.
Using V8's SetPromiseRejectCallback

V8 exposes the SetPromiseRejectCallback API for embedders to register a
function that is called when a promise is "rejected". A promise is rejected,
among other cases, when an uncaught exception is thrown from its resolver or a
then handler.
Shortcomings

The problem with using this API, as described in an earlier GitHub
comment,
is that the function that is registered through this API is called
asynchronously, after all microtasks that had been scheduled have run. As a
result, generating a core file for the process at that point does not satisfy
the fundamental requirement #3 for post-mortem debugging:

A core file generated due to an uncaught exception must preserve the state of
the whole process at the time the error was thrown. This includes preserving
the call stack, but also the heap, and any resource that is associated to a
process (file descriptors table, etc.).

Indeed, between the time the uncaught error was thrown and the
PromiseRejectCallback is called, any of the following things can happen:


Any JS object can be garbage collected.


The stack changed, and the data that was stored in it at the time the error
was thrown may have been partially overwritten.


Any of the process' internal data structures may have changed.


Removing implicit try/catch blocks from V8's promises implementation

Technical details

The reason why uncaught errors thrown from promises do not make a node process
terminate is that promises implicitly catch all exceptions. This is done in
several places in V8's implementation:

When a promise's resolver is called.
When a promise is "handled".
When a microtasks' queue's task is executed.

The changes made to node (and V8) to test this potential solution are
available in a branch of my personal fork of Node.js.
The approach taken by this potential solution is to not perform these
operations from within try/catch blocks when:

The --abort-on-uncaught-exception command line option is passed to a node program.
There is no catch handler at the time the code runs.

Point 2) is implemented by calling PromiseHasUserDefinedRejectHandler
when a promise's resolver is called and
when a promise is handled.
Problems solved

This solution solves some of the problems mentioned in the introduction. For
instance, let's consider the first code sample from "Using post-mortem
debugging with promises":
function baz() {
  throw new Error('boom');
}

function bar() {
  baz();
}

function foo() {
  bar();
}

var boomPromise = new Promise(function (resolve, reject) {
  foo();
});

Running this program with --abort-on-uncaught-exception will make the
process abort at the time the error is thrown.
Moreover, the following test:
function foo() {
  throw new Error('boom');
}

function boom() {
  foo();
}
var boomPromise = new Promise(function (resolve, reject) {
  return resolve(42)
}).then(function() {
  boom();
}).catch(function() {
  console.log('caught');
});

doesn't make the process exit or abort (depending on whether --abort-on- uncaught-exception is used) because there's a catch handler. This is what
post-mortem debugging users expect and it solves the problem for this use
case.
Other tests are included with that change which make sure other common use
cases work as expected. However, it is very likely that not every possible
use case is covered. That change is more a proof of concept than a definitive
solution.
Shortcomings

False positives (or use cases when a node program aborts when it should not)

The following code:
function foo() {
  throw new Error('boom');
}

function boom() {
  foo();
}
var boomPromise = new Promise(function (resolve, reject) {
  boom();
}).catch(function() {
  console.log('caught');
});

aborts when run with --abort-on-uncaught-exception, even though the error
should be caught by the catch handler.
Fundamental design issues

Even though that solution seems to solve a lot of the problems mentioned in
the introduction, there is at least one fundamental problem with this
approach: it changes the semantics of throwing an uncaught error depending on
whether --abort-on-uncaught-exception is used.
Developers are not able to know, at the time they write the code that throws
uncaught exceptions from within promises, whether they'll make the process
exit (or abort) or if they'll be implicitly caught.
@chrisdickinson however made an excellent point when he mentioned that, since
any custom unhandledRejection event listener can be registered on the
process object, including one that aborts on any rejection, promises users may
already not be able to make any assumption about how unhandled rejection will
be handled by a node program.
Aborting if a promise is on the stack at the time Isolate::Throw is called

This approach was suggested by @vkurchatkin in a comment to nodejs/post-mortem#16.
Here's the patch that implements it:
diff --git a/deps/v8/src/isolate.cc b/deps/v8/src/isolate.cc
index 04198bb..575beb6 100644
--- a/deps/v8/src/isolate.cc
+++ b/deps/v8/src/isolate.cc
@@ -1001,8 +1001,21 @@ Object* Isolate::Throw(Object* exception, MessageLocation* location) {
     debug()->OnThrow(exception_handle);
   }

+  Handle<Object> promise = GetPromiseOnStackOnThrow();
+  bool possible_uncaught_promise = false;
+
+  if (promise->IsJSObject()) {
+    Handle<JSObject> jspromise = Handle<JSObject>::cast(promise);
+
+    Handle<Object> has_reject_handler;
+    ASSIGN_RETURN_ON_EXCEPTION_VALUE(
+        this, has_reject_handler,
+        debug()->PromiseHasUserDefinedRejectHandler(jspromise), nullptr);
+    possible_uncaught_promise = has_reject_handler->IsFalse();
+  }
+
   // Generate the message if required.
-  if (requires_message && !rethrowing_message) {
+  if ((requires_message && !rethrowing_message) || possible_uncaught_promise) {
     MessageLocation computed_location;
     // If no location was specified we try to use a computed one instead.
     if (location == NULL && ComputeLocation(&computed_location)) {
@@ -1025,7 +1038,8 @@ Object* Isolate::Throw(Object* exception, MessageLocation* location) {
       // or if the custom callback determined that V8 should abort, then
       // abort.
       if (FLAG_abort_on_uncaught_exception &&
-          PredictExceptionCatcher() != CAUGHT_BY_JAVASCRIPT &&
+          (PredictExceptionCatcher() != CAUGHT_BY_JAVASCRIPT ||
+            possible_uncaught_promise) &&
           (!abort_on_uncaught_exception_callback_ ||
            abort_on_uncaught_exception_callback_(
                reinterpret_cast<v8::Isolate*>(this)))) {

All requirements presented in the section entitled "Fundamental requirements
of post-mortem debugging" are met by this approach.
Potential issues

Inability to catch uncaught exception from promises

As presented in another comment in the same GitHub
issue,
the following code aborts when run with that proposed change:
function bar() {
  throw new Error('boom');
}

function foo() {
  bar();
}

function boom() {
  return new Promise(function() {
    foo();
  });
}

try {
  boom();
} catch (e) {
  console.log('caught');
}

It's not clear if that's a problem: the promises spec seems to mean that no
uncaught exception can bubble up from a promise, and thus changing that
behavior when running node with --abort-on-uncaught-exception (or any other
flag that would enable abort on a synchronous uncaught exception raised from
within a promise) might not make sense.
Inability to catch uncaught exception from promises "awaited" with await

Similarly, it seems that the following code would make node abort:
function bar() {
  throw new Error('boom');
}

function foo() {
  bar();
}

function boom() {
  return new Promise(function() {
    foo();
  });
}

try {
  await boom();
} catch () {
}

It's still not clear whether awaited promises should be able to bubble up
uncaught exceptions, and thus this may not be an issue. Other resources on the
web seem to use try/catch blocks to catch exceptions from awaited functions
though. See for instance http://pouchdb.com/2015/03/05/taming-the-async-beast-with-es7.html
and https://jakearchibald.com/2014/es7-async-functions/, which
both contain sample code that use try/catch blocks with awaited promises.
Relevant work done in other JavaScript hosting environment

Chrome

Chrome seems to have had similar
issues with regards to
how promises' unhandled errors (generated both by throw or by an explicit
rejection) should be handled by chrome.
Conclusion

The third experiment proposed by @vkuratchkin seems to be very promising but
some questions still remain to determine whether this could be an acceptable
solution for promise and post-mortem debugging users.