Question: suppose we wanted to let users mostly deal with individual exception objects instead of ExceptionGroups, and we want to do that without losing the rich traceback metadata. (Tree structure, notes about places where exceptions crossed between tasks, ...) What would that look like?
The big challenge is that in current trio.MultiError
, if you want to extract an individual exception and make it a proper standalone object, you have to mutate its __traceback__
attribute to capture the info that's otherwise only present in the MultiError
object. (And this in turn is the main reason I've argued for except ValueError as exc
being run once with a single ExceptionGroup
: if we iterate, then we have to extract individual exceptions. If we wrap in a new ExceptionGroup
, then we never have to mutate any exception objects in place.) So if we can solve this mutation problem, then it unlocks a lot of new possibilities.
Requirements would be:
-
Given an individual exception
exc
, you can print its traceback. -
Given several individual exceptions
exc1
,exc2
, ..., you can reconstruct the joint traceback tree -
Extracting an individual exception from a group should not require mutating the exception object.
-
During exception propagation, appending frames to the traceback needs to be super cheap (ideally O(1) for an arbitrary-sized exception group). Most tracebacks are never examined at all, so it's OK if printing them requires some extra work as long as creating them is as cheap as possible.
-
Leaf exceptions don't hold strong references to other leaf exceptions (e.g. you can't do
exc1.__traceback__.something.children[0]
to accessexc2
.
Right now, tracebacks are stored as a single-linked list where the head is the newest frame:
@dataclass
class Traceback:
tb_frame: Frame
tb_next: Optional[Traceback]
...
# Used as:
#
# exc -> outermost Traceback -> Traceback -> ... -> innermost Traceback
# ^
# \ new Traceback objects inserted here
def push_frame(exc, frame):
exc.__traceback__ = Traceback(tb_frame=frame, tb_next=exc.__traceback)
This is a super elegant representation and makes pushing new frames onto the traceback cheap, but it can't efficiently represent traceback trees, where a single "outer" frame may have multiple "inner" frames. If you wanted to use this representation while propagating N exceptions, you'd need to create N duplicate Traceback
objects for each frame:
# exc1 -> outermost Traceback -> ... -> innermost Traceback
# exc2 -> outermost Traceback -> ... -> innermost Traceback
# ...
def push_frame(excs, frame):
for exc in excs:
exc.__traceback__ = Traceback(tb_frame=frame, tb_next=exc.__traceback__)
This is expensive, and also makes it hard to reconstruct the tree, because we can't tell which duplicated Traceback
objects are supposed to represent the "same" point in the tree.
OTOH, what if we switch it around, and make tracebacks a singly-linked list from inner->outer?
@dataclass
class Traceback:
tb_frame: Frame
tb_outer: Optional[Traceback]
...
@dataclass
class BaseException:
# Head pointer, used to access the full traceback
__traceback_innermost__: Optional[Traceback]
# Tail pointer, only used for pushing new frames onto the traceback
__traceback_outermost__: Optional[Traceback]
@property
def __traceback__(self) -> Optional[Traceback]:
# reconstruct old-style traceback for backwards compat
...
...
# Used as:
#
# exc -> innermost Traceback -> Traceback -> ... -> outermost Traceback
def push_frame(exc, frame):
new_tb = Traceback(tb_frame=frame, tb_outer=None)
exc.__traceback_outermost__.tb_outer = new_tb
exc.__traceback_outermost__ = new_tb
Now when raising multiple exceptions at once, we can be clever:
# Pseudocode for the implementation of 'raise *(exc1, exc2, exc3)'.
#
# Sets up a tree structure like:
#
# exc1 -> innermost Traceback -> ... -> Traceback +
# \
# exc2 -> innermost Traceback -> ... -> Traceback ---> new outermost Traceback
# /
# exc3 -> innermost Traceback -> ... -> Traceback +
def multi_raise(excs, frame):
join_tb = Traceback(tb_frame=frame, tb_outer=None)
for exc in excs:
exc.__traceback_outermost__.tb_outer = join_tb
exc.__traceback_outermost__ = join_tb
wrapper = ExceptionGroup(excs)
wrapper.__traceback_outermost__ = join_tb
tstate->cur_exc = wrapper
Now as wrapper
propagates outwards, running push_frame
on it will automatically mutate all of the contained exceptions, in O(1) time. And, given a set of exception objects, we can easily reconstruct the full traceback tree by checking their __traceback__
chains for duplicate Traceback
objects.
I think we'd also want to allow frameworks to attach extra metadata to Traceback
objects to note where the exception passed between tasks etc., but that's straightforward.
Now that I've written this out, it seems pretty attractive! Our previous designs were based around the idea that you can't change the interpreter, so we were stuck with the existing traceback representation, and the existing hard-coded push-frame-to-traceback code. That's why we needed the ExceptionGroup
objects to live in a tree, so each ExceptionGroup
node could one linear segment of traceback objects, with the ExceptionGroup
objects acting as the join nodes. (And secondarily, to act as a place to attach extra metadata about the joins.) If we can change the traceback representation, then a lot of these contortions become unnecessary.
(From discord)
Irit Today at 3:55 PM
@njs Can we do this? we leave traceback as it is, but create a new subclass of Traceback called TracebackGroup and that has a mapping from exception to tb_next. So you do tbg.tb_next[exc] or something like that. Then we have the complete traceback structure but nothing needs to change for code that doesn't use groups.
njs Today at 4:04 PM
@Irit So like to walk through a traceback, you'd do exc.traceback, exc.traceback.tb_next, exc.traceback.tb_next.tb_next, oh oops this next one is a TracebackGroup: exc.traceback.tb_next.tb_next.tb_next[exc], etc.?
I think this would mean that you can go from one exception to all the other linked exceptions, right? With something like exc.traceback.tb_next.keys()? which means that holding any one exception pins all the other ones in memory, including ones that were already caught, and all of their stack frames, and all of their local variables. I don't think that will be ok.
Irit Today at 4:06 PM
or make that a weak map so it doesn't hold on the exception you forgot about?
njs Today at 4:07 PM
hmm, maybe
currently I don't think built-in exceptions are weak-referenceable, so that would need to be changed. And it might cause some disruption if you can no longer take a single Traceback object and iterate over all the traceback frames (now you'd also need to know which exception you were looking for, so you know which branch to take when you hit a TracebackGroup). So like, some of the APIs in the traceback module would need to change. I'm not immediately thinking of any reasons why it couldn't possibly be made to work though, just a question of how much these bits of awkwardness compare to the other alternatives.
Irit Today at 4:12 PM
You're right, the traversals etc will need to take an optional exc arg, and if this arg is None and you hit a traceback group it fails with some error