sorcio/comment-to-15776.md Secret

## comment-to-15776.md

      
Comments on the discussion thread "Proposal: close old feature-request issues (Language
summit follow up)".

I understand Irit's proposal to be a pragmatic approach to a (maybe small?) subset of
the general signal-to-noise ratio (SNR) problem of the issue tracker.
One premise is that the issue tracker is supposed to represent potential items of work
for contributors. Only in that sense can the set of open issues be treated as a backlog,
which contributors can use as a snapshot of what is known to need work at
the present time. If an issue does not represent work that can be done on CPython, it's
"invalid" in this sense. I assume that there is general agreement on this premise.
These "invalid" issues contribute to noise as long as they're open, but (and I think
this is one of the first misunderstandings) they are not the only source of noise, as
what is "signal" depends on the listener. I might be looking for bugs that I can tackle
with my fragmented free time and are interesting to me; an expert might be looking for
outstanding issues in the code they maintain; release managers might be interested in
potential blockers; and so on. No big insight here, only that signal and noise are
subjective.
Irit's proposal seems to work on two additional insights. The first is that, while
signal/noise are subjective in general, there are some issues that are obviously noise
for any contributor. Triagers will know this very well. The second insight is that
some pragmatic criteria might be applied to identify a segment of these
obviously-noise issues just by using the metadata available in the issue tracker.
Feature requests that did not receive attention in years might fit the bill on both
accounts. They might be "invalid" in the sense that they never managed to prove they
represent an item of work. And they might be identified just by querying the issue
tracker. Mostly, that is: humans can still apply additional caution to implement the
exceptions described in the original suggestion, but the query does a big chunk of the
work.
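
To make the "just by querying the metadata" idea concrete, here is a minimal sketch in Python. It is not the query used in the proposal: the `Issue` record, the `type-feature` label name, and the three-year idle threshold are all assumptions chosen for illustration, and the sample data is made up. It works on in-memory records rather than calling any real tracker API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Issue:
    """Hypothetical, simplified view of an issue's metadata."""
    number: int
    labels: set[str]
    updated_at: datetime
    open: bool = True


def stale_feature_requests(issues, now,
                           max_idle=timedelta(days=3 * 365),  # assumed threshold
                           label="type-feature"):             # assumed label name
    """Return open issues carrying `label` with no activity for `max_idle`."""
    cutoff = now - max_idle
    return [i for i in issues
            if i.open and label in i.labels and i.updated_at < cutoff]


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample data: only the first issue matches all three criteria.
issues = [
    Issue(101, {"type-feature"}, utc(2015, 6, 1)),              # old feature request
    Issue(102, {"type-bug"}, utc(2015, 6, 1)),                  # old, but a bug
    Issue(103, {"type-feature"}, utc(2023, 11, 1)),             # recently active
    Issue(104, {"type-feature"}, utc(2012, 1, 1), open=False),  # already closed
]

candidates = stale_feature_requests(issues, now=utc(2024, 1, 1))
```

The result is only a list of candidates. The exceptions described in the original suggestion would still be applied by a human reviewing each one before anything is closed.
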
The details are subject to refinement. The core of the matter is the approach, which is
new for CPython, so it calls for a high-level discussion. There is potential for
diverging opinions, because the topic is intertwined with aspects such as the technical
implementation, everybody's individual experience with triaging and dealing with issues,
the social aspect of communicating with external contributors, and so on.
One specific misunderstanding is to interpret the premise to mean that a large number of
open issues is in itself an indication of low SNR, and therefore the metric to optimize
is the number of open issues. While it might be true that it's hard to find information
in the tracker because of the volume, and that a good part of this volume is noise (by
some definition), Irit has explicitly noted that the proposal is not about reducing
numbers. Being more aggressive about closing issues is not on the table. The gist is
that there is already a tacit understanding that some issues don't belong in the
tracker, but action does not follow, because this understanding is never stated
explicitly, and nobody knows for sure how to act or what others think. Guido mentioned
"courage", because it takes courage to make a decision that might be rejected by others.
It's emotionally taxing, and it can be made easier by explicitly agreeing on a set of
criteria. Irit's other proposal, for canned responses, addresses another aspect of the
same friction in taking action.

I don't have an opinion because I'm not part of this decision. If I'm interpreting the
intentions correctly, this discussion has potential beyond the simple matter of closing
some issues. There is space to agree on one approach and establish some principles that
might help all the workflows that involve the issue tracker.
E.g. if it's established that "no visible progress for a prolonged time implies that no
progress will ever happen", an informal guideline might suggest that progress be made
visible, which might improve signal in a sense. In other words, if someone says "I
want to work on this", they are at the very least suggesting that one contributor
believes that an implementation is possible, and the issue does represent valid work.
This is NOT the case for issues that are pending a design decision, because the design
might lead to a different solution. Design work does not necessarily belong to the issue
tracker. Ezio's worry that self-assigning an issue might discourage external
contributions is valid, but it might be mitigated by communicating more explicitly.
Another consequence is that it might help to better define a commonly-accepted idea of
what is signal or noise. Maybe it's hard to formalize, but it's valuable to note down
some questions to ask to determine whether an issue is valid. The triage team is doing
incredible work, but its operating principles are not fully documented.
A second-order consequence might be a better definition of the workflows themselves and
the practical aspects of their implementation. People in this thread voiced concerns
about how to handle one case or another. Part of the concern is how to effectively use
metadata in the issue tracker. In private conversation with Ezio, who is working on areas
of improvement like grouping labels and documenting them, a usability concern
emerged. If labels are the only way to represent searchable metadata, their number will
tend to proliferate, which would make it increasingly hard to ensure that labels have
discriminating value. There are already suggestions to use milestones and projects (the
only other sources of custom metadata, as far as I can see), which could help. These
improvements will benefit from agreement on the principles. Of course the
development of this aspect is out of scope for this thread; the point is that this
discussion has the potential to inform the principles, and better choices going ahead.