- RFC from mods
Post follows.
TL;DR: Charcoal is the organisation behind SmokeDetector. Since January 2017, we've been casting up to 3 flags automatically on posts that our systems are confident are spam. We'd like to increase that to 5 automatic flags to reduce the time spam spends alive on the network.
Charcoal is a user-run organisation that is primarily responsible for the spam-detecting bot, SmokeDetector. Over the past four years, with the aid of SmokeDetector, we've looked for spam on the Stack Exchange network to flag and destroy manually. In January 2017, with the blessing of Stack Exchange, we started running an "autoflagging" project, wherein our systems automatically cast up to three flags on a post if they're confident that it's spam. If you missed that happening entirely, we wrote a meta post on Meta Stack Exchange - or there's a slightly more concise explanation on our website.
Autoflagging has gone well so far. We currently have 215 users who have opted into the autoflagging system (you can sign up too, if you're interested). We've flagged around 30 000 posts (29 592, to be exact), of which the vast majority (29 526) were confirmed spam - that's more than 99.7% accurate.
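The accuracy figure follows from the two counts quoted above. A minimal Python sketch of that arithmetic, using only those two numbers (the variable names are illustrative, not taken from any Charcoal code):

```python
# Counts quoted in the post above.
flagged_total = 29_592    # posts autoflagged so far
confirmed_spam = 29_526   # of those, confirmed as spam

# Posts that were flagged but turned out not to be spam.
false_positives = flagged_total - confirmed_spam

# Share of autoflagged posts that really were spam, in percent.
accuracy_percent = 100 * confirmed_spam / flagged_total

print(f"false positives: {false_positives}")
print(f"accuracy: {accuracy_percent:.2f}%")
```

This works out to 66 false positives and an accuracy of about 99.78%, which is consistent with the "99.7%" quoted in the post.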
We'd like to expand our autoflagging system. At present we cast up to 3 flags on posts we're confident are spam; we'd like to increase that number to 4 or 5 flags.
Just so we're up-front about this: this is an experiment. Ultimately, we're trying to do these things:
- Reduce the time that spam spends on the sites before being deleted;
- Lower the number of humans who involuntarily have to see or interact with spam.
Increasing the number of flags we cast automatically on spam should accomplish both of these things:
- Automatic flags are near-instant; manual flags take multiple minutes to be cast - that means that increasing the ratio of automatic to manual flags results in a shorter time before 6 flags accumulate and the spam is deleted.
- Automatic flags are not cast by a human. Fewer humans, therefore, are forced to see/interact with the spam.
The data we have backs this up. In terms of time to deletion, we saw a significant drop in the time it took to delete spam when we started our autoflagging project. Take a look at this graph from the meta post on the subject for an excellent visual representation of that. Before we started autoflagging, spam spent an average of 400 hours per day alive across the network; with autoflagging in place, the average is 10x less, at around 40 hours per day.
If this change goes ahead, these things are likely to happen:
- It will only take 1 or 2 manual flags from users to spam-nuke an autoflagged post, instead of the current 3. Posts that are not autoflagged will, of course, still require 6 flags to nuke.
- There may be an increase in posts spam-nuked entirely by Charcoal members, who may or may not be active on the site.
- You will see a reduction in the time spam spends on the site before being deleted.
- Fewer humans will have to involuntarily see each spam post.
The last two of those are indisputably good things. The first two, however, are more controversial, and are the reason we want to have a discussion here on meta before we make this happen. What follows are the major concerns we've seen, and what we can do about them or why we don't think they're an issue - we'd like to hear your thoughts.
The major thing we're looking for out of this is a reduction in time to deletion. The following graph shows how long spam currently spends alive on the top few sites; we're hoping to see a moderate reduction in the average times, and a significant reduction in the top outliers.
The following graph is from an experiment we've been running over the past week, casting between 1 and 5 flags randomly on each post matching the settings we're considering.
In raw numbers, that's this:
| PostCount | FlagCount | ATTD | StdDev | CommonMax |
|---|---|---|---|---|
| 55 | 1 | 190.5091 | 197.59 | 585.69 |
| 55 | 2 | 85.0182 | 109.86 | 304.74 |
| 62 | 3 | 48.9355 | 83.57 | 216.07 |
| 68 | 4 | 26.9559 | 51.98 | 130.92 |
| 56 | 5 | 10.2143 | 5.60 | 21.41 |
PostCount is the sample size; FlagCount is the number of flags cast on each post in the sample; ATTD is the average time to deletion; and CommonMax is the maximum of a 97% confidence interval. The major takeaway from these stats is that we're likely to see a ~5x drop in the average time to deletion, and a ~10x drop in the outliers.
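The ~5x and ~10x figures can be checked against the table. The sketch below assumes the comparison is between the current 3 autoflags and the proposed 5; that pairing is our reading of the post, not something it states. The post also doesn't give the unit of ATTD and CommonMax, so the code only works with ratios.

```python
# Figures copied from the table above, keyed by FlagCount.
# The unit of ATTD/CommonMax is not stated in the post, so none is assumed.
attd = {1: 190.5091, 2: 85.0182, 3: 48.9355, 4: 26.9559, 5: 10.2143}
common_max = {1: 585.69, 2: 304.74, 3: 216.07, 4: 130.92, 5: 21.41}

CURRENT_FLAGS = 3   # autoflags cast today (assumed baseline for the comparison)
PROPOSED_FLAGS = 5  # autoflags under the proposal

# How many times smaller each statistic is at 5 flags than at 3 flags.
attd_drop = attd[CURRENT_FLAGS] / attd[PROPOSED_FLAGS]
outlier_drop = common_max[CURRENT_FLAGS] / common_max[PROPOSED_FLAGS]

print(f"average time to deletion drops by about {attd_drop:.1f}x")
print(f"CommonMax drops by about {outlier_drop:.1f}x")
```

Under that assumption the ratios come out at roughly 4.8x for the average and 10.1x for CommonMax, in line with the rounded figures in the post.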
Spam flags are a powerful feature that needs to be applied with care. This is a concern that came up when we originally built the autoflagging system, so we already have safeguards built in.
- We only flag a post if we're more than 99.5% sure it's spam. (Technically, the precise certainty varies by conditions set by the users whose accounts we use, but it's always above 99.5%; there's more detail on that on our website.)
- If the system breaks down or bugs out and starts flagging things it shouldn't, all Charcoal members and all network moderators have access to a command that immediately halts all flagging activity and requires intervention from a system administrator to re-enable. Outside of testing, that kill-switch has never had to be used.
- We never unilaterally nuke a post. Currently, 3 manual flags are required in addition to the automatic flags to nuke a post; under this proposal, at least one manual flag is still required.
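The relationship between automatic and manual flags in the last safeguard is simple subtraction from the 6-flag deletion threshold mentioned in the post. A minimal Python sketch, where `manual_flags_needed` is a hypothetical helper written for illustration and not part of any Charcoal system:

```python
FLAGS_TO_DELETE = 6  # spam flags needed to delete a post, per the post above


def manual_flags_needed(autoflags: int) -> int:
    """Return how many human flags are still required after `autoflags` automatic ones."""
    if not 0 <= autoflags <= FLAGS_TO_DELETE:
        raise ValueError("autoflags must be between 0 and 6")
    return FLAGS_TO_DELETE - autoflags


# 0 = post not autoflagged, 3 = current setting, 4 and 5 = proposed settings.
for autoflags in (0, 3, 4, 5):
    print(autoflags, "autoflags ->", manual_flags_needed(autoflags), "manual flags")
```

With at most 5 automatic flags, the result never reaches zero, which is the "at least one manual flag" guarantee described above.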
We also make sure that everything has human oversight at all times. While only 3 humans currently have to manually flag the post, there are always more users than that reviewing the system's decisions and classifications; if a post is flagged that shouldn't have been, we are alerted and can alert the relevant moderators to resolve the issue. Again, this is very rare: over the past year, we've flagged 66 posts that shouldn't have been, out of 29 592 flagged posts in total (that's more than 99.7% accurate overall). We allow users to set their own flagging conditions, provided they don't go below our baseline 99.50% certainty. We recommend, however, a higher value that has a certainty of 100.00%; those who set their conditions below that are likely to see more false positives flagged using their account.
This proposal decreases the required manual involvement to nuke a post; to compensate for that lower human-involvement barrier, we will correspondingly increase the required accuracy before casting the extra automatic flags. For example, we currently require 99.5% accuracy before casting autoflags; we could require 99.9% accuracy for 4 autoflags, and 99.99% accuracy for 5 autoflags. (For reference, humans are accurate 95.4% of the time, or 87.3% on Stack Overflow - those are stats that jmac (a former Community Manager) looked up for us last year when we started autoflagging).
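One way to picture the tiered requirement is as a lookup from certainty to number of automatic flags. The sketch below uses the example thresholds from the paragraph above (99.5%, 99.9%, 99.99%), which the post offers only as possibilities; the names `TIERS` and `autoflags_for` are hypothetical and do not describe how the real system is implemented.

```python
# Hypothetical tiers built from the example figures in the paragraph above:
# (minimum certainty in percent, number of autoflags to cast), strictest first.
TIERS = [(99.99, 5), (99.9, 4), (99.5, 3)]


def autoflags_for(certainty_percent: float) -> int:
    """Return how many automatic flags to cast for a given certainty."""
    for minimum, flags in TIERS:
        if certainty_percent >= minimum:
            return flags
    # Below the 99.5% baseline, nothing is flagged automatically.
    return 0


for certainty in (99.0, 99.5, 99.95, 99.99):
    print(f"{certainty}% -> {autoflags_for(certainty)} autoflags")
```

Checking the strictest tier first means a post only gets the extra flags when it clears the higher bar, so fewer required manual flags always goes together with higher required certainty.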
In the rare event of a legitimate post getting autoflagged, we also have systems in place to ensure it isn't accidentally deleted and forgotten about. Multiple people review each post we catch, whether it's autoflagged or not, and classify it as spam or not; if an autoflagged post is classified as not-spam, the system posts an alert to chat to let us know. That lets us ping the necessary people to retract their flags, and keep an eye on the post to make sure it doesn't get deleted.
To make it starkly clear how accurate this could be, here's a visualisation:
That's a chronological representation (left-right, top-bottom) of every post that would have been flagged under the settings we're considering for 5 flags, and whether they were spam (green squares) or legitimate (red squares).
As I said earlier, this proposal reduces the required manual involvement to nuke a post. Since Charcoal members also cast manual flags on top of the automatic flags cast by the system, that's also likely to increase the number of posts that are nuked entirely by Charcoal members, without involvement from users who are active on this site. Some posts already have 6 flags cast on them by Charcoal (including autoflagging and manual flags), but the proportion of posts that applies to is likely to increase.
We don't think this is an issue in terms of subject matter expertise: the spam we see on the Stack Exchange network is broadly the same wherever you go - you don't need any subject matter expertise or activity on a particular site to be able to tell what's spam and what's not. We do, however, recognise that it's possible that a site's community may want to handle its own spam; if that's the case, we're happy to turn the autoflagging system off on this site or to retain it at its current levels.
We want to increase the number of automated flags from 3 to 5 to reduce the time spam spends alive on the network. We'd like to hear your thoughts. We appreciate that quite a lot of the stuff we do at Charcoal is fairly invisible to the sites, so we want to be as open as possible. If you'd like data or specific reports, let us know and we'll try to add them in - we already have a lot of reporting around autoflagging, so it may already exist. If there's other stuff we can do to explain or to help you make an informed decision about whether you want this, drop an answer or a comment on this post. Charcoal members will be hanging around this post to respond to your concerns, or you can also visit us in chat.
1 There are some flags from Charcoal members that don't appear in these statistics, for various reasons, but the majority are there.
"spends alive on the site" would be better as "spends alive on each site" or "spends alive on Stack Exchange sites"
"looked for spam on the network to flag" should be "looked for spam on the Stack Exchange network to flag"
"In late 2016-early 2017": state one. You don't "start" over a range. You might ramp up over a range, but starting almost always has a definite time, even if you started, stopped, started, stopped, etc.
"(you can sign up too, if you're still reading)" should be "(you can sign up too, if you're interested)". "Still reading" isn't an actual requirement. As such, it comes off as a stale attempt at humor.
"Between them, we've flagged" should be "With them, we've flagged". Using "between" implies either divisiveness, or that you're going to explain how they are separated. "With" is inclusive.
"around 30,000 (29592) posts, of which the vast majority (29526) were" should consistently use, or not, a thousands separator. Try: "almost 30,000 posts (29,592 to be exact), of which the vast majority, 29,526, were"
"We'd like to expand our autoflagging system. " use "increase", not "expand". You're not wanting to expand the system, you're wanting to increase the number of flags cast automatically. Using "expand" implies changing how posts are classified to be auto-flagged.
"are spam; we'd like " use a period. "are spam. We'd like" While these are closely coupled for you, they aren't for your audience.
"spends on the sites before getting deleted" would be better as "spends on the sites before being deleted"
"Lower the number of humans who have spam involuntarily shoved in their faces." is a significant change in tone, and is much less professional than the other portions of the text. You might want to go with something like: "Lower the number of humans who are forced to see, or interact with, spam."
"should accomplish both of these things:" should have a period, not a colon. You're not introducing the next unordered list, you're stating a conclusion. You could add another sentence which introduces the next unordered list. It could either be longer, like: "Increasing the number of flags we cast automatically on spam should accomplish both of these things. Additional auto-flags will accomplish these goals because:", or you could make it all one sentence, like: "Increasing the number of flags we automatically cast on spam should accomplish both of these things, because:"
"Automatic flags are near-instant; manual flags take multiple minutes to be cast. Increasing the ratio of automatic to manual flags logically results in a shorter time before 6 flags accumulate and the spam is deleted." You went into a bulleted list, but then start explaining. Each bullet should state the premise, then justify that assertion. Also, avoid stating an argument as "logically". Stating it that way implies that you feel people won't look at it logically and that they are lesser than you for not doing so. You could do something like:
(I'm not really happy with the wording, but the assertion should be in the first sentence.)
"Automatic flags are not cast by a human. Not as many humans, therefore, have the spam shoved in their faces." Again, both: A) the assertion should be in the first sentence, then justification (if needed); and B) "shoved in their faces" again, is both a jarring change in tone and implies that you're not being that professional about this. You've stated that the goal is to convince people. Being unprofessional is counter to that goal. If such a tone/verbiage is used, which it can be for emphasis, or to drive a point home, it shouldn't be a recurring motif. How about something like: "Fewer humans are forced to see and interact with the spam due to requiring fewer manual flags prior to deletion."
After getting to this point, I've realized that the current ":" after "both of these things:" may have led me to misunderstand what you're trying to say. Looking back on it, you might be attempting to list the advantages of automatic flags. If so, the bullet points (other than the "shoved in their faces" part) are OK. They just need a sentence which introduces them as the advantages of auto-flags. Something like:
Auto-flags have the following advantages:
"Take a look at this graph from the meta post on the subject" While the graph is nice, it doesn't really do a good job of highlighting the change from 0 autoflags to 1 then 3. The problem with it is that the lead-in is too long and the part of interest is the little bit over on the left. Perhaps put a break in the graph to indicate that you have data for a long period of no autoflagging, but that it's all basically the same.
"(If you prefer numbers, I can instead tell you that before we started autoflagging, an average of 400 person-hours per day were spent on getting spam deleted across the network; with autoflagging in place, the average is 10x less, at around 40 person-hours per day.)" This is really too long for a parenthetical. It's not actually parenthetical to the subject of the paragraph, which is "The data we have backs this up." Just use: "Before we started autoflagging, an average of 400 person-hours per day were spent getting spam deleted across the network. With autoflagging in place, the average is 10x less, at around 40 person-hours per day." (some grammar changed too).
However, is what you're saying there really what you intend? What you're saying is that you have measured the number of person-hours spent working on removing spam (not the time it's visible/existent). I wasn't aware you had this data, and I'm not sure how you would have reliably obtained it. What I think you're trying to say is something like: "Before we started autoflagging, the accumulated time which spam existed was an average of 400 hours per day across the network. With autoflagging in place, the average is 10x less, at around 40 hours per day."
(The wording here still needs a bit of work.)
"What does that mean for sites?" This doesn't match what you're talking about in the section, and/or what "that" refers to is unclear. Perhaps: "What does increasing the per post number of autoflags mean for sites?"
"If this change goes ahead, these things are likely to happen:" is unclear and misses an opportunity to reinforce the proposal. Use: "If the number of autoflags per post is increased, the likely results are:" Alternately, "If this proposal moves forward, the likely results are:" Which to use depends on what you've actually titled the Question to be, and/or if you want to take the opportunity to reinforce the major direction of your proposal.
"Posts that are not autoflagged still require 6 flags to nuke." I'd add "of course", like: "Posts that are not autoflagged will, of course, still require 6 flags to nuke." Without something along these lines, people who've gotten confused into thinking you might be affecting the underlying system may take the bare statement as implicit confirmation that they were right to think that way. Having the "of course" allows people to realize that they were conceptually wrong while keeping the nudge on your part more of a gentle reminder, i.e. implying that they really didn't need it.
"There is likely to be an increase in posts spam-nuked entirely by Charcoal members, who may or may not be active on this site." You've already stated that you're listing the "likely" things (at a minimum, use a synonym). In addition, it singles out Charcoal members without needing to and without taking the opportunity to be inclusive. You could go with: "There will probably be an increase in the number of posts that are spam-nuked entirely by people monitoring SmokeDetector reports, who may or may not be active on the site where the spam was posted. Currently, SmokeDetector reports into the following chat rooms: Charcoal HQ, ... . If a site requests it, SmokeDetector can report just the spam for that site into a room of the site moderators' choosing, as is being done for SOCVR and Stack Overflow."
More later.