@billfitzgerald
Created October 27, 2021 02:46
Doc 5: Rough OCR of Facebook Files released by Gizmodo: https://gizmodo.com/hey-kid-wanna-see-some-leaked-facebook-docs-1847936740
Identifying pathways to harmful groups about nudity
A key component of the Drebbel system is to discover pathways to harmful entities a user might
take when engaging with our recommendation surfaces. As part of this effort, we have built a
workflow to identify entities that act as gateways to recognized harmful entities. In this note, we
apply this workflow to focus on groups considered harmful due to nudity and sexual activity.
* Gateway groups for nudity/sexual activity harm seem to facilitate eventual connections
to non-rec Groups. We should consider interventions that are either targeted towards
users in these gateway groups, or at the entity-level in order to prevent these
downstream connections from happening.
* Specific interventions we propose include: GYSJ seed filtering, invite friction and
entity-level demotion. We are working with the Deamplification team to pursue experiments
both at entity-level and at the edge-level.
* We should stress, however, that not all gateway groups are potentially problematic in
and of themselves; we should use other signals of harm (e.g., number of members
flagged as non-rec, group demotion score, etc.) in conjunction to determine the ones
that we want to consider enforcing on more aggressively.
* In addition, we believe Gateway groups can be used as (sparse) features to improve
recall of existing models. We are working with the Entity & Actor Understanding team to
evaluate models using these groups as features.
REDACTED FOR CONGRESS
Quick refresher on Gateway groups
As part of studying pathways to harmful entities, we wanted to explore the question “Are there
groups that facilitated and increased the probability of a user joining harmful groups?” We call
such groups gateway groups as they often lead people to join harmful groups.
Here, we provide a brief overview of how we detect gateway groups. For thorough details see this
note.
[Figure: Probability of joining harmful groups spikes after joining gateway group]
To answer the question, we first build a classifier that, given a list of groups joined by a user, can
predict with high accuracy whether the user will end up joining a given target harmful group. For a
particular user, after every group they join, we evaluate the probability of them joining a harmful
group in the future. If this probability spikes after a group join, that is a sign that the group just
joined might be a gateway. If this spike happens for multiple users after joining the same group,
we identify it as a gateway group.
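The spike rule described above can be sketched roughly as follows. This is only an illustration, not the workflow's actual implementation: the classifier is assumed to exist already and appears here only through its output probabilities, and the names `find_gateway_groups`, `spike_threshold` and `min_users`, along with their default values, are hypothetical.

```python
from collections import defaultdict

def find_gateway_groups(user_histories, spike_threshold=0.2, min_users=2):
    """Flag groups after which the predicted join probability spikes.

    user_histories maps a user id to an ordered list of
    (group_id, probability_after_join) pairs, where the probability is the
    classifier's estimate that the user later joins a target harmful group.
    The threshold and minimum user count are illustrative assumptions.
    """
    users_with_spike = defaultdict(set)
    for user_id, history in user_histories.items():
        previous = None
        for group_id, probability in history:
            # A spike is a jump relative to the estimate after the prior join;
            # the first join has no baseline and is never counted.
            if previous is not None and probability - previous >= spike_threshold:
                users_with_spike[group_id].add(user_id)
            previous = probability
    # Keep only groups where the spike repeats across several users.
    return sorted(g for g, users in users_with_spike.items() if len(users) >= min_users)

# Hypothetical toy data: "g2" precedes a spike for two different users.
histories = {
    "u1": [("g1", 0.05), ("g2", 0.40), ("g3", 0.42)],
    "u2": [("g4", 0.10), ("g2", 0.35)],
    "u3": [("g1", 0.05), ("g3", 0.30)],
}
gateways = find_gateway_groups(histories)
```

In this toy data only "g2" is returned: "g3" also follows a spike, but for a single user, so it does not meet the assumed `min_users` requirement.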
For this note, we used as the set of target groups those based in the US with at least 60 content-level
strikes for nudity and sexual activity in the month of March (source table
@au_nudity_sexual_activity_strike_harm_source: integrity).
What pathways lead from gateway groups to harmful nudity groups?
source                       num      confirmed_joins
gysj                         1326540  1234089
mobile_group_join            800422   737317
mobile_add_members           653997   408187
[illegible]                  470540   423893
search                       247682   225847
group_mall                   239872   207585
newsfeed_story_header        208814   185000
newsfeed_reshared_story      202309   182748
[illegible]                  182315   166570
[illegible]                  132268   120918
[illegible]                  106177   93785
[illegible]                  88839    58065
[illegible]                  61462    54135
[illegible]                  45458    43628
[...]enger_group_attachment  38879    35208
[...] sources of joins of gateway group members to target harmful groups over all time. We [...]
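One way to read the all-time table above is to compare how many joins per source ended up confirmed. The sketch below assumes that `num` counts join events and `confirmed_joins` counts the subset that were confirmed; the source does not define either column, so the ratio is an interpretation. Only rows with a legible source name are used, and `confirmed_rate` is a hypothetical helper.

```python
# (num, confirmed_joins) for rows of the table whose source name is legible.
rows = {
    "gysj": (1326540, 1234089),
    "mobile_group_join": (800422, 737317),
    "mobile_add_members": (653997, 408187),
    "search": (247682, 225847),
}

def confirmed_rate(num, confirmed_joins):
    """Fraction of recorded joins that were confirmed (assumed meaning)."""
    return confirmed_joins / num

rates = {source: confirmed_rate(*counts) for source, counts in rows.items()}
lowest = min(rates, key=rates.get)

for source, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{source:20s} {rate:.3f}")
```

Under that reading, `mobile_add_members` has a noticeably lower ratio (about 0.62) than the other legible sources (about 0.91 to 0.93).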
source                   num     confirmed_joins
[illegible]              320524  [illegible]
[illegible]              268211  [illegible]
[illegible]              251610  [illegible]
[illegible]              149706  151795
newsfeed_story_header    148850  134951
newsfeed_reshared_story  142128  127599
mobile_add_members       118133  63896
[illegible]              62775   55977
groups_discover_tab      45399   38031
permalink                40290   35186
search                   35605   29506
[illegible]              22375   18304
[illegible]              21895   19170
[illegible]              16014   [illegible]
[illegible]              14232   [illegible]
[illegible]              10827   5444
Is GYSJ a pathway from nudity gateway groups to other non-rec groups?
* Users in gateway groups subsequently join non-rec groups because of exposure to
GYSJ recommendations
Results
* 10.77% of users who joined one of the top 100 gateway groups we identify (ranked by highest
gateway score) eventually joined a non-rec group through exposure to
GYSJ, vs. 8.78% of those who had no exposure to GYSJ.
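To put the two rates above side by side, the gap can be expressed both in percentage points and as a relative lift. This is plain arithmetic on the reported figures; the source gives no sample sizes, so the sketch says nothing about statistical significance or confounding.

```python
exposed_rate = 0.1077    # joined a non-rec group, exposed to GYSJ
unexposed_rate = 0.0878  # joined a non-rec group, no GYSJ exposure

absolute_gap = exposed_rate - unexposed_rate       # in probability units
relative_lift = exposed_rate / unexposed_rate - 1  # fractional increase

print(f"absolute gap: {absolute_gap * 100:.2f} percentage points")
print(f"relative lift: {relative_lift * 100:.1f}%")
```

The gap is about 1.99 percentage points, which corresponds to a relative lift of roughly 22.7% over the unexposed rate.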
Mitigations
* We should consider filtering out the top gateway groups from GYSJ seeds.
Are gateway groups being targeted by “super-inviters”?
* [...] a big source of invitations from gateway groups
* [...]red in PYMI invitations join more non-rec groups
* [...]s join more non-rec groups through PYMK (friending > inv[...]
* 35% of invites (~730K) to these harmful groups went to members after they joined one
of the top 100 gateway groups. Of these 730K invites, 20% came from “super-inviters”.
* We did not see evidence supporting the PYMI hypothesis; roughly equal fractions of
users between control and testing in the long-term PYMI holdout eventually joined
non-rec groups.
* We also did not see enough evidence to suggest that PYMK influences connections to
harmful groups, either through featuring more users as candidates or showing them
more friend recommendations.
* Introduce feature limits on super-inviters, e.g., number of bulk invites that can be sent
by super-inviters. We can make this more targeted by focusing only on invites going
to users in a gateway group, but this is a more intrusive enforcement and would
require thought about how we communicate this intervention to the actor.
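The invite figures above imply a few derived quantities that are not stated directly. The sketch below is arithmetic on the rounded numbers from the note (~730K, 35%, 20%), so the results are approximate; the variable names are illustrative.

```python
invites_after_gateway = 730_000  # ~35% of all invites to the target groups
share_of_all_invites = 0.35
super_inviter_share = 0.20       # share of the 730K sent by super-inviters

# Implied total number of invites to the target harmful groups.
total_invites = invites_after_gateway / share_of_all_invites

# Invites that went to gateway-group members and came from super-inviters.
super_inviter_invites = invites_after_gateway * super_inviter_share

# The same quantity as a share of all invites to the target groups.
share_of_total = super_inviter_invites / total_invites

print(f"implied total invites: ~{total_invites:,.0f}")
print(f"super-inviter invites to gateway members: ~{super_inviter_invites:,.0f}")
print(f"share of all invites: {share_of_total:.0%}")
```

On these rounded inputs, the total comes to roughly 2.1M invites, of which about 146K (around 7%) are super-inviter invites sent to users who had already joined a top-100 gateway group.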
Non-rec groups
[...] themselves good predictors of non-rec groups
Results
* Out of the top 100 gateway groups for the nudity harm target list, 47 are correctly
labeled non-rec; importantly, 42 of these were labeled as non-rec after the workflow
ran. Although the model is not intended for predicting overall non-rec signal (the model
is trained on a specific subset of harm strikes, nudity & sexual activity, and so
would miss out on groups determined non-rec for other harms), this is nonetheless a
strong indicator of how important the model could be as a signal upstream.
Mitigations
* We should use gateway groups as a (sparse) feature powering our entity models for
determining non-amplifiable and non-rec entities.
* In conjunction with other signals, such as content strike roll-ups, number of non-rec
members, and entity strikes, we can pursue entity-level demotions. Our signal has high
correlation with the number of group members considered non-rec and has positive
correlation with other signals such as strikes and the CPI non-amplifiable flag.
[Figure: correlation heatmap (color scale 0.0 to 1.0) over gateway_score, ci_ri_strikes,
num_nr_members, ci_ri_severe_strikes, group demote, non_amp and non_rec. Legible
gateway_score row: 0.079, 0.23, -0.031, 0.052, 0.025, 0.085.]
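A correlation matrix of the kind shown in the heatmap can be computed along these lines. This is a sketch under stated assumptions: the per-group values below are invented for illustration and are not the note's data, the metric is assumed to be Pearson correlation (the source does not say which coefficient was used), and `pearson` is a hypothetical helper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-group signals; the values are made up for illustration.
signals = {
    "gateway_score": [0.1, 0.4, 0.2, 0.9, 0.7],
    "num_nr_members": [3, 10, 4, 30, 18],
    "non_rec": [0, 0, 0, 1, 1],
}

names = list(signals)
# Nested dict: matrix[a][b] is the correlation between signals a and b.
matrix = {a: {b: pearson(signals[a], signals[b]) for b in names} for a in names}

for a in names:
    print(a.ljust(16), " ".join(f"{matrix[a][b]:6.2f}" for b in names))
```

The matrix is symmetric with ones on the diagonal, so in practice only the gateway_score row is needed to see how that signal relates to the others.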
From an ads perspective this might be an interesting feature to identify advertisers,
business, or other commercial entities that might be worth enforcing against.
[...] in case you see additional uses or other folks to tag.
Also I'm going to call it here and now that ABP will become ABC at some point cause
advertisers, business, and commerce just kinda rolls off the tongue better.

thanks for the tag. are you already connected with business integrity (BI)?
Within BI, you probably want to talk to 2 groups:
1. enforcement folks (I assume we also have rules against nudity in ads)
2. actor level enforcement (PM [...]), if there are ad accounts, advertisers etc. that
you've identified are problematic.
Additionally, you might find some pages integrity folks helpful, I'm not sure who is
the right person but start with Jan Kodovsky if you aren't already in contact with them.

[...] another aspect we're studying in Drebbel - gateway entities along the path to
harmful end states

This is super interesting, how transferable is this approach to other areas with
gateway groups? Wondering if we can leverage this approach for violence cc [...]

[...] workflow is domain independent and finds gateway groups for any given set of
target groups. We are already using it to find gateways for the militia network in
Ethiopia. We are looking for other areas to apply this workflow on and would be great
to collaborate!