billfitzgerald/pathways_to_nudity.txt

## pathways_to_nudity.txt
Identifying pathways to harmful groups about nudity

A key component of the Drebbel system is to discover pathways to harmful entities a user might
take when engaging with our recommendation surfaces. As part of this effort, we have built a
workflow to identify entities that act as gateways to recognized harmful entities. In this note, we
apply this workflow to focus on groups considered harmful due to nudity and sexual activity.

* Gateway groups for nudity/sexual activity harm seem to facilitate eventual connections
to non-rec Groups. We should consider interventions that are either targeted towards
users in these gateway groups, or at the entity-level in order to prevent these
downstream connections from happening.

* Specific interventions we propose include: GYSJ seed filtering, invite friction and
entity-level demotion. We are working with the Deamplification team to pursue experiments
both at entity-level and at the edge-level.

* We should stress, however, that not all gateway groups are potentially problematic in
and of themselves; we should use other signals of harm (e.g., number of members
flagged as non-rec, group demotion score etc.) in conjunction to determine the ones
that we want to consider enforcing on more aggressively.

* In addition, we believe Gateway groups can be used as (sparse) features to improve
recall of existing models. We are working with the Entity & Actor Understanding team to
evaluate models using these groups as features.

REDACTED FOR CONGRESS



Quick refresher on Gateway groups

As part of studying pathways to harmful entities, we wanted to explore the question “Are there
groups that facilitated and increased the probability of a user joining harmful groups?” We call
such groups gateway groups as they often lead people to join harmful groups.

Here, we provide a brief overview of how we detect gateway groups. For thorough details see this
note.

[Figure: probability of joining harmful groups spikes after joining gateway group.]
To answer the question, we first build a classifier that, given a list of groups joined by a user, can
predict with high accuracy whether the user will end up joining a given target harmful group. For a
particular user, after every group they join, we evaluate the probability of them joining a harmful
group in the future. If this probability spikes after a group join, that is a sign that the group just
joined might be a gateway. If this spike happens for multiple users after joining the same group,
we identify it as a gateway group.
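
The detection loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual workflow: the function name `find_gateway_groups`, the `spike_threshold` and `min_users` parameters, and the toy classifier are all assumptions introduced here, and the real classifier and spike criterion are described only in the linked note.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def find_gateway_groups(
    join_histories: Dict[str, List[str]],
    predict_harm_prob: Callable[[Sequence[str]], float],
    spike_threshold: float = 0.2,  # assumed: minimum jump that counts as a "spike"
    min_users: int = 2,            # assumed: spikes needed from distinct users
) -> Dict[str, int]:
    """Return {group_id: number of users whose predicted probability spiked after joining it}."""
    spike_counts: Dict[str, int] = defaultdict(int)
    for groups in join_histories.values():
        prev_prob = predict_harm_prob([])
        for i, group_id in enumerate(groups):
            # Re-score the user after every join, using the history so far.
            prob = predict_harm_prob(groups[: i + 1])
            if prob - prev_prob >= spike_threshold:
                spike_counts[group_id] += 1
            prev_prob = prob
    # Keep only groups where the spike repeats across several users.
    return {g: n for g, n in spike_counts.items() if n >= min_users}


def toy_classifier(groups: Sequence[str]) -> float:
    # Hypothetical stand-in for the trained classifier.
    return 0.6 if "g_gateway" in groups else 0.05


histories = {
    "u1": ["g_cooking", "g_gateway", "g_memes"],
    "u2": ["g_gateway", "g_cars"],
    "u3": ["g_cooking", "g_cars"],
}
gateways = find_gateway_groups(histories, toy_classifier)
print(gateways)
```

Requiring the spike to repeat across users is what separates a gateway group from a one-off coincidence in a single user's history.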

For this note, the set of target groups is those based in the US with at least 60 content-level
strikes for nudity and sexual activity in the month of March (source table
@au_ nudity _sexual_activity_strike_harm_source: integrity).

What pathways lead from gateway groups to harmful nudity groups?

source                     num        confirmed_joins
gysj                       1326540    1234089
mobile_group_join          800422     737317
mobile_add_members         653997     408187
(not legible)              470540     423893
search                     247682     225847
group_mall                 239872     207585
newsfeed_story_header      208814     185000
newsfeed_reshared_story    202309     182748
(not legible)              182315     166570
(not legible)              132268     120918
(not legible)              106177     93785
(not legible)              88839      58065
(not legible)              61462      54135
(not legible)              45458      43628
...enger_group_attachment  38879      35208
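
One way to read the table above is to compare confirmed joins to total joins per source. The sketch below does this for the rows whose source names are legible. Treating `confirmed_joins / num` as a "confirmation rate" is our assumption about what the two columns mean, not something the note states.

```python
# (source, num, confirmed_joins) for the rows with legible source names.
JOIN_SOURCES = [
    ("gysj", 1326540, 1234089),
    ("mobile_group_join", 800422, 737317),
    ("mobile_add_members", 653997, 408187),
    ("search", 247682, 225847),
    ("group_mall", 239872, 207585),
    ("newsfeed_story_header", 208814, 185000),
    ("newsfeed_reshared_story", 202309, 182748),
]

# Assumed interpretation: share of joins from each source that were confirmed.
rates = {source: confirmed / num for source, num, confirmed in JOIN_SOURCES}

for source, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{source:26s} {rate:.1%}")
```

Under that reading, mobile_add_members stands out with a much lower ratio than the other sources, which is consistent with members being added by someone else rather than joining on their own.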


...re sources of joins of gateway group members to target harmful groups over all time. We


source                     num        confirmed_joins
(not legible)              320524     (not legible)
(not legible)              268211     (not legible)
(not legible)              251610     (not legible)
(not legible)              149706     151795
newsfeed_story_header      148850     134951
newsfeed_reshared_story    142128     127599
mobile_add_members         118133     63896
(not legible)              62775      55977
groups_discover_tab        45399      38031
permalink                  40290      35186
search                     35605      29506
(not legible)              22375      18304
(not legible)              21895      19170
(not legible)              16014      14232
(not legible)              10827      5444


Is GYSJ a pathway from nudity gateway groups to other non-rec groups?

* Users in gateway groups subsequently join non-rec groups because of exposure to
GYSJ recommendations

Results
* 10.77% of users who joined one of the top 100 gateway groups we identify (ranked by highest
gateway score) eventually joined a non-rec group through exposure to GYSJ, vs. 8.78% of
those who had no exposure to GYSJ.
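
To put the two percentages above side by side, the short calculation below derives the absolute and relative difference. It is arithmetic on the reported figures only. It says nothing about statistical significance or causality, since the note gives no sample sizes here.

```python
exposed_rate = 10.77    # % of top-100 gateway group joiners who joined a non-rec group, with GYSJ exposure
unexposed_rate = 8.78   # % of those with no GYSJ exposure

abs_diff_pp = exposed_rate - unexposed_rate   # difference in percentage points
relative_lift = abs_diff_pp / unexposed_rate  # difference relative to the unexposed rate

print(f"absolute difference: {abs_diff_pp:.2f} pp")
print(f"relative lift:       {relative_lift:.1%}")
```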

Mitigations

* We should consider filtering out the top gateway groups from GYSJ seeds.
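
A seed filter of the kind proposed above could look roughly like this. The function name, the `top_k` cutoff and the data shapes are hypothetical. How GYSJ actually selects seeds is not described in this note.

```python
from typing import Dict, List


def filter_gysj_seeds(
    seed_group_ids: List[str],
    gateway_scores: Dict[str, float],
    top_k: int = 100,  # assumed cutoff, mirroring the "top 100" used in the analysis
) -> List[str]:
    """Drop seeds that are among the top_k groups by gateway score."""
    ranked = sorted(gateway_scores, key=gateway_scores.get, reverse=True)
    blocked = set(ranked[:top_k])
    # Preserve the original seed order for everything that is not blocked.
    return [g for g in seed_group_ids if g not in blocked]


scores = {"g1": 0.9, "g2": 0.1, "g3": 0.7}
kept = filter_gysj_seeds(["g2", "g1", "g4", "g3"], scores, top_k=2)
print(kept)
```

Groups without a gateway score (here `g4`) pass through unchanged, so the filter only removes groups the workflow has explicitly ranked.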

... gateway groups being targeted by “super-inviters”?

* ...e a big source of invitations from gateway groups

* ...red in PYMI invitations join more non-rec groups

* ...s join more non-rec groups through PYMK (friending > inv...


* 35% of invites (~730K) to these harmful groups went to members after they joined one
of the top 100 gateway groups. Of these 730K invites, 20% came from “super-inviters”.

* We did not see evidence supporting the PYMI hypothesis; roughly equal fractions of
users between control and testing in the long-term PYMI holdout eventually joined
non-rec groups.

* We also did not see enough evidence to suggest that PYMK influences connections to
harmful groups, either through featuring more users as candidates or showing them
more friend recommendations.

* Introduce feature limits on super-inviters, e.g., the number of bulk invites that can be sent
by super-inviters. We can make this more targeted by focusing only on invites going
to users in a gateway group, but this is a more intrusive enforcement and would
require thought about how we communicate this intervention to the actor.
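
As an illustration of the feature limit proposed above, the sketch below caps the number of invites a flagged super-inviter can send per day. The class name, the per-day window and the limit value are assumptions made for the example. The note does not specify how the limit would be defined or enforced.

```python
from collections import defaultdict
from typing import Dict, Set


class BulkInviteLimiter:
    """Hypothetical per-day cap on invites sent by accounts flagged as super-inviters."""

    def __init__(self, super_inviters: Set[str], daily_limit: int) -> None:
        self.super_inviters = set(super_inviters)
        self.daily_limit = daily_limit
        self._sent_today: Dict[str, int] = defaultdict(int)

    def try_send(self, inviter_id: str, n_invites: int) -> bool:
        """Return True if the batch is allowed; only super-inviters are limited."""
        if inviter_id not in self.super_inviters:
            return True
        if self._sent_today[inviter_id] + n_invites > self.daily_limit:
            return False
        self._sent_today[inviter_id] += n_invites
        return True

    def reset_day(self) -> None:
        """Clear counters at the start of a new day."""
        self._sent_today.clear()


limiter = BulkInviteLimiter(super_inviters={"s1"}, daily_limit=100)
print(limiter.try_send("s1", 60), limiter.try_send("s1", 50), limiter.try_send("s1", 40))
```

The more targeted variant mentioned above would add a check on whether the invitee is in a gateway group before counting the invite against the limit.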


Non-rec groups

themselves good predictors of non-rec groups

Results
* Out of the top 100 gateway groups for the nudity harm target list, 47 are correctly
labeled non-rec; importantly, 42 of these were labeled as non-rec after the workflow
ran. Although the model is not intended for predicting overall non-rec signal (the model
is trained on a specific subset of harm strikes, nudity & sexual activity, and so
would miss out on groups determined non-rec for other harms), this is nonetheless a
strong indicator of how important the model could be as a signal upstream.
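
The counts above translate into two simple ratios, computed below. The variable names are ours. Note that the first ratio is only a lower bound on usefulness, since, as stated above, groups that are non-rec for other harms are outside what the model is trained to find.

```python
top_k = 100                  # gateway groups considered
non_rec_in_top_k = 47        # of those, labeled non-rec
labeled_after_workflow = 42  # of the 47, labeled non-rec only after the workflow ran

share_non_rec = non_rec_in_top_k / top_k                      # fraction of top-100 that are non-rec
share_flagged_early = labeled_after_workflow / non_rec_in_top_k  # fraction the workflow surfaced first

print(f"non-rec among top {top_k}: {share_non_rec:.0%}")
print(f"surfaced before the non-rec label: {share_flagged_early:.1%}")
```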

Mitigations
* We should use gateway groups as a (sparse) feature powering our entity models for
determining non-amplifiable and non-rec entities.

* In conjunction with other signals, such as content strike roll-ups, number of non-rec
members, and entity strikes, we can pursue entity-level demotions. Our signal has high
correlation with the number of group members considered non-rec and has positive
correlation with other signals such as strikes and the CPI non-amplifiable flag.
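
The idea of acting on the gateway signal only when other harm signals corroborate it can be sketched as a simple rule. Everything below is hypothetical: the field names, the thresholds and the OR-combination are illustrative choices, not the criteria used by the entity models.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GroupSignals:
    group_id: str
    gateway_score: float
    num_nr_members: int    # members flagged as non-rec
    content_strikes: int   # content strike roll-up
    entity_strikes: int


def demotion_candidates(
    groups: Iterable[GroupSignals],
    min_gateway_score: float = 0.5,  # assumed threshold
    min_nr_members: int = 10,        # assumed threshold
    min_strikes: int = 5,            # assumed threshold
) -> List[str]:
    """Return groups with a high gateway score AND at least one corroborating harm signal."""
    selected = []
    for g in groups:
        if g.gateway_score < min_gateway_score:
            continue
        corroborated = (
            g.num_nr_members >= min_nr_members
            or g.content_strikes + g.entity_strikes >= min_strikes
        )
        if corroborated:
            selected.append(g.group_id)
    return selected


sample = [
    GroupSignals("g1", 0.9, 25, 0, 0),  # gateway + many non-rec members
    GroupSignals("g2", 0.8, 1, 0, 0),   # gateway only, no corroboration
    GroupSignals("g3", 0.2, 50, 9, 3),  # other signals only, low gateway score
    GroupSignals("g4", 0.6, 0, 4, 2),   # gateway + strikes
]
print(demotion_candidates(sample))
```

A rule like this keeps gateway groups that show no other sign of harm (such as `g2`) out of enforcement, which matches the caution that not all gateway groups are problematic in themselves.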


[Figure: correlation heatmap (color scale 0.0 to 1.0) across gateway_score, ci_ri_strikes,
num_nr_members, ci_ri_severe_strikes, group demote, non_amp and non_rec. The legible
gateway_score row reads 0.079, 0.23, -0.031, 0.052, 0.025, 0.085; the remaining cells are only
partially legible.]
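
For reference, each cell in a correlation heatmap of this kind is a pairwise correlation coefficient between two per-group signals. The sketch below computes a Pearson correlation by hand. We assume Pearson was the measure used, which the note does not state, and the sample values are made up for illustration rather than taken from the figure.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-group values, one entry per group.
gateway_score = [0.10, 0.40, 0.35, 0.80, 0.90]
num_nr_members = [2, 10, 7, 30, 25]
r = pearson(gateway_score, num_nr_members)
print(f"r = {r:.3f}")
```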

From an ads perspective this might be an interesting feature to identify advertisers, business,
or other commercial entities that might be worth enforcing against.

in case you see additional uses or other folks to tag.

Also I'm going to call it here and now that ABP will become ABC at some point cause advertisers,
business, and commerce just kinda rolls off the tongue better.

thanks for the tag. are you already connected with business integrity (BI)? Within BI, you
probably want to talk to 2 groups:

1. enforcement folks (I assume we also have rules against nudity in ads)

2. actor level enforcement, if there are ad accounts, advertisers etc. that you've identified
are problematic.

Additionally, you might find some pages integrity folks helpful, I'm not sure who is the right
person but start with Jan Kodovsky if you aren't already in contact with them.

another aspect we're studying in Drebbel - gateway entities along the path to harmful end states

This is super interesting, how transferable is this approach to other areas with gateway groups?
Wondering if we can leverage this approach for violence cc

The workflow is domain independent and finds gateway groups for any given set of target groups.
We are already using it to find gateways for the militia network in Ethiopia. We are looking for
other areas to apply this workflow on and would be great to collaborate!