Moderation Tools and User Safety: Data-Driven Approaches at Twitch

GAME UX
SUMMIT ’17
#GAMEUXSUMMIT ‘17 / TORONTO
Toxicity and Moderation:
Data-Based Approaches at Twitch
Ruth Toner
Data Scientist, Twitch Interactive

Introduction
TWITCH:
o Live streaming and on-demand video
o Fourth largest source of internet traffic in the US, mostly (but not only!)
gaming content
 In a single month:
o 2.2 Million broadcasters and content creators, including gamers, esports, devs,
and non-gaming content
o 15 Million Daily Active Viewers
o 1 billion+ chat and private messages sent
1/28

Twitch Chat– Why Do We Care?
2/28
Chat:
- Main way users
interact with
broadcaster
- Subscriptions and
“cheering” (tips)
- Key part of funnel to
engaged, paying
viewers
- We want to make
being social on
Twitch a good
experience

Introduction
1 BILLION MESSAGES = HARASSMENT AND
ABUSE HAPPEN
 This talk = how Twitch uses data to understand:
o How abuse happens on Twitch
o How we build better tools to fight it
o How we can combine data science and human insight
3/28

Human-centric Data Science
 Intelligence Augmentation: “The ultimate goal is not building machines that think
like humans, but designing machines that help humans think better.”
 (1) Guszcza, Lewis, Evans-Greenwood, “Cognitive collaboration: Why humans and computers think better
together”, Deloitte University Press, Jan 2017
4/28
[Figure: spectrum from Pure Qualitative (smaller-scale insights) to Pure Quantitative (pure data, but also “Artificial Stupidity” (1)), with The Sweet Spot (good data science + UX) in between]

Moderation + Data Science
1. Extent
How do we describe +
quantify abuse on
Twitch?
2. Impact
How do we answer
questions about the
impact of abuse and
our tools?
3. Tools
How do we use data to
build effective tools to
fight abuse?
5/28
The Goal:
• Help our content creators build the communities they want
(within limits…)
• No one leaves Twitch because they feel unsafe or harassed

1. Extent
FIRST, WE NEED TO
UNDERSTAND OUR
DATA…
6/28
Understanding our data

8/28
Any User: Twitch Site-wide Moderation
o Reports are sent
from a user to
Twitch’s site-wide
Human Admin
moderation staff
o These admins can
issue a Strike: a
temporary
suspension or
permanent ban
from Twitch

Data Source: Reports and Strikes
 Safety Violation Signal: TWITCH TERMS OF SERVICE VIOLATIONS
 TOS: Among many other things, basic rules of conduct for broadcasting and chatting
(no harassment, threats, impersonation, etc.)
 A viewer or broadcaster is reported for violating the basic rules of conduct governing
behavior on Twitch, and can receive a strike limiting use of their account.
 Human Judgement:
 Reports: People mislabel spam as harassment. Behavior was bad but didn’t break
ToS. People report each other as a joke.
 Strikes: 100% accurate source of data, but not a complete picture of unsafe
behavior.
9/28

10/28
Channel Moderators: Timeouts and Bans
Every channel can appoint
moderators who can:
o Time Out chatters
(temporary)
o Ban chatters (permanent)

Data Source: Timeouts and Bans
 Safety Violation: COMMUNITY RULE BREAKING
 A channel moderator can ban or time out someone from participating
in chat when they break the rules of a community
We give broadcasters autonomy to decide what conversation is
acceptable in their community (within Terms of Service limits…).
 Human Judgement: Not all rule violations are safety violations.
Moderators also moderate for spam, for links or all-caps, for spoilers, or
(again!) as a joke (“Mods plz ban me!”).
11/28

12/28
Broadcaster: AutoMod
[Figure: example chat with messages labelled Moderator, Troll, and Troll]

Data Source: AutoMod
 Safety Violation: UNACCEPTABLE LANGUAGE
 Broadcaster decides how ‘risky’ they want language to be on their
channel, from just removing hate speech to forbidding cursing.
 Two Signals:
AutoMod ratings: how risky AutoMod thinks a chat message is.
Mod approvals + denials: what the channel moderators thought.
 Human Judgement: Missing social context for the messages.
13/28

Data from Moderation Tools
 Each Data Source: How safe or happy our viewers or broadcasters feel on Twitch
 BUT ALSO: False Positives, Noise, Unclear Signals
 “A flag is not merely a technical feature: It is a complex interplay between users and
platforms, humans and algorithms, and the social norms and regulatory structures of
social media.”
 Crawford and Gillespie, “What Is A Flag For? Social Media Reporting Tools and the Vocabulary of Complaint” New
Media & Society July 2014
 We understand these signals and noise by exploring data and talking to our users
14/28

Example: Two Types of Abuser
Question: What does a troll look like?
 Chatters suspended for harassment share a few things in
common:
 Multiple channel bans
 Younger than average accounts
 Higher than expected language risk
 However, if we talk to our admins and then take a closer look at
our data, it turns out this question is too simple…
15/28
[Figure: Account Age, Regular vs Suspended User]

Example: Two Types of Abuser
Better Question: What do different types of
troll look like?
 We see two major subcategories!
 Chat Harassers: Higher risk language, young and old accounts
alike.
 Ban Evaders: Younger accounts with low activity and low levels of
verification.
 We need different solutions for different types of abuse
 Mixing quantitative analysis and qualitative assessment allowed
us to update our intuition about trolling…
16/28
[Figure: (Suspended) Account Age, Ban Evader vs Harasser]

Abuse: Impact
NEXT, WE NEED TO
ASK THE RIGHT
QUESTIONS WITH THE
RIGHT TOOLS…
17/28
Measuring impact

Data Science Tools: Questions + Problems
 We want to turn our qualitative user insights into testable hypotheses.
 A/B testing: Causal analysis, but ethical considerations + confusion…
 Better for smaller product iterations or helper tools.
 Quasi-experimental studies: Cheaper, but self-selection effects +
confounding variables everywhere!
 Example: A channel which bans a lot of users may actually be a healthier
channel, since they have a staff of moderators and bots.
18/28

Viewership Impacts?
 Key Question: How do abusive behaviors impact
the health of our community?
 Reduced Broadcaster RETENTION?
 Reduced viewer ENGAGEMENT?
 Lots of 3rd party UX and DS research:
 Pew 2017 Research – Online Harassment
 Riot Games and other industry research
 Talking directly to our viewers and broadcasters
 Tanya DePass: “How to Keep Safe In the
Land of Twitch”
https://www.twitch.tv/videos/174334243
19/28
https://www.polygon.com/2012/10/17/3515178/the-league-of-legends-team-of-scientists-trying-to-cure-toxic

Moderation Workload Impact?
 Key Question: What is it like to actually use our moderation
products?
 How fast can administrators respond to reports?
 How many actions do our human channel moderators need to perform when
they moderate a chat room?
 What are the gaps in the system?
 Start by talking to our user base and performing qualitative studies to
identify these pain points, and then try to study and verify them with our
quantitative data.
20/28

Growth and Moderation Workload
 User complaint:
 As chat gets bigger and faster,
mods have to act faster and on a
larger % of messages
 Very busy chats = have a full
moderation staff, but
moderation efficiency goes
down
 Solution: Build moderation tools
which reduce the amount of work
which our moderators need to do
per message.
21/28
[Figure: Moderation Efficiency vs Conversation Speed. Panels: Mod Actions / Message and Extra Human Mod Staff, each against Chat Messages/Min, from 1 msg per 100 min to 10 msg per second]
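The moderator complaint on this slide can be checked against data by computing mod actions per message for chat windows grouped by speed. The following is a minimal sketch, not Twitch's analysis: the record shape (messages and mod actions per one-minute window), the bucket edges, and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict

def actions_per_message_by_speed(windows, edges=(10, 100, 1000)):
    """Group one-minute chat windows by messages/min and return
    mod actions per message for each speed bucket.

    `windows` is an iterable of (messages, mod_actions) pairs, one per
    minute of chat. The record shape and bucket edges are hypothetical.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [messages, actions]
    for messages, actions in windows:
        if messages <= 0:
            continue  # nothing to moderate in an empty minute
        # First edge the window's speed does not exceed; faster
        # windows fall into the last bucket.
        bucket = next((e for e in edges if messages <= e), edges[-1])
        totals[bucket][0] += messages
        totals[bucket][1] += actions
    return {b: a / m for b, (m, a) in sorted(totals.items())}

# Hypothetical data: (messages in the minute, mod actions in the minute)
sample = [(5, 0), (8, 0), (60, 1), (90, 2), (600, 20), (900, 40)]
rates = actions_per_message_by_speed(sample)
```

With real logs, a rising rate across buckets would support the complaint that faster chats require moderating a larger share of messages.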

Impact Study: Chat Rules
 Intended impact: Get rid of timeouts and bans
caused by misunderstanding of channel rules.
 A/B Test: When entering a channel for the first time,
chatters were shown either the variant or the control:
 Variant: chat rules, click to agree
 Control: no chat rules
 Results: No significant impact on chat participation, and
a statistically significant reduction in timeouts and bans
for the ‘click to agree’ variant!
22/28
GOG.com’s Twitch chat rules
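A result like "statistically significant reduction in timeouts and bans" is commonly assessed with a two-proportion z-test. The sketch below shows that test under stated assumptions: the talk does not say which test was used, and the counts are hypothetical placeholders, not results from the experiment.

```python
import math

def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), using the pooled standard error.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, NOT results from the talk: chatters who received
# a timeout or ban, out of all first-time chatters in each arm.
z, p = two_proportion_z_test(
    events_a=420, n_a=20000,  # control: no chat rules
    events_b=340, n_b=20000,  # variant: click to agree
)
```

The same function could be applied to chat participation counts to check the "no significant impact" half of the result.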

Toxicity: Tools
LET’S USE THESE
LEARNINGS TO BUILD
SOMETHING THAT MAKES
OUR USERS SAFER
23/28
Intervention
Measuring impact

AutoMod
 Data Product Problem: Can we help broadcasters
passively filter hate speech, bullying, and sexual
language they don’t want on their chat?
 Solution: AutoMod - automated filtering of language,
based on topic category and perceived level of risk
 Algorithm designed using a combination of statistical
learning and human qualitative review
24/28

Designing AutoMod
 Start with a pre-trained off-the-shelf ML solution
 Segments and normalizes each chat message.
 Categorizes sentence fragments by risk topic (hate, sex, bullying, etc.) and severity
(high risk, medium risk, etc.)
 Can handle over ten languages, combos of words and emotes, misspellings, and
(important!) attempts to get around the filter.
25/28
Example:
Original: “Omg. You should killll yooorseeeeeefff.”
Parsed: [ omg ] [ {you/he/she} | should | {self harm} ]
Rating: [ omg ] = no risk; [ {you/he/she} | should | {self harm} ] = Bullying – High Risk Level
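To give a feel for the normalization step, here is a minimal sketch that collapses stretched letters and snaps each token to the closest word in a small lexicon. It is an illustration only: the talk does not describe how the off-the-shelf solution normalizes text, and the lexicon, the function names, and the fuzzy-match cutoff are assumptions.

```python
import difflib
import re

# Tiny illustrative lexicon; a real system would use a far larger one.
LEXICON = ["omg", "you", "should", "kill", "yourself"]

def normalize_token(token, lexicon=LEXICON):
    """Collapse stretched letters, then snap to the closest known word."""
    # Runs of 3+ identical characters become 2: "killll" -> "kill".
    squeezed = re.sub(r"(.)\1{2,}", r"\1\1", token)
    if squeezed in lexicon:
        return squeezed
    # Fuzzy match handles leftovers such as "yoorseeff".
    close = difflib.get_close_matches(squeezed, lexicon, n=1, cutoff=0.6)
    return close[0] if close else squeezed

def normalize_message(text):
    """Lowercase, strip punctuation, and normalize each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [normalize_token(t) for t in tokens]

normalized = normalize_message("Omg. You should killll yooorseeeeeefff.")
```

On the slide's example this yields the tokens omg, you, should, kill, yourself, which a later step could group into fragments and rate by topic and severity.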

Designing AutoMod
 Making this work for Twitch:
 Compare, for sentence fragment f, the AutoMod risk rating versus the
likelihood Lf of a user being banned for that fragment.
For fragment ‘f’ (and message counts Ncat):
$$L_f \sim \log \frac{(N_{f,\text{banned}} + 1) / (N_{\text{all},\text{banned}} + 1)}{(N_{f,\text{no ban}} + 1) / (N_{\text{all},\text{no ban}} + 1)}$$
 Use Lf to flag individual expressions which were obvious false
positives or incorrectly rated.
 Choose risk thresholds for our preset options, Rule Levels 1-4
 Get it running in the field
 Initial dry run: DNC/RNC Conventions 2016
 Small closed beta to refine usability and filter accuracy.
26/28
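The ratio Lf on this slide can be computed directly from four message counts. The sketch below assumes a natural logarithm (the slide only writes "log") and reads the counts as messages that were or were not followed by a ban; the function name and the sample counts are hypothetical.

```python
import math

def ban_log_ratio(n_f_banned, n_all_banned, n_f_no_ban, n_all_no_ban):
    """Add-one-smoothed log ratio L_f for a fragment f.

    Compares how often f appears among messages followed by a ban
    with how often it appears among messages not followed by a ban.
    """
    rate_banned = (n_f_banned + 1) / (n_all_banned + 1)
    rate_no_ban = (n_f_no_ban + 1) / (n_all_no_ban + 1)
    return math.log(rate_banned / rate_no_ban)

# Hypothetical counts: f appears in 49 of 9,999 banned messages
# and in 9 of 999,999 messages that did not lead to a ban.
l_f = ban_log_ratio(49, 9_999, 9, 999_999)
```

The +1 terms keep the ratio finite for fragments never seen in one of the groups. A fragment with a high AutoMod risk rating but a low Lf, or the reverse, would be a candidate for manual review.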

Maintaining AutoMod
 Full opt-in launch of AutoMod on Dec 15, 2016
 Improving Accuracy: Use Approve and Deny actions to
determine which AutoMod recommendations our users
agree or disagree with.
 L’f Factor: Surface list of recommended rule changes,
which are then vetted by our admin staff.
 Sep 2017: False positives reduced by 33% since launch!
 25% of all chat messages go through AutoMod
 Continue to develop based on performance and user
feedback...
27/28
For fragment ‘f’ (and total unique channels Ccat):
$$L'_f \sim \log \frac{(C_{f,\text{denied}} + 1) / (C_{\text{all},\text{denied}} + 1)}{(C_{f,\text{approved}} + 1) / (C_{\text{all},\text{approved}} + 1)}$$
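The L’f factor counts unique channels rather than messages, so one very active channel cannot dominate the score. A minimal sketch under assumptions: the (channel, fragment, action) record shape, the action labels, and the sample events are hypothetical, and the natural logarithm is assumed.

```python
import math

def channel_log_ratio(events, fragment):
    """Compute L'_f from moderator decisions on AutoMod-held messages.

    `events` is an iterable of (channel, fragment, action) records,
    where action is "denied" or "approved". Channels are de-duplicated
    so each channel counts once per category.
    """
    denied_all, approved_all = set(), set()
    denied_f, approved_f = set(), set()
    for channel, frag, action in events:
        if action == "denied":
            denied_all.add(channel)
            if frag == fragment:
                denied_f.add(channel)
        elif action == "approved":
            approved_all.add(channel)
            if frag == fragment:
                approved_f.add(channel)
    denied_rate = (len(denied_f) + 1) / (len(denied_all) + 1)
    approved_rate = (len(approved_f) + 1) / (len(approved_all) + 1)
    return math.log(denied_rate / approved_rate)

# Hypothetical moderator decisions.
events = [
    ("a", "gg ez", "denied"), ("b", "gg ez", "denied"),
    ("b", "gg ez", "denied"), ("c", "noob", "denied"),
    ("a", "noob", "approved"), ("d", "noob", "approved"),
    ("e", "gg ez", "approved"),
]
score_gg = channel_log_ratio(events, "gg ez")
score_noob = channel_log_ratio(events, "noob")
```

Under this reading, a strongly negative score means channels mostly approve messages containing the fragment, which suggests AutoMod rates it too harshly; such fragments would go on the list of recommended rule changes for admin review.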

Conclusions
 Our Punchline: Neither quantitative analysis nor qualitative research alone can capture
exactly what’s happening with safety in our products and community.
 Combine data science with qualitative learnings from our UX team, our admins, and
from talking to our viewers and broadcasters for better decisions
 Where we apply this:
 Extent: Figure out what signal your data is giving you about safety.
 Impact: What are the right questions we should be asking, and using what tools and
metrics?
 Tools: Using these data and questions, we can craft powerful tools for safety!
28/28

29
‘Kappa - Bob Ross Portrait’
By: twitch.tv/sohlol

Twitch TOS – Relevant Sections
 9. Prohibited Conduct
 You agree that you will comply with these Terms of Service and Twitch’s Community
Guidelines and will not:
 i. create, upload, transmit, distribute, or store any content that is inaccurate, unlawful,
infringing, defamatory, obscene, pornographic, invasive of privacy or publicity rights,
harassing, threatening, abusive, inflammatory, or otherwise objectionable;
 ii. impersonate any person or entity, falsely claim an affiliation with any person or entity, or
access the Twitch Services accounts of others without permission, forge another person’s
digital signature, misrepresent the source, identity, or content of information transmitted via
the Twitch Services, or perform any other similar fraudulent activity;
 v. defame, harass, abuse, threaten or defraud users of the Twitch Services, or collect, or
attempt to collect, personal information about users or third parties without their consent;
30
