Linguistic Analysis of Toxic Behavior in an Online Video Game
Haewoon Kwak* Jeremy Blackburn+
*Qatar Computing Research Institute
+Telefonica Research
EGG (Exploration on Games and Gamers) workshop
November 10th 2014
Cyberbullying
is the use of information technology to
repeatedly harm or harass other people in
a deliberate manner
http://en.wikipedia.org/wiki/Cyberbullying
Easy solution: player reports
Additional effort is required to check whether
the report is true or not
Data collected from three servers
                   EUW         NA          KR          All
Reported players   649,419     590,311     220,614     1,460,344
Matches            2,841,906   2,107,522   1,066,618   6,016,046
Player reports     5,559,968   3,441,557   1,893,433   10,898,958
* The KR Tribunal started in November 2012, while the other two Tribunals started in May 2011.
Our previous work on this dataset
• “STFU NOOB! Predicting crowdsourced decisions on
toxic behavior in online games” (WWW’14)
• “Exploring Cyberbullying and Other Toxic Behavior in
Team Competition Online Games” (under review)
Longer messages of toxic players
• Toxic players use 3.139 uni-grams per message, versus 2.732
uni-grams for typical players.
• However, typical players send 38% more messages per game
than toxic players.
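The two statistics on this slide (uni-grams per message and messages per game) can be computed along the following lines. This is a minimal sketch, not the authors' code: the `(game_id, text)` input format, whitespace tokenization, and the toy chat lines are all assumptions made for illustration.

```python
from collections import defaultdict

def chat_stats(messages):
    """Return (mean uni-grams per message, mean messages per game).

    `messages` is a non-empty iterable of (game_id, text) pairs for one
    player group. Whitespace tokenization is an assumption of this sketch.
    """
    token_total = 0
    message_total = 0
    per_game = defaultdict(int)
    for game_id, text in messages:
        token_total += len(text.split())
        message_total += 1
        per_game[game_id] += 1
    return token_total / message_total, message_total / len(per_game)

# Made-up toy data, only to show the call shape.
toxic = [("g1", "stfu noob team"), ("g2", "report this jungler now")]
typical = [("g1", "gj"), ("g1", "ward bot please"), ("g2", "gg wp"), ("g2", "nice")]

toxic_stats = chat_stats(toxic)
typical_stats = chat_stats(typical)
```

In the toy data the toxic group writes longer messages while the typical group writes more of them, which mirrors the direction of the reported result but says nothing about its magnitude.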
Top 1,000 uni- and bi-grams of typical and toxic players
• 867 uni-grams and 748 bi-grams are shared by both groups!
• The remaining (1000-867) = 133 uni-grams and (1000-748) = 252
bi-grams appear only in the toxic players' top 1,000
• We call them toxic uni- and bi-grams.
• So, what are the toxic uni- and bi-grams?
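The selection of toxic uni- and bi-grams described above amounts to a set difference between two top-k lists. The sketch below illustrates that idea under stated assumptions: the helper names (`ngrams`, `top_k`, `toxic_only`), lowercasing, and whitespace tokenization are hypothetical choices, not details taken from the slides.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_k(messages, n, k):
    """Return the set of the k most frequent n-grams across all messages."""
    counts = Counter()
    for text in messages:
        counts.update(ngrams(text.lower().split(), n))
    return {gram for gram, _ in counts.most_common(k)}

def toxic_only(toxic_msgs, typical_msgs, n, k=1000):
    """n-grams in the toxic top-k that are absent from the typical top-k."""
    return top_k(toxic_msgs, n, k) - top_k(typical_msgs, n, k)

# Made-up toy chat lines.
toxic_msgs = ["noob team", "gg noob"]
typical_msgs = ["gg wp", "gg team"]

toxic_unigrams = toxic_only(toxic_msgs, typical_msgs, n=1, k=10)
toxic_bigrams = toxic_only(toxic_msgs, typical_msgs, n=2, k=10)
```

With k = 1000 on the real data, the sizes of the two returned sets would correspond to the 133 and 252 figures on the slide.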
Three different temporal patterns of bi-grams
(based on when each bi-gram's usage peaks)
[Figure: three panels, one each for early-bigrams, mid-bigrams, and late-bigrams]
80% of toxic bigrams are late-bigrams
• Most chat-related toxic play occurs in the late stage
of the game!
• Verbal abuse is most likely a response to losing a game
• Manual inspection of bi-grams containing ‘bot’ shows that
• early-bigrams are non-aggressive,
• mid-bigrams are cursing,
• and late-bigrams are blaming.
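Labelling a bi-gram as early, mid, or late by the position of its usage peak could look like the sketch below. The slides do not give the cut-off points, so the equal-thirds thresholds (`early_end`, `late_start`) are purely an assumption, as are the function names and the toy usage curves.

```python
def peak_time(usage):
    """Index of the largest count; ties resolve to the earliest index."""
    return max(range(len(usage)), key=usage.__getitem__)

def classify(usage, early_end=1 / 3, late_start=2 / 3):
    """Label a bi-gram by where its usage peaks in normalized match time.

    `usage` holds counts per time step and needs at least two steps.
    The thresholds are hypothetical defaults, not values from the slides.
    """
    position = peak_time(usage) / (len(usage) - 1)
    if position < early_end:
        return "early"
    if position < late_start:
        return "mid"
    return "late"

# Made-up usage curves over seven time steps.
labels = [
    classify([5, 1, 0, 0, 0, 0, 0]),
    classify([0, 0, 0, 4, 0, 0, 0]),
    classify([0, 0, 0, 0, 0, 1, 9]),
]
```

Counting how many toxic bi-grams receive the "late" label is then the quantity behind the 80% figure.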
Different temporal patterns of common words
• For toxic and typical players, we extract the top 30 uni-grams
at each normalized time step (0-100)
• We get 80 unique uni-grams for toxic players and 91 unique
uni-grams for typical players (the top 30 uni-grams are stable)
• We compute the normalized time of last use by toxic
players and typical players, respectively.
• Finally, for the uni-grams common to both groups, we compute
the difference in last-use time between toxic and typical players.
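The last two steps (last-use time per group, then the difference for common uni-grams) can be sketched as follows. This assumes each group's chat is already reduced to `(time_step, unigram)` events with time normalized to 0-100, and it pools all events into one stream per group; how the original analysis aggregates across matches is not stated on the slide, so that part is an assumption.

```python
def last_use(events):
    """Map each uni-gram to the latest normalized time at which it appears."""
    last = {}
    for time_step, gram in events:
        last[gram] = max(time_step, last.get(gram, time_step))
    return last

def last_use_gap(toxic_events, typical_events):
    """Toxic minus typical last-use time, for uni-grams used by both groups.

    A negative value means toxic players stop using the word earlier.
    """
    toxic_last = last_use(toxic_events)
    typical_last = last_use(typical_events)
    common = toxic_last.keys() & typical_last.keys()
    return {gram: toxic_last[gram] - typical_last[gram] for gram in common}

# Made-up events: (normalized time step, uni-gram).
toxic_events = [(5, "gj"), (20, "gj"), (90, "noob")]
typical_events = [(10, "gj"), (95, "gj"), (50, "ward")]

gap = last_use_gap(toxic_events, typical_events)
```

In the toy data, "gj" has a gap of -75: the toxic group last uses it at step 20, the typical group at step 95.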
Toxic players behave the same as typical
players during the early stage of the match!
• At some point they change their behavior, like a phase
transition
• They utter neither apologies nor praise, … etc.
• They stop strategic communication
✓ We show not just how toxic players differ, but
also when they become different
Summary - our findings
• Chat is not uniform over the course of a match
• Discriminative uni- and bi-grams serve as signatures of
typical and toxic players
• Most toxic bi-grams are found at the end of the game
• Toxic players express their toxicity at some point, while
they behave the same as typical players in the early game
• Implication: these findings can help to develop a pre-warning
system to detect toxic playing