Studying online distribution platforms
for games through the mining of data
from the Steam platform
Ph.D. Thesis Defense
Dayi Lin
Supervisor: Prof. Ahmed E. Hassan
PC Gaming is a large market with picky users
$135 billion market
1.3 million concurrent users
2
Gamers are extremely hard to please
Game industry is shifting from offline
distribution to online distribution
3
Traditional Software Development Life Cycle →
Faster & iterative Software Development Life Cycle
Literature review: online distribution and
software engineering practices of games
4
20 major journals & conferences
in software engineering and games
Include relevant citations
not published in target venues
Check titles and abstracts
for each publication in the last 10 years
Prior research has focused on the mining
of mobile app stores
5
(Martin et al., 2017)
Prior research has focused on the mining
of mobile app stores
6
Reviews
(Vasa et al. 2012,
Hoon et al. 2012)
Updates
(Martin et al. 2016,
McIlroy et al. 2016,
Hassan et al. 2017)
Description
(Liu et al. 2017)
User behavior
(Liu et al. 2017)
Ratings
(Tian et al. 2015,
Bavota et al. 2015,
Ruiz et al. 2016)
Results from prior studies have provided
developers with valuable insights
7
Emerging issues
Sentiment analysis
Feature prioritization
Release planning
Cross platform issues
…
Developing games is different from
developing non-game software
8
≠≠
(Pascarella et al., 2018)
Desktop
Several prior studies looked at software
engineering practices for games
9
Requirement
(Daneva 2017)
SDLC
(Politowski et al. 2016,
Kasurinen et al. 2017)
Testing
(Varvaressos et al. 2017,
Becares et al. 2017)
Data mining for
individual games
(Zimmermann et al. 2012,
Huang et al. 2013)
Architecture
(Olsson et al. 2015)
Challenges
(Lewis et al. 2010,
Hall et al. 2011,
Washburn et al. 2016)
Most prior work on gaming platforms
focuses on social network analysis
10
(Becker et al. 2012; Blackburn et al. 2014; Sifa et al. 2015)
To fill in the gap and provide practical
suggestions to game developers
11
184M +
Active Users
10K +
Games Available
82K +
Years of Playing
Data Source: Steam Spy, Steam Store and Steam Community, up to Sep 13, 2016
12
Developers Players
We studied 4 aspects of the online distribution
for games, by mining the Steam platform
Urgent
Updates
Early
Access
User
Reviews
Gameplay
Videos
Our studies were covered by
prestigious media outlets in the gaming industry
13
14
Understanding urgent updates of games
Developers Players
Urgent
Updates
Early
Access
User
Reviews
Gameplay
Videos
Urgent updates: fire-fighting conditions
15
R1 R2 R3 R4
……
Inefficiency New problems
Urgent Updates
The 50 most popular Steam games
have ~80% of the total concurrent users
Ranked by peak daily concurrent players on Jan 12, 2016. Data source: SteamSpy
16
Urgent Updates
R1 R2
P2
R4
……
0-day Update
R3
R1 R2 P2
…
Self-admitted Hotfix
Patch #14 HOTfix 1. -Addressed a balance-breaking issue with one of the
weapons, which was caused by the new itemization system.
R1 R2 P2
…
Faster Off-cycle Updates
1.1.30 Update (PC)
Fixes crash related to xaudio drivers.
Three types of urgent updates
17
Urgent Updates
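The first two update types on this slide can be sketched as a simple labelling pass over a game's update notes. This is a minimal sketch under stated assumptions: a 0-day update is taken to be an update released on the same date as the preceding one, and a self-admitted hotfix is taken to be any note that contains the word "hotfix". Both rules are assumptions for illustration, as are the function name and the sample notes; faster off-cycle updates are not covered.

```python
from datetime import date


def label_urgent_updates(updates):
    """Label each update in chronological order.

    updates: list of (release_date, note_text) tuples.
    Returns a list of label sets, one per update.
    """
    ordered = sorted(updates, key=lambda u: u[0])  # stable sort keeps same-day order
    labels = []
    previous_date = None
    for release_date, note in ordered:
        tags = set()
        # Assumed rule: same release date as the preceding update.
        if previous_date is not None and release_date == previous_date:
            tags.add("0-day")
        # Assumed rule: the developers call the update a hotfix themselves.
        if "hotfix" in note.lower():
            tags.add("self-admitted hotfix")
        labels.append(tags)
        previous_date = release_date
    return labels


# Hypothetical dates; the note texts echo the examples on the slide.
example_updates = [
    (date(2016, 1, 10), "Regular content update"),
    (date(2016, 1, 10), "Patch #14 HOTfix 1"),
    (date(2016, 1, 20), "1.1.30 Update (PC)"),
]
print(label_urgent_updates(example_updates))
```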
18
Preliminary study of the
update cycle of games
Steam 11,790
News Updates
2,672
Update Notes
Title _________
Date _________
Content _________
_________________
_________________
Update Note
Update frequency
Update consistency
Update strategy
Preliminary study
Urgent Updates
19
81% of the studied games have periods in
which they release faster than once a week
6 months 6 weeks 7 days
Non-game Game
Urgent Updates
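One way to check whether a game has a period of faster-than-weekly releases is to look at the gaps between consecutive update dates. The sketch below assumes that "faster than once a week" means at least one pair of consecutive updates fewer than 7 days apart; the function names and the sample data are hypothetical.

```python
from datetime import date


def has_fast_release_period(update_dates, max_gap_days=7):
    """True if any two consecutive updates are fewer than max_gap_days apart."""
    ordered = sorted(update_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return any(gap < max_gap_days for gap in gaps)


def share_with_fast_periods(games):
    """games: dict mapping a game name to its list of update dates."""
    if not games:
        return 0.0
    fast = sum(has_fast_release_period(dates) for dates in games.values())
    return fast / len(games)


# Hypothetical data for illustration only.
example_games = {
    "game_a": [date(2016, 1, 1), date(2016, 1, 4), date(2016, 2, 1)],
    "game_b": [date(2016, 1, 1), date(2016, 2, 1), date(2016, 3, 1)],
}
print(share_with_fast_periods(example_games))  # 0.5
```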
20
Games from the same developer follow
the same update strategy
R1 R2 R3 R4
……
Build-up Candidate Update Strategy (68%)
R1 R2 R3 R5
……
Frequent Update Strategy (32%)
R4
Urgent Updates
21
Games that use a frequent update strategy
tend to have more 0-day updates
[Figure: % of 0-day updates (axis 0 to 30) for games using the frequent update vs. build-up candidate strategy]
Souza et al.: releasing frequently leads to a higher proportion
of patches that must be reverted.
Urgent Updates
22
Reasons for Releasing Urgent Updates
Title _________
Date _________
Content _________
_________________
_________________
Update Note
Open Coding
11 Reasons for
releasing urgent
updates
Urgent Updates
23
11 reasons for urgent updates
Frequent update            Build-up candidate
Reason            %        Reason            %
Functionality     61       Functionality     72
CrashingGame      39       RuleChange        38
Visual            31       CrashingGame      35
UserInteraction   26       Visual            35
RuleChange        21       RuleLoophole      35
RuleLoophole      20       UserInteraction   29
Performance       20       Performance       26
Content           18       Content           23
Sound             10       Sound             14
Localization       3       Localization       5
Security           2       Security           5
Urgent Updates
24
Build-up candidate: more RuleChange
Frequent update            Build-up candidate
Reason            %        Reason            %
Functionality     61       Functionality     72
CrashingGame      39       RuleChange        38
Visual            31       CrashingGame      35
UserInteraction   26       Visual            35
RuleChange        21       RuleLoophole      35
RuleLoophole      20       UserInteraction   29
Performance       20       Performance       26
Content           18       Content           23
Sound             10       Sound             14
Localization       3       Localization       5
Security           2       Security           5
“An imbalance in the rules of a game is a type of issue that
requires an immediate fix” -- League of Legends
Urgent Updates
25
Take-home: choice of update strategy is
associated with urgent updates
(Empirical Software Engineering, 22(4):2095-2126, 2017)
Urgent Updates
26
Understanding the early access model
Developers Players
Urgent
Updates
Early
Access
User
Reviews
Gameplay
Videos
27
Steam and Early Access
Early connection
with community
Feedback & bugs
1st to play
Get involved as
game evolves
As a developer As a gamer
Playable
Version
Early Access
Stage
Full
Release
Early Access
28
• 400,000 sales in
the first week
• Ranks top 50 for
# of players
• 77% negative
reviews
• 12 employees
laid off
The debate of whether Early Access is a
promising development model has been raised
Early Access
We studied all 1,182 Early Access Games
(EAGs) on the Steam platform
29
1,182
Sales Info
1,564,574
User Reviews
801,128
Forum Posts
16,780
Release Notes
Characteristics
Interaction
Tolerance
What are the characteristics of
the EA model?
How do developers and players of EAGs
interact with the Steam platform?
How tolerant are players of the
quality of EAGs?
Early Access
[Figure: chart with x-axis from Mar 2013 to Jan 2016 and y-axis from 0 to 70]
The popularity of the Early Access model on
the Steam platform is growing
30
2013: 64 EAGs
2015: 434 EAGs
580% increase!
300% increase for all games
Characteristics - Interaction - Tolerance
# of games
Early Access
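The growth figure on this slide follows from the two yearly counts. A quick check of the arithmetic, using the standard percent-increase formula (the function name is only for illustration):

```python
def percent_increase(old, new):
    """Relative growth from old to new, expressed in percent."""
    return (new - old) / old * 100


# 64 EAGs in 2013 and 434 EAGs in 2015, as stated on the slide.
growth = percent_increase(64, 434)
print(round(growth))  # 578, which the slide rounds to 580%
```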
[Figure: current vs. former EAGs; x-axis from Mar 2013 to Jan 2016, y-axis from 0 to 70]
34% of all Early Access Games have left the
Early Access stage
31
Characteristics - Interaction - Tolerance
# of games
Only 50% have
left the Early
Access stage
Early Access
Early Access model is mostly used by
individual developers or small studios
32
Characteristics - Interaction - Tolerance
88%
EAGs are labelled
by developers as
“Indie”
56%
Non-EAGs are labelled
by developers as
“Indie”
Early Access
EAGs receive a lower review rate but a higher
discussion participation rate during EA stage
33
Higher activity on the
discussion forums
for 81% of the EAGs
Lower activity of
owners posting reviews
for 65% of the EAGs
Characteristics - Interaction - Tolerance
For each EAG, we compare the period of time
during and after its EA stage
Early Access
EAGs receive a lower review rate but a higher
discussion participation rate
34
Higher activity on the
discussion forums
Lower activity of
owners posting reviews
PRO
CON
- Lower probability of negative reviews
- More concrete issues
Characteristics - Interaction - Tolerance
- Difficult to quantify sentiment
Double-edged
Early Access
Reviews tend to be more positive during the
Early Access stage
35
Characteristics - Interaction - Tolerance
Higher positive review rate during EA stage for 88% of the EAGs
• Not correlated with length or update frequency during EA stage
Early Access
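The per-game comparison behind this slide can be sketched as follows: for each EAG, compute the share of positive reviews during and after the EA stage, then count the games where the rate is higher during EA. The data layout, the function names, and the choice to skip games with no reviews in either period are assumptions for illustration.

```python
def positive_rate(reviews):
    """reviews: list of booleans (True = positive). Returns None if empty."""
    if not reviews:
        return None
    return sum(reviews) / len(reviews)


def share_more_positive_during_ea(games):
    """games: dict mapping a game name to (reviews_during_ea, reviews_after_ea).

    Returns the share of comparable games whose positive review rate is
    higher during the EA stage, or None if no game is comparable.
    """
    comparable = 0
    higher = 0
    for during, after in games.values():
        rate_during = positive_rate(during)
        rate_after = positive_rate(after)
        if rate_during is None or rate_after is None:
            continue  # assumed rule: skip games lacking reviews in either period
        comparable += 1
        if rate_during > rate_after:
            higher += 1
    return higher / comparable if comparable else None


# Hypothetical review data for illustration only.
example = {
    "game_a": ([True, True, False], [True, False]),
    "game_b": ([True, False], [True, True]),
    "game_c": ([True], []),
}
print(share_more_positive_during_ea(example))  # 0.5
```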
36
Take-home: use the early access model for
better reputation, not as a funding source.
(Empirical Software Engineering, 23(2):771-799, 2018)
Early Access
37
Understanding user reviews for games
Developers Players
Urgent
Updates
Early
Access
User
Reviews
Gameplay
Videos
Prior work on mobile app reviews has
shown the value of studying reviews
38
User Reviews
Emerging issues
Sentiment analysis
Feature prioritization
Release planning
Cross platform issues
…
Do game reviews share similar
characteristics with mobile app reviews?
39
User Reviews
We analyzed 10M reviews on the Steam
platform along four dimensions
40
Positive vs. Negative
EAG vs. Non-EAG
Indie vs. Non-Indie
Free vs. Paid
User Reviews
Preliminary Study: Characteristics of
Game Reviews
41
Steam Summer Sales 2014
How much effort
to read through
reviews?
What triggers
the growth of
# of reviews?
User Reviews
42
[Figure: median number of characters per review (axis ticks 10, 100, 1000), free-to-play vs. non-free-to-play games]
Most games receive reviews with a median
length of 205 characters, or 30 words
User Reviews
64% of reviews are in English
with a grade 8 readability
43
[Figure: ratio of reviews in each language (axis 0.0 to 1.0); languages shown: English, Russian, German, French, Spanish, Brazilian, Polish, Simplified Chinese, Italian, Korean]
User Reviews
A sale event is more strongly associated with an
increase in reviews than a new update
44
Mixed-
effect
model
User Reviews
7 date
level metrics
3 game
level metrics
# of reviews
per day
How long do players play a game before
posting a review?
45
0.1 1 10 100 1000
Hours
Negative reviews
Positive reviews
15.5 hrs
6.6 hrs
Gamers play a game for
a median of 13.5 hours
before posting a review
User Reviews
A peak in the number of reviews of free-to-play
games is observed after one hour of playing
46
“First hour
experience”
User Reviews
What are gamers talking about in reviews?
47
96 positive
96 negative
96 indie
96 non-indie
96 EAG
96 non-EAG
96 free
96 paid
472 reviews
(95% confidence level,
5% confidence interval)
User Reviews
42% of the reviews provide valuable
information to developers
48
Category       % of all reviews   % of positive reviews   % of negative reviews
Not helpful    71                 71                      55
Pro            38                 46                      18
Con            34                 29                      57
Bug             8                  7                      17
Suggestions     4                  4                       2
Video           1                  1                       1
User Reviews
42% of the reviews provide valuable
information to developers
49
Category       % of all reviews   % of positive reviews   % of negative reviews
Not helpful    71                 71                      55
Pro            38                 46                      18
Con            34                 29                      57
Bug             8                  7                      17
Suggestions     4                  4                       2
Video           1                  1                       1
Players value well-designed gameplay over software quality
User Reviews
50
Take-home: positive reviews and playing
hours should not be ignored by researchers
(Empirical Software Engineering, in press, 2018)
User Reviews
51
Understanding and leveraging
gameplay videos
Developers Players
Urgent
Updates
Early
Access
User
Reviews
Gameplay
Videos
Collecting bug reports is hard to automate
52
Gameplay Videos
• Summary
• Steps to reproduce
• Expected results
• Actual results
• Attachments
Most important
but most difficult
Bug report
Collecting bug reports is hard to automate
53
Gameplay Videos
• Summary
• Steps to reproduce
• Expected results
• Actual results
• Attachments
Bug report
General functionality
bugs of games “may only
be visible in movies”
Gameplay videos about bugs are popular
54
7,979 videos Bug report
?
Gameplay Videos
Preliminary study: do naïve approaches
work for identifying bug videos?
55
Two naïve approaches:
Watch through
all videos
Keyword
search
                           Predicted
                           Bug video        Not a bug video
Actual   Bug video         True Positive    False Negative
         Not a bug video   False Positive   True Negative
Gameplay Videos
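The confusion matrix above is what precision and recall are computed from. A minimal sketch using the standard definitions; the counts are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def precision(true_positives, false_positives):
    """Share of predicted bug videos that really are bug videos."""
    predicted_positive = true_positives + false_positives
    return true_positives / predicted_positive if predicted_positive else 0.0


def recall(true_positives, false_negatives):
    """Share of actual bug videos that were predicted as bug videos."""
    actual_positive = true_positives + false_negatives
    return true_positives / actual_positive if actual_positive else 0.0


# Hypothetical counts, for illustration only (not taken from the study).
tp, fp, fn = 30, 10, 20
print(precision(tp, fp))  # 0.75
print(recall(tp, fn))     # 0.6
```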
Developers can receive a median of up to
13 hours of gameplay videos per day
56
Two naïve approaches:
Watch through
all videos
Keyword
search 3.03 years accumulated
13 hours on median per day
Counter-Strike: Global Offensive
Gameplay Videos
Naïve keyword matching approach
has a precision of 56.25%
57
Two naïve approaches:
Watch through
all videos
Keyword
search
False Positives:
Advertising other
bug videos
Stuffing irrelevant
keywords
Gameplay Videos
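A naïve keyword search of the kind evaluated here might look like the sketch below. The keyword list is an assumption: only "hack" is mentioned in the slides, and "bug" and "glitch" are added for illustration. The second example title shows how a video that merely advertises other bug videos still matches, which is one of the false-positive causes listed above.

```python
import re

# Assumed keyword list; only "hack" appears in the slides.
KEYWORDS = ("bug", "glitch", "hack")

# Whole-word match, optionally followed by a plural "s".
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")s?\b", re.IGNORECASE
)


def looks_like_bug_video(title):
    """Flag a video as a bug video if its title contains any keyword."""
    return PATTERN.search(title) is not None


# Hypothetical titles for illustration only.
print(looks_like_bug_video("Funny wall glitch in the new map"))  # True
print(looks_like_bug_video("Subscribe for more bug videos!"))    # True (false positive)
print(looks_like_bug_video("Top 10 plays of the week"))          # False
```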
Determining the likelihood that a gameplay
video is showcasing a game bug
58
Random forest
classifier
7 metadata
of gameplay
videos
10 Steam
games
4 non-
Steam
games
Gameplay Videos
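The seven metadata inputs to the classifier correspond to the features named on the variable-importance slide (keyword matches in title, description, and tags; title and description length; number of tags; video length). A minimal sketch of extracting them from one video record follows. The record layout, the feature names, the keyword list, and the use of character counts for text length are all assumptions; training the random forest itself is not shown.

```python
import re

# Assumed keyword list, for illustration only.
KEYWORDS = ("bug", "glitch")


def count_keyword_matches(text, keywords=KEYWORDS):
    """Count tokens in text that exactly equal one of the keywords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(token in keywords for token in tokens)


def extract_features(video, keywords=KEYWORDS):
    """video: dict with 'title', 'description', 'tags' (list), 'length_seconds'."""
    tags_text = " ".join(video["tags"])
    return {
        "title_keyword_matches": count_keyword_matches(video["title"], keywords),
        "description_keyword_matches": count_keyword_matches(video["description"], keywords),
        "tags_keyword_matches": count_keyword_matches(tags_text, keywords),
        "title_length": len(video["title"]),              # characters (assumed unit)
        "description_length": len(video["description"]),  # characters (assumed unit)
        "num_tags": len(video["tags"]),
        "video_length": video["length_seconds"],
    }


# Hypothetical video record for illustration only.
example_video = {
    "title": "Crazy physics bug",
    "description": "Found this glitch on the new map. Another bug at the end.",
    "tags": ["bug", "funny"],
    "length_seconds": 95,
}
print(extract_features(example_video))
```

Feature dictionaries like this one could then be turned into rows of a feature matrix for any off-the-shelf random forest implementation.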
Our classifier achieves a mean average
precision @ 10 and @ 100 of 0.91
59
43% higher precision
than naïve keyword matching approach
Avoid 23% false positives
if not including “hack”
(Counter-Strike community slang)
Gameplay Videos
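Mean average precision @ k scores how well the classifier ranks bug videos near the top of each game's list. The sketch below uses one common definition, in which the average precision for a ranking is normalized by the number of relevant items found in the top k; the slides do not say which variant was used, so treat this as an assumption. The rankings are hypothetical.

```python
def average_precision_at_k(ranked_labels, k):
    """ranked_labels: booleans ordered by descending classifier score
    (True = the video really is a bug video)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed normalization: number of relevant items within the top k.
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(rankings, k):
    """rankings: one ranked label list per game."""
    if not rankings:
        return 0.0
    return sum(average_precision_at_k(r, k) for r in rankings) / len(rankings)


# Hypothetical rankings for two games, for illustration only.
example_rankings = [
    [True, False, True],  # AP@3 = (1/1 + 2/3) / 2
    [False, True],        # AP@3 = (1/2) / 1
]
print(round(mean_average_precision_at_k(example_rankings, 3), 3))
```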
Most important factors: one keyword in the
title, shorter description with fewer keywords
60
[Figure: variable importance (Mean Decrease Accuracy, axis 5 to 25); features in slide order: # of keyword matches in YouTube title; YouTube description length; # of keyword matches in YouTube description; video length; # of YouTube tags; # of keyword matches in YouTube tags; YouTube title length]
Gameplay Videos
Take-home: it is practical to automatically
identify bug videos with high precision
61
(Empirical Software Engineering, under review)
Gameplay Videos