Expressive Intelligence Studio
Mining the Madden Experience
Applying Machine Learning to Telemetry
Ben Weber
UC Santa Cruz
bweber@soe.ucsc.edu
Michael John
Electronic Arts
mjohn@ea.com
Madden NFL 11
Madden 2011 Questions
• What gameplay features impact player retention?
• What are optimal win rates for retention?
Our Problem
• How do we identify the relation between gameplay features and retention?
[Diagram: Gameplay Features → ? ? ? → Player Retention]
Our Solution
• Use machine learning to build models of player behavior
• Analyze generated models to identify influential gameplay elements
What is Machine Learning?
• Machine Learning (ML) is a branch of AI that uses algorithms to extract patterns from empirical data
• ML is widely used for prediction and forecasting
What is a Model?
• A function that maps input variables to a predicted value
• Regression models predict a continuous value
• Different ML algorithms generate different types of models
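A minimal sketch of "a function that maps input variables to a predicted value", using ordinary least squares on a single input. Linear regression is only a stand-in here: the slides do not say which algorithm was used, and the function name `fit_linear_model` and the toy data are invented for illustration.

```python
from statistics import mean


def fit_linear_model(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    def model(x):
        # The "model" is just a function: input value -> predicted value.
        return slope * x + intercept

    return model


# Made-up training data: one feature value per player, and games played.
feature_values = [0.1, 0.3, 0.5, 0.7, 0.9]
games_played = [12.0, 16.0, 20.0, 24.0, 28.0]

model = fit_linear_model(feature_values, games_played)
print(model(0.6))  # a continuous prediction for an unseen input
```

A different learning algorithm would return a different kind of function (a tree, a network, and so on), but it would be called the same way.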
What Can a Model Tell Us?
• Model analysis can identify the most influential gameplay features
[Diagram labels: Analyst, Feature Tweaking, Testing Data, Model, Predictions]
How We Applied ML
[Diagram labels: Madden Players, Training Data, ML Algorithms, Models, Testing Data, Feature Tweaking, Analyst, Predicted number of games played]
Our Workflow
[Diagram: Madden Gamecast data → Java Parser (ETL) → Weka]
Madden 2011 Gamecast Dataset
• Gamecast telemetry
  - Play-by-play summaries
  - Xbox 360 players
  - August 10th – November 1st
  - 350 GB
• Sampled 25,000 players
Extract-Transform-Load (ETL)
• Parse play-by-play data
• Convert to feature vector representation
• Export to ARFF format
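The three ETL steps above can be sketched as follows. The original parser was written in Java; this Python version is only an illustration. The comma-separated record layout, the two feature definitions (`run_ratio`, `play_diversity`), and all function names are assumptions, since the slides do not show the Gamecast format or define the 46 features. The ARFF header keywords (`@relation`, `@attribute ... numeric`, `@data`) are the standard ones read by Weka.

```python
from collections import defaultdict


def parse_play(line):
    """Parse: split one raw record into (player_id, play_type, play_name).

    The 'player,type,name' layout is hypothetical.
    """
    player_id, play_type, play_name = line.strip().split(",")
    return player_id, play_type, play_name


def encode_features(plays):
    """Transform: aggregate plays into one feature vector per player."""
    totals = defaultdict(lambda: {"plays": 0, "runs": 0, "names": set()})
    for player_id, play_type, play_name in plays:
        stats = totals[player_id]
        stats["plays"] += 1
        if play_type == "run":
            stats["runs"] += 1
        stats["names"].add(play_name)
    # Illustrative definitions only: share of running plays, and
    # distinct plays called divided by total plays.
    return {
        pid: [s["runs"] / s["plays"], len(s["names"]) / s["plays"]]
        for pid, s in totals.items()
    }


def to_arff(relation, attribute_names, rows):
    """Load: render numeric feature vectors as ARFF text."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attribute_names]
    lines += ["", "@data"]
    lines += [",".join(f"{value:g}" for value in row) for row in rows]
    return "\n".join(lines) + "\n"


# Made-up input records.
raw_lines = ["p1,run,HB Dive", "p1,pass,Slants", "p1,run,HB Dive", "p2,pass,Slants"]
vectors = encode_features(parse_play(line) for line in raw_lines)
rows = [vectors[pid] for pid in sorted(vectors)]
arff_text = to_arff("madden_sample", ["run_ratio", "play_diversity"], rows)
print(arff_text)
```

Keeping the three stages as separate functions mirrors the parse, encode, export split named on the slide, so each stage can be swapped out independently.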
ETL Workflow
[Diagram labels: Madden Gamecast data, Parser (Java), Play-by-Play Data, User DB, Feature Encoder (Java), ARFF Files]
Gameplay Features
• Each player’s behavior is encoded as the following features (46 total):
• Game modes
  - Usage
  - Win rates
• Performance metrics
  - Turnovers
  - Gain
• End conditions
  - Completions
  - Peer quits
• Feature usage
  - Gameflow
  - Scouting
  - Audibles
  - Special moves
• Play Preference
  - Running
  - Play Diversity
Weka Toolkit
Predicting the Number of Games Played
[Figure: Actual Games Played plotted against Predicted Games Played; both axes range from 0 to 250]
Correlation Coefficient: 0.88
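For readers unfamiliar with the metric, here is a minimal sketch of how a correlation coefficient between actual and predicted values can be computed. It assumes the reported figure is a Pearson correlation, which the slides do not state explicitly; the sample numbers are invented and are not taken from the Madden dataset.

```python
from math import sqrt


def pearson_correlation(actual, predicted):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    covariance = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance / sqrt(var_a * var_p)


# Made-up games-played counts for four players.
actual_games = [10, 50, 120, 200]
predicted_games = [20, 45, 100, 210]
r = pearson_correlation(actual_games, predicted_games)
print(round(r, 3))
```

A value near 1 means the predictions rise and fall with the actual counts; it does not by itself say how large the individual errors are.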
Feature Impact on Number of Games Played
• How does tweaking a single feature impact retention?
[Figure: Predicted Number of Games Played (0 to 70) vs. Value of Tweaked Feature (0 to 1); series: Peer Quit Ratio, Play Diversity, Actions Per Play, Sacks Allowed, Online Franchise Games]
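The feature-tweaking analysis can be sketched as a one-at-a-time sweep: hold every feature at a baseline value, vary one feature across its range, and record the model's prediction at each step. Everything below is illustrative. `toy_model` and its coefficients are invented stand-ins for a trained model, and the baseline values are arbitrary.

```python
def sweep_feature(model, baseline, feature_index, steps=5):
    """Vary one feature from 0 to 1 while holding the others at baseline."""
    results = []
    for i in range(steps + 1):
        value = i / steps
        tweaked = list(baseline)  # copy so the baseline stays untouched
        tweaked[feature_index] = value
        results.append((value, model(tweaked)))
    return results


def toy_model(features):
    """Hypothetical stand-in for a trained regression model."""
    peer_quit_ratio, play_diversity = features
    return 40.0 - 15.0 * peer_quit_ratio - 10.0 * play_diversity


baseline = [0.2, 0.5]
curve = sweep_feature(toy_model, baseline, feature_index=0)
for value, predicted_games in curve:
    print(f"{value:.1f} -> {predicted_games:.1f}")
```

Plotting one such curve per feature gives a chart of the kind described on this slide. A sweep like this shows how the model responds to a feature, which is not the same as a causal effect on real players.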
Most Influential Features
• The following features were identified as the most influential in predicting player retention:

Feature                  Impact
Play Diversity           Negative
Online Franchise Wins    Positive
Running Plays            Positive
Sacks Made               Positive
Actions per Play         Positive
Interceptions Caught     Positive
Sacks Allowed            Negative
Peer Quit Ratio          Negative

[Table annotation: Correlation Strength]
Predicted Number of Games for Different Win Rates
[Figure: Predicted Number of Games (0 to 25) vs. Win Rate (0% to 100%); series: PlayNow, Ranked, Unranked, OTP, Superstar, Franchise, Online Franchise, Ultimate Team]
What We Learned
• Simplify playbooks
  - Players presented with a large variety of plays have lower retention and less success
• Clearly present the controls
  - Knowledge of controls had a larger impact than winning on player retention
• Provide the correct challenge
  - Multiplayer matches should be as even as possible, while single player should greatly favor the player
Project Impact
• Play selection redesign
Takeaways
• Machine Learning enables deep analysis of Big Data
• Machine Learning is versatile
• There are open tools
Questions?
• Ben Weber
  - UC Santa Cruz
  - bweber@soe.ucsc.edu
• Michael John
  - Electronic Arts
  - mjohn@ea.com
