Game analytics: ML to the rescue!
Evgenii Tsymbalov, Data Scientist
Who we are
WebGames (“WG”) is one of Russia’s largest developers and
publishers of free-to-play games
Platforms: FB, iOS/Android, social gaming platforms (VK, OK,
MW, Kongregate, Steam)
Daily audience of over 400K players
Data: ~80M records per day
Game analytics
Marketing analytics
In-game analytics
Churn prediction
Retargeting
Revenue prediction
User classification
A/B testing
Balance
Recommendations
User/content classification
Churn and retargeting
Churn
Find users who are about to stop playing
Give them bonuses
…
PROFIT!
Channels: app-to-user notifications, messages, mail
Retargeting
Find and support users for retargeting
Channels: traffic control
Revenue prediction
Customer LTV (Lifetime Value) - an estimate of the overall
profit from the entire future relationship with a
customer.
Applications:
indicator of project health;
planning of advertising campaigns;
planning of in-game events.
It is important to estimate LTV not just overall (per platform
or project), but for different cohorts or even for every
individual player.
LTV: methodology and assumptions
Estimating LTV-100 (the horizon may vary by project).
A user’s actions are determined by his or her behavior in the first
30 days after registration =>
30 different models, the k-th for users who have played for k days;
Only the last year of data is tracked
General multistage model:
Classification (going to pay or not)
Regression (revenue prediction)
Additional low-level classifiers, such as events, holidays, etc.
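The multistage idea above can be sketched as follows. This is a minimal illustration, not the production model: the cohort payer share stands in for the classifier, the cohort mean revenue of payers stands in for the regressor, and the names `fit_two_stage`, `predict_ltv` and the sample data are hypothetical.

```python
from collections import defaultdict


def fit_two_stage(players):
    """Fit one tiny two-stage model per value of days played.

    players: iterable of (days_played, revenue) pairs, where revenue is the
    player's observed revenue over the forecast horizon (assumed data layout).
    Returns {days_played: (p_pay, mean_revenue_of_payers)}.
    """
    groups = defaultdict(list)
    for days_played, revenue in players:
        groups[days_played].append(revenue)

    model = {}
    for k, revenues in groups.items():
        payers = [r for r in revenues if r > 0]
        # Stage 1 (stand-in for a classifier): share of players who pay.
        p_pay = len(payers) / len(revenues)
        # Stage 2 (stand-in for a regressor): mean revenue among payers.
        mean_rev = sum(payers) / len(payers) if payers else 0.0
        model[k] = (p_pay, mean_rev)
    return model


def predict_ltv(model, days_played):
    """Expected revenue = P(pay) * E[revenue | pays] for the k-th model."""
    p_pay, mean_rev = model[days_played]
    return p_pay * mean_rev


# Hypothetical history: (days played so far, revenue over the horizon).
history = [
    (1, 0.0), (1, 0.0), (1, 0.0), (1, 8.0),
    (5, 0.0), (5, 10.0), (5, 30.0), (5, 0.0),
]
model = fit_two_stage(history)
ltv_day1 = predict_ltv(model, 1)
ltv_day5 = predict_ltv(model, 5)
```

Splitting the forecast this way keeps the many non-paying players from dominating the revenue regression; in practice each stage would be a trained model on player features rather than a cohort average.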
LTV: accuracy metrics
TA (total accuracy): equals 1 for a perfect
predictor.
RAE-d (relative absolute error on day d): equals zero for a
perfect predictor.
Both are computed from the LTV-100 forecast for the i-th player and the
total revenue on day d for the i-th player.
TA is the main indicator for the marketing department, while
RAE is widely used to compare models’ performance on
different days.
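A small sketch of how two such metrics could be computed. The exact formulas are not given in the text, so the definitions here are assumptions chosen to match the stated properties: TA is taken as the ratio of total forecast to total revenue (1 for a perfect predictor), and RAE as the sum of absolute per-player errors divided by total revenue (0 for a perfect predictor). The function names and numbers are hypothetical.

```python
def total_accuracy(forecasts, revenues):
    """Assumed TA: total forecast revenue divided by total actual revenue."""
    return sum(forecasts) / sum(revenues)


def relative_absolute_error(forecasts, revenues):
    """Assumed RAE: summed absolute per-player error over total revenue."""
    abs_err = sum(abs(f - r) for f, r in zip(forecasts, revenues))
    return abs_err / sum(revenues)


# Hypothetical per-player forecasts and actual revenues.
forecasts = [0.0, 5.0, 12.0, 3.0]
revenues = [0.0, 4.0, 15.0, 1.0]

ta = total_accuracy(forecasts, revenues)
rae = relative_absolute_error(forecasts, revenues)
```

In this example the per-player errors cancel in the total, so TA comes out at 1 while RAE is clearly above zero. Under these assumed definitions, that is why an aggregate metric suits budget planning while a per-player error is more useful for comparing models.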
LTV models: kNN + cohorts
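A minimal sketch of k-nearest-neighbour LTV estimation inside one cohort. How kNN and cohorts are actually combined is not described here, so this assumes neighbours are searched among players of the same cohort with known LTV; the features (sessions, purchases) and the function name `knn_predict_ltv` are hypothetical.

```python
import math


def knn_predict_ltv(train, query, k=3):
    """Average the LTV of the k players closest to `query`.

    train: list of (feature_vector, ltv) pairs from one cohort.
    query: feature vector of the player to estimate.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(ltv for _, ltv in nearest) / len(nearest)


# Hypothetical cohort: (sessions, purchases) in the first days -> known LTV.
cohort = [
    ((1.0, 0.0), 0.0),
    ((2.0, 1.0), 5.0),
    ((3.0, 1.0), 7.0),
    ((10.0, 4.0), 40.0),
    ((12.0, 5.0), 60.0),
]
estimate = knn_predict_ltv(cohort, (2.5, 1.0), k=2)
```

With real data the features would need scaling so that no single feature dominates the Euclidean distance.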
User classification
A/B testing
Classic approach: fixed group size, results are evaluated only after
the groups are completely filled.
Bayesian approach: the prior distribution is updated over time
as test results arrive, using Bayes’ theorem.
Bayesian A/B Testing at VWO, Chris Stucchio, 2015
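A minimal sketch of the Bayesian approach for a conversion test, assuming a uniform Beta(1, 1) prior on each group's conversion rate and a Monte Carlo comparison of the two posteriors. The function name `prob_b_beats_a` and the counts are hypothetical; this is an illustration, not the method of the cited paper.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) from Beta posteriors by sampling.

    conv_*: number of converted users, n_*: number of users in the group.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1, 1) prior updated with observed successes and failures.
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical interim results: the posterior can be recomputed at any time
# as new users arrive, without waiting for a fixed group size.
p_b_better = prob_b_beats_a(conv_a=40, n_a=1000, conv_b=65, n_b=1000)
```

With these counts the estimated probability is close to 1, which would support switching to variant B; a decision threshold still has to be chosen by the team.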
Balance and recommendations
Classic approach: game designers with Google Spreadsheets.
Better approach: modeling.
Rule-based approach
Midgame support based on classification
Content recommendations.
Static case
Dynamic case
Balance: rule-based approach
Balance: midgame support
Content clustering
Instead of conclusion: what helps us
Questions?
evgenii.tsymbalov@corpwebgames.com
Question from the audience: concrete numbers
Question from the audience: which algorithms?

Evgenii Tsymbalov, Webgames - Machine learning methods for game analytics problems
