How OMGPOP Uses Predictive Analytics to Drive Change
Please go to our meet-up site if you'd like to download the file http://bit.ly/aXDqzg - Special thanks to Andrew and Nathan Artz from OMGPOP.com for sharing their insights with the group!

Transcript

  • 1. How OMGPOP Uses Predictive Analytics to Drive Change
  • 2. How to win at business
    Earn X+ε per user (Lifetime Value)
    Pay X to acquire a user (Cost per Acquisition)
    Black Box
    $\sum_{i=0}^{n} \varepsilon_i$ = WIN (where n is large and ε > 0)
  • 3. “What am I investigating?”
    “Where do I start?”
    “What data do I use?”
    “How do I model my data?”
    “What is the data telling me?”
    “What do I do with my new insights?”
    “How do I know my insights are working?”
  • 4. Customer Lifetime Value
    How much is a user worth to me over his/her lifetime?
    CLV(C, S, R) = C * S * R
    C: conversion to pay
    S: average transaction size
    R: average number of purchases over lifetime
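The CLV formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the sample numbers, and the cost-per-acquisition figure are assumptions and do not come from the talk.

```python
def customer_lifetime_value(conversion, avg_transaction, purchases_per_lifetime):
    """CLV(C, S, R) = C * S * R, as written on the slide."""
    return conversion * avg_transaction * purchases_per_lifetime


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
clv = customer_lifetime_value(
    conversion=0.02,           # C: 2% of users ever pay
    avg_transaction=10.0,      # S: average purchase size in dollars
    purchases_per_lifetime=3,  # R: average purchases per paying user
)
cost_per_acquisition = 0.50    # hypothetical X from the "win at business" slide
margin = clv - cost_per_acquisition  # the per-user epsilon; must be > 0 to "win"
print(round(clv, 2), round(margin, 2))
```

With these made-up numbers each acquired user is worth about 60 cents, so paying 50 cents per user leaves a small positive margin.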
  • 5. How do we increase revenue?
    Conversion = # paying users / # total users
    Social gaming sites usually get 1% (low) to 5% (godly)!
  • 6. site
  • 7. Welcome to OMGPOP
  • 8. OMGPOP is a community-centric multiplayer gaming site
    Real-Time Multiplayer Games
  • 9. Community Oriented
  • 10. Virtual Economy
    We sell virtual items
    …And accept many forms of payment
  • 11. We are not a research lab
    We are a venture-backed startup
  • 12. Investors demand bottom-line results FAST. No credit for academic publications and citations.
  • 13. Resources are SCARCE
  • 14. We have to justify every minute spent on predictive analytics
  • 15. “How long do we spend developing, testing, and measuring X feature?”
  • 16. We have weeks – not months – to show results. It needs to be immediately actionable; we make lots of assumptions
    “What am I investigating?”
    “Where do I start?”
    “What data do I use?”
    “How do I model my data?”
    “What is the data telling me?”
    “What do I do with my new insights?”
    “How do I know my insights are working?”
  • 17. How do we increase conversion?
    Our site contains MANY features: Chat, Games, Walls, Notifications, Surveys, Pictures
    Where do we focus our efforts?
    Which has the greatest ROI?
  • 18. What causes a user to buy?
    Our Guiding Mantra:
    A user’s experience on the site is directly correlated with his/her probability to pay
    Diagram: P(Buy) before on-site experience changes, P(Buy’) after
  • 19. What site experience is causing users to pay?
    Let’s translate into analytical questions:
    “What are indicators of paying users?”
    “What features are unique to paying users?”
    “What unique experience do payers have that drives them to pay?”
    “What features separate paying users from nonpaying users?”
  • 20. “What am I investigating?”
    “Where do I start?”
    “What data do I use?”
    “How do I model my data?”
    “What is the data telling me?”
    “What do I do with my new insights?”
    “How do I know my insights are working?”
  • 21. We aggregated over 100 features
    gender, age, site_level, gameplays, logins_count,
    play_intensity, login_intensity, cents_first_purchase,
    number_virtual_goods_purchased, amount_on_first_purchase,
    ingame_items_purchased, total_coins_spent, total_coins_earned,
    coin_balance, number_of_friends, total_friends_invited,
    facebook_connected, candystand_user, aim_user, gifts_sent,
    gifts_received, ip_address, has_mobile_number, has_uploaded_photo,
    signup_date, pay_date, time_to_first_purchase_roundup,
    time_to_first_purchase_round, profile_items_purchased,
    balloono_items_purchased
  • 22. We were suspicious of our gender data
    According to self-reported data, 80% of users were male.
  • 23. So we hired a 3rd party data service to validate
  • 24. And we asked every user 4 questions about their gender
    Our female users lie to us
    65% of women said “No, I’m not a girl”
  • 25. 73% of women said “No, I’m not a woman”
    Our users don’t always tell the truth
  • 26. We can use the gender questions to build a simple predictive model
    Input raw data set
    Choose label
    Choose classifier
    Train on X% of the data
    Test on Y% of the data
    Remove irrelevant features
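The train-on-X%, test-on-Y% workflow above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the toy rows, the `payer` label, the 70/30 split, and the deliberately trivial majority-class "classifier", which stands in for whatever model the speakers actually trained.

```python
import random
from collections import Counter


def train_test_split(rows, train_fraction, seed=0):
    """Shuffle the rows reproducibly and cut them into train and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train_majority_classifier(train_rows, label_key):
    """Baseline model: always predict the most common training label."""
    most_common = Counter(r[label_key] for r in train_rows).most_common(1)[0][0]
    return lambda row: most_common


def accuracy(model, test_rows, label_key):
    """Fraction of held-out rows whose label the model predicts correctly."""
    correct = sum(1 for r in test_rows if model(r) == r[label_key])
    return correct / len(test_rows)


# Made-up data set: 40 users, labeled as payers when gameplays exceed 100.
rows = [{"gameplays": g, "payer": g > 100} for g in range(0, 200, 5)]
train, test = train_test_split(rows, train_fraction=0.7)
model = train_majority_classifier(train, label_key="payer")
baseline_accuracy = accuracy(model, test, label_key="payer")
print(len(train), len(test), baseline_accuracy)
```

A baseline like this is useful mainly as a floor: a real classifier trained on the same split should beat it before its features are worth interpreting.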
  • 27. Name of the game:
    Train a model with the highest accuracy (confidence)
    Diagram labels: ML, Features, Result, Data
    Accuracy is determined by the data – we want to remove data that doesn’t contribute relevant information (i.e. remove noise)
  • 28. Choosing Features
    1. Use intuition to choose many possibly important features
    2. Remove features that you can’t trust
    3. Approximate the importance of features
    4. Train model
    5. Re-train model on subsets of the feature list and choose the features that yield the highest accuracy
  • 29. Problems Comparing Features Between Our Users
    Select features common to payers and nonpayers
    Keep distributions intact
    Our experience: we had to compare users’ stats right before their first purchase (behavior on site changes after first purchase)
    Example payer: 100 plays, 20 days on site, paid $20
    Example nonpayer: 10 plays, 1 day on site, didn’t pay
  • 30.
  • 31. How to Format the Data for Your Model
  • 32. Modeling
    Diagram labels: Features, Result, Data
  • 33. “What am I investigating?”
    “Where do I start?”
    “What data do I use?”
    “How do I model my data?”
    “What is the data telling me?”
    “What do I do with my new insights?”
    “How do I know my insights are working?”
  • 34. What does a Classification Model do?
    Model ‘learns’ to classify apples and oranges
    Training: Labeled Apples and Oranges → Classification Model (pruning, optimize parameters, weights, etc.)
    Prediction: Unlabeled Fruit → Classification Model → % Chance of Being an Apple (or orange)
  • 35. Applying a Predictive Model
    Purpose: we are a startup, we need quick results, interpretation, and action
    Decision Tree
    Pros:
      Easily understood / interpreted
      Calculates quickly
    Cons:
      Local max only (greedy)
      Less accuracy
  • 36. Wine Data
  • 37. What is a Decision Tree?
  • 38. Gameplays
    > 100: Payers
    < 100: Nonpayers
  • 39. Purity
    Measure homogeneity of the labels
    Degree of homogeneity is measured through:
      Entropy
      Gini Index
      others
    p_j = probability of occurrence of class j in the sample
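The two purity measures named on this slide can be written out as a short sketch. The formulas are the textbook definitions (entropy as the negative sum of p_j log p_j, Gini as one minus the sum of squared p_j); the base-2 logarithm and the function names are assumptions, since the slide does not specify them.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency p_j of each class j in the sample."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Entropy in bits: -sum(p_j * log2(p_j)). Zero for a pure sample."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Gini index: 1 - sum(p_j ** 2). Zero for a pure sample."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


mixed = ["payer", "nonpayer", "payer", "nonpayer"]
pure = ["payer", "payer", "payer"]
print(entropy(mixed), gini(mixed))  # most impure two-class sample
print(entropy(pure), gini(pure))    # perfectly homogeneous sample
```

Both measures are zero for a homogeneous sample and largest when the classes are evenly mixed, which is why either can drive the choice of split.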
  • 40. Decision Tree Algorithm (simplified)
    1. Calculate impurity for the original sample (probability for each class)
       P(Payer) = P, P(Nonpayer) = N, using relative frequencies
       Entropy: $E = -P \log P - N \log N$
       ** Entropy of a single label is zero (with one class C, P(C) = 1, and log(1) = 0)
    2. Calculate the information gain for each possible attribute split
       Split the table; the difference in the impurity measure is called the Information Gain
       $IG = E(\text{Original Table}) - \sum_i \frac{n_i}{n} E(\text{Feature\_table}_i)$, where $n_i$ is the number of rows in split table $i$ and $n$ is the number of rows in the original table
    3. Choose the attribute split that results in the highest information gain
    4. Remove the splitting attribute and recursively keep splitting on the attribute with the highest information gain. Done when there are no more attributes, the information gain is too small, or the maximum tree depth is reached
  • 41. Calculate Impurity for Sample
    P(Payer) = P, P(Nonpayer) = N, using relative frequencies
    Entropy: $E = -P \log P - N \log N$
    ** Entropy of a single label is zero (with one class C, P(C) = 1, and log(1) = 0)
  • 42. CALCULATE IMPURITY FOR EACH INDIVIDUAL FEATURE
    Diagram labels: No, Yes
  • 43. Decision Tree Algorithm (simplified)
    1. Calculate entropy for the original table (probability for each class)
       P(Payer) = P, P(Nonpayer) = N, using relative frequencies
       Entropy: $E = -P \log P - N \log N$
       ** Entropy of a single label is zero (with one class C, P(C) = 1, and log(1) = 0)
    2. Take the difference in entropy between the original table and the weighted sum of the split tables; this difference in the impurity measure is called the Information Gain
       $IG = E(\text{Original Table}) - \sum_i \frac{n_i}{n} E(\text{Feature\_table}_i)$, where $n_i$ is the number of rows in split table $i$ and $n$ is the number of rows in the original table
    3. Choose the attribute split that results in the highest information gain
    4. Keep splitting on the highest information gain attributes; repeat until a certain tree depth has been reached, or until the information gain falls below a certain threshold
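Steps 1 to 3 of the simplified algorithm can be sketched as follows. This is a minimal illustration, assuming categorical attributes, a base-2 logarithm, and a made-up four-row table; the attribute names are borrowed from the feature list earlier in the deck, but the values are invented and the recursion of step 4 is left out.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, label):
    """IG = E(original table) - sum_i (n_i / n) * E(split table i)."""
    parent_entropy = entropy([r[label] for r in rows])
    split_tables = defaultdict(list)
    for r in rows:
        split_tables[r[attribute]].append(r[label])
    weighted_child_entropy = sum(
        len(labels) / len(rows) * entropy(labels) for labels in split_tables.values()
    )
    return parent_entropy - weighted_child_entropy


def best_split(rows, attributes, label):
    """Step 3: pick the attribute whose split yields the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, label))


# Invented toy table, for illustration only.
rows = [
    {"bought_virtual_good": True, "facebook_connected": True, "payer": True},
    {"bought_virtual_good": True, "facebook_connected": False, "payer": True},
    {"bought_virtual_good": False, "facebook_connected": True, "payer": False},
    {"bought_virtual_good": False, "facebook_connected": False, "payer": False},
]
attributes = ["bought_virtual_good", "facebook_connected"]
gains = {a: information_gain(rows, a, "payer") for a in attributes}
chosen = best_split(rows, attributes, "payer")
print(gains, chosen)
```

In this toy table one attribute separates payers from nonpayers perfectly and the other carries no information, so the gains sit at the two extremes. Sorting real features by this value gives the quick approximate importance ranking the deck describes.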
  • 44. Setting Up a Decision Tree Using RapidMiner
    (http://archive.ics.uci.edu/ml/datasets/Wine)
    Diagram: example tree with leaf labels Merlot, Shiraz, Merlot, Cabernet, Merlot
  • 45. Some features
    # friends
    # plays
    Win percentages
    Coins earned
    Photos uploaded
    Coins spent
    Purchases of different virtual items
    # plays for each game
    Fill rate for each game
    Game lengths
    Facebook / Myspace / aim
    Gifts sent / received
    Location
    etc.
  • 46. Can use impurity for a quick ‘approximate’ ranking of feature importance for segmenting
    The higher the information gain of a split, the more relevant the feature is for segmenting
  • 47. Ideas for what features separated nonpaying users from paying users?
  • 48. Results showed four different ‘groups’ of users
    1. People who hadn’t interacted with goods or virtual currency
    2. People who got just a free virtual good
    3. People who bought 1-3 virtual goods and spent at least 1 virtual currency
    4. People who bought 7.5+ virtual items
    Group 1 had almost no people who spent real $$; group 4 had the most. Intuitive! The goal is then to take a smaller step and get people to interact with the virtual goods and currency from day one.
  • 49. SVM Weights
    type: C-SVC (LIBSVM)
    kernel: linear
    Features: total_coins_spent, coin_balance, gameplays, Balloonogameplays, number_virtual_goods_purchased
  • 50. Modeling
    Diagram labels: Features, Result, Data
  • 51.
  • 52. “So now what do I do? How do I take action?”
  • 53. Extracting Insights From The Model
    Payers and nonpayers are having different experiences!
    Most people purchase at the START of their experience (seen in distribution of payers)
    People who spend $$ are those who spend their virtual currency buying virtual goods
  • 54. We want the nonpaying user to have the same experience as the paying user
  • 55. Press the Button
  • 56. On a website…
    So you can’t click for the user, but...
    You control the flows, i.e., you ‘direct’ people where to click, and have HUGE INFLUENCE over what users do on your site
    Only one link? Nowhere else to go.
  • 57. We want people to buy more virtual items and spend more coins at the start of their experience…and we can direct where people click…
    So?
  • 58. On the first login, direct/FORCE all new users to spend coins buying virtual items!
  • 59. Theory
    Someone can easily go through the site without EVER having spent a single coin.
    Lubricate the purchasing process
    Habituate users to spend / buy early and often
    Getting people to spend more coins will increase conversion
    Forcing is NOT the same as a user-elected action, but it likens their experiences!
  • 60. A Quick Look Back
    Extract ‘insights’ from the model
    See most relevant separating features
    Bridge the gap between separating features
  • 61. Data at the Helm
    Hard to make ‘guiding decisions’
    Data inspires confidence in decisions moving forward
    Worst case: learn how the data changed in response to a change on the website. New insight!
  • 62. “So I implemented this change, how can I tell if it’s working?”
  • 63. A/B Testing
    Show some users layout ‘A’
    Show other users layout ‘B’
    Measure how many of the people who see layout ‘A’ take some action vs. how many of the people who see layout ‘B’
    Choose the layout that had the highest conversion to action
    Google Website Optimizer for HTML A/B testing
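Comparing the two layouts comes down to comparing two conversion rates. The sketch below adds a two-proportion z-test as one common way to check that the difference is not noise; the talk does not say which test, if any, was used, and all counts are hypothetical.

```python
import math


def conversion_rate(actions, visitors):
    """Share of visitors who took the measured action."""
    return actions / visitors


def two_proportion_z(actions_a, visitors_a, actions_b, visitors_b):
    """z statistic for the difference in conversion between layouts B and A."""
    p_a = conversion_rate(actions_a, visitors_a)
    p_b = conversion_rate(actions_b, visitors_b)
    pooled = (actions_a + actions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (p_b - p_a) / std_err


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 20 of 1000 users converted on A, 35 of 1000 on B.
z = two_proportion_z(20, 1000, 35, 1000)
p = two_sided_p_value(z)
print(round(z, 2), round(p, 3))
```

With small conversion rates like the 1% to 5% range mentioned earlier, each group needs many users before a difference of this size becomes distinguishable from chance.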
  • 64. Implementing an A/B Test
    GROUP A (control group, no changes): Signup → Play → Leaves
    GROUP B (Force Buy): Signup → Put user into shop, give coins, popup to buy a virtual item → Play → Leaves
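One way to split new users between the two flows is to hash the user ID, so that a user always lands in the same group across sessions. This is a sketch of that idea only: the function names, the experiment label, and the step names are hypothetical, and the talk does not describe how OMGPOP actually assigned groups.

```python
import hashlib


def assign_group(user_id, experiment="force_buy"):
    """Deterministically map a user to group 'A' (control) or 'B' (force buy)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def first_login_flow(user_id):
    """Steps shown on first login; group B gets the extra shop step."""
    steps = ["signup"]
    if assign_group(user_id) == "B":
        # Hypothetical step name for: put user into shop, give coins, popup to buy.
        steps.append("shop_with_free_coins_and_buy_popup")
    steps.append("play")
    return steps


group_sizes = {"A": 0, "B": 0}
for uid in range(1000):
    group_sizes[assign_group(uid)] += 1
print(group_sizes)
```

Including the experiment name in the hashed string means a later experiment can reshuffle users independently of this one.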
  • 65. Now Click ‘Buy’ to buy this cool armor for your character!
  • 66. Which Measurement Tells Me that My Change is Successful?
    Choose the test group that spent more time in the virtual item shop?
    Choose the test group that bought more virtual items and spent more coins?
    Choose the test group with the highest LTV!
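Picking the winner by LTV rather than by shop visits or coin spend can be sketched as below. The sketch assumes LTV is approximated by average real-money revenue per user observed so far in each group, counting users who never paid as zero; the group data is invented.

```python
def average_revenue_per_user(revenue_by_user):
    """Mean real-money revenue across ALL users in a group, payers or not."""
    return sum(revenue_by_user.values()) / len(revenue_by_user)


def pick_winner(groups):
    """Return the name of the group with the highest average revenue per user."""
    return max(groups, key=lambda name: average_revenue_per_user(groups[name]))


# Invented per-user revenue in dollars, for illustration only.
groups = {
    "A": {"u1": 0.0, "u2": 0.0, "u3": 5.0},
    "B": {"u4": 0.0, "u5": 2.0, "u6": 6.0},
}
winner = pick_winner(groups)
print(winner, {name: average_revenue_per_user(users) for name, users in groups.items()})
```

Keeping nonpayers in the denominator matters: a change that raises spend among payers but lowers conversion can still reduce revenue per acquired user.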
  • 67.
  • 68. Thanks!
