Please go to our meet-up site if you'd like to download the file http://bit.ly/aXDqzg - Special thanks to Andrew and Nathan Artz from OMGPOP.com for sharing their insights with the group!

- 1. How<br />uses<br />to<br />
- 2. How to win at business<br />Earn X+ε per user (Lifetime Value)<br />Pay X to acquire a user (Cost per Acquisition)<br />Black Box<br />$\sum_{i=0}^{n} \varepsilon_i$ = WIN (where n is large and ε > 0)<br />
- 3. “What am I investigating?”<br />“Where do I start?”<br />“What data do I use?”<br />“How do I model my data?”<br />“What is the data telling me?”<br />“What do I do with my new insights?”<br />“How do I know my insights are working?” <br />
- 4. Customer Lifetime Value<br />How much is a user worth to me over his/her lifetime?<br />CLV(C,S,R) = C * S * R<br />C: conversion to pay<br />S: average transaction size<br />R: average number of purchases over lifetime <br />
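
A minimal Python sketch of the CLV formula on this slide. The input numbers are illustrative assumptions, not OMGPOP figures, and the function name is hypothetical.

```python
def customer_lifetime_value(conversion: float, avg_transaction: float, purchases: float) -> float:
    """CLV(C, S, R) = C * S * R, as defined on the slide."""
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must be a fraction between 0 and 1")
    return conversion * avg_transaction * purchases


# Hypothetical inputs: 2% of users pay, $10 per transaction,
# 3 purchases per paying user over their lifetime.
clv = customer_lifetime_value(0.02, 10.0, 3.0)
print(f"Expected value per signed-up user: ${clv:.2f}")
```

Under these assumed numbers, acquiring a user only pays off if the cost per acquisition stays below the printed value.
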
- 5. How do we increase revenue?<br />Conversion = # paying users / # total users<br />Social Gaming sites usually get <br />1% (low) - 5% (godly)!<br />
- 6. site<br />
- 7. Welcome to OMGPOP<br />
- 8. OMGPOP is a community-centric multiplayer gaming site<br />Real-Time Multiplayer Games
- 9. Community Oriented
- 10. Virtual Economy<br />We sell virtual items<br />…And accept many forms of payment<br />
- 11. We are not a research lab<br />We are a venture-backed startup
- 12. Investors demand bottom-line results FAST. No credit for academic publications and citations.
- 13. Resources are SCARCE
- 14. We have to justify every minute spent on predictive analytics
- 15. “How long do we spend developing, testing, and measuring X feature?”
- 16. We have weeks – not months – to show results. It needs to be immediately actionable; we make lots of assumptions<br />“What am I investigating?”<br />“Where do I start?”<br />“What data do I use?”<br />“How do I model my data?”<br />“What is the data telling me?”<br />“What do I do with my new insights?”<br />“How do I know my insights are working?” <br />
- 17. How do we increase conversion?<br />Our site contains MANY features <br />Chat<br />Games<br />Walls<br />Notifications<br />Surveys<br />Pictures<br />Where do we focus our efforts?<br />Which has the greatest ROI?<br />
- 18. What causes a user to buy?<br />Our Guiding Mantra: <br />A user’s experience on the site is directly correlated with his/her probability to pay<br />P(Buy)<br />P(Buy’)<br />On Site Experience Changes<br />Before<br />After<br />
- 19. What site experience is causing users to pay?<br />Let’s translate into analytical questions:<br />“What are indicators of paying users?”<br />“What features are unique to paying users?”<br />“What unique experience do payers have that drives them to pay?”<br />“What features separate paying users from nonpaying users?”<br />
- 20. “What am I investigating?”<br />“Where do I start?”<br />“What data do I use?”<br />“How do I model my data?”<br />“What is the data telling me?”<br />“What do I do with my new insights?”<br />“How do I know my insights are working?” <br />
- 21. We aggregated over 100 features<br />gender<br />age<br />site_level<br />gameplays<br />logins_count<br />play_intensity<br />login_intensity<br />cents_first_purchase<br />number_virtual_goods_purchased<br />amount_on_first_purchase<br />ingame_items_purchased<br />total_coins_spent<br />total_coins_earned<br />coin_balance<br />number_of_friends<br />total_friends_invited<br />facebook_connected<br />candystand_user<br />aim_user<br />gifts_sent<br />gifts_received<br />ip_address<br />has_mobile_number<br />has_uploaded_photo<br />signup_date<br />pay_date<br />time_to_first_purchase_roundup<br />time_to_first_purchase_round<br />profile_items_purchased<br />balloono_items_purchased<br />
- 22. We were suspicious of our gender data<br />According to self-reported data, 80% of users were male.
- 23. So we hired a 3rd party data service to validate
- 24. And we asked every user 4 questions about their gender<br />Our female users lie to us<br />65% of women said “No, I’m not a girl”
- 25. 73% of women said “No, I’m not a woman”<br />Our users don’t always tell the truth<br />
- 26. We can use the gender questions to build a simple predictive model<br />Input Raw Data Set<br />Choose label<br />Choose classifier<br />Train on X% of the data<br />Test on Y% of the data<br />Remove irrelevant features<br />
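
The steps on this slide can be sketched in plain Python. This is not the RapidMiner workflow used in the talk. The data is synthetic, the single numeric feature is hypothetical, and the "classifier" is a one-feature threshold (a decision stump) chosen only to keep the example short. Feature removal is not covered here.

```python
import random


def train_test_split(rows, train_fraction, seed=0):
    """Shuffle a copy of the rows and split them into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_stump(train_rows):
    """Pick the threshold that classifies the most training rows correctly.

    Assumes a higher feature value points to the positive label.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in train_rows}):
        correct = sum((x >= threshold) == label for x, label in train_rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the threshold prediction."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


# Synthetic data set: (feature value, label) pairs, invented for illustration.
rng = random.Random(42)
rows = []
for _ in range(200):
    label = rng.random() < 0.5
    rows.append((rng.gauss(60 if label else 30, 10), label))

train, test = train_test_split(rows, 0.7)   # train on 70%, test on 30%
threshold = train_stump(train)
test_accuracy = accuracy(threshold, test)
print(f"threshold={threshold:.1f}, test accuracy={test_accuracy:.2f}")
```

Measuring accuracy on the held-out rows, not on the training rows, is what makes the reported number an estimate of how the model behaves on new users.
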
- 27. Name of the game:<br />Train a model with the highest accuracy (confidence)<br />ML<br />Features<br />Result<br />Data<br />Accuracy determined by data – want to remove data that doesn’t contribute relevant information (i.e. remove noise)<br />
- 28. Choosing Features<br />1. Intuition to choose many possible important features<br />2. Remove features that you can’t trust<br />3. Approximate Importance of features<br />4. Train model<br />5. Re-Train model on subsets of feature list and choose features that yield highest accuracy<br />
- 29. Problems Comparing Features <br />Between Our Users<br />Select Features Common to payers and Nonpayers<br />Keep Distributions Intact<br />Our Experience:<br />Had to compare users’ stats right before their first purchase (behavior on site changes after first purchase)<br />-100 Plays<br />-20 days on site<br />-Paid $20<br />-10 Plays<br />-1 day on site<br />-Didn’t pay<br />
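
One way to take a "right before first purchase" snapshot, sketched in Python. The field names `plays` and `pay_date` and the sample users are hypothetical. Counting all activity for nonpayers is an assumption; the slides do not say how nonpayers were cut off.

```python
from datetime import date


def plays_before_first_purchase(play_dates, pay_date):
    """Count plays strictly before the first purchase.

    Nonpayers (pay_date is None) keep all of their plays; this is an
    assumption made for the sketch.
    """
    if pay_date is None:
        return len(play_dates)
    return sum(1 for d in play_dates if d < pay_date)


# Hypothetical users, invented for illustration.
users = {
    "payer_1": {
        "plays": [date(2010, 3, 1), date(2010, 3, 2), date(2010, 3, 5), date(2010, 3, 9)],
        "pay_date": date(2010, 3, 5),
    },
    "nonpayer_1": {
        "plays": [date(2010, 3, 3), date(2010, 3, 4)],
        "pay_date": None,
    },
}

snapshot = {
    name: plays_before_first_purchase(u["plays"], u["pay_date"])
    for name, u in users.items()
}
print(snapshot)
```

Plays on or after the purchase date are excluded so that post-purchase behavior does not leak into the features used to predict that purchase.
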
- 30.
- 31. How to Format the Data for Your Model<br />
- 32. Modeling<br />Features<br />Result<br />Data<br />
- 33. “What am I investigating?”<br />“Where do I start?”<br />“What data do I use?”<br />“How do I model my data?”<br />“What is the data telling me?”<br />“What do I do with my new insights?”<br />“How do I know my insights are working?” <br />
- 34. What does a Classification Model do?<br />Model ‘learns’ to classify apples and oranges<br />Classification Model<br />Labeled Apples and Oranges<br />pruning, optimize parameters, weights, etc<br />Unlabeled Fruit<br />Classification Model<br />% Chance of Being an Apple (or orange)<br />
- 35. Applying a Predictive Model<br />Purpose<br />We are a startup, we need quick results, interpretation, and action<br />Decision Tree<br />Pros:<br />Easily understood / interpreted<br />Calculates quickly<br />Cons:<br />Local max only (greedy)<br />Lower accuracy<br />
- 36. Wine Data<br />
- 37. What is a Decision Tree?<br />
- 38. Gameplays<br />> 100<br />< 100<br />Payers<br />Nonpayers<br />
- 39. Purity<br />Measure homogeneity of the labels<br />Degree of homogeneity is measured through:<br />Entropy<br />Gini Index<br />others<br />$p_j$ = probability of occurrence of class j in the sample<br />
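
A short Python sketch of the two purity measures named on this slide, using the standard definitions: entropy as the sum of -p_j * log(p_j) and the Gini index as 1 minus the sum of p_j squared. Using log base 2 is an assumption (the slides do not state a base), and the sample labels are invented.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency p_j of each class j in a non-empty sample."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: sum of -p_j * log2(p_j)."""
    return sum(-p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Gini index: 1 - sum of p_j squared."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


# Invented samples: one perfectly mixed, one perfectly pure.
mixed = ["payer"] * 5 + ["nonpayer"] * 5
pure = ["payer"] * 10

for name, sample in (("mixed", mixed), ("pure", pure)):
    print(name, round(entropy(sample), 3), round(gini(sample), 3))
```

Both measures are zero for a pure sample and largest for an evenly mixed one, which is why either can serve as the impurity score when choosing splits.
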
- 40. Decision Tree Algorithm (simplified)<br />1. Calculate impurity for the original sample (probability for each class)<br />P(Payer) = P, P(Nonpayer) = N, using relative frequencies<br />Entropy: $E = -P \log P - N \log N$<br />Note: entropy of a single label is zero (if there is only one class C, P(C) = 1, and log(1) = 0)<br />2. Calculate information gain for each possible attribute split<br />Split the table; the difference in impurity between the original table and the weighted sum of the split tables is called the Information Gain<br />$IG = E(\text{Original Table}) - \sum_i \frac{n_i}{n} E(\text{Feature\_table}_i)$, where $n_i$ is the number of rows in split table $i$ and $n$ is the number of rows in the original table<br />3. Choose the attribute split that results in the highest information gain<br />4. Remove the splitting attribute and recursively keep splitting on the attribute with the highest information gain. Stop when no attributes remain, the information gain is too small, or the maximum tree depth is reached<br />
- 41. Calculate Impurity for Sample<br />P(Payer) = P, P(Nonpayer) = N, using relative frequencies<br />Entropy: $E = -P \log P - N \log N$<br />Note: entropy of a single label is zero (if there is only one class C, P(C) = 1, and log(1) = 0)<br />
- 42. CALCULATE IMPURITY FOR EACH INDIVIDUAL FEATURE<br />No<br />Yes<br />
- 43. Decision Tree Algorithm (simplified)<br />1. Calculate entropy for the original table (probability for each class)<br />P(Payer) = P, P(Nonpayer) = N, using relative frequencies<br />Entropy: $E = -P \log P - N \log N$<br />Note: entropy of a single label is zero (if there is only one class C, P(C) = 1, and log(1) = 0)<br />2. Take the difference in entropy between the original table and the weighted sum of the split tables; this difference in impurity is called the Information Gain<br />$IG = E(\text{Original Table}) - \sum_i \frac{n_i}{n} E(\text{Feature\_table}_i)$, where $n_i$ is the number of rows in split table $i$ and $n$ is the number of rows in the original table<br />3. Choose the attribute split that results in the highest information gain<br />4. Keep splitting on the attributes with the highest information gain; repeat until a certain tree depth has been reached or the information gain falls below a lower threshold<br />
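
Steps 1 to 3 of the simplified algorithm can be sketched in Python as follows. The six-row table is invented, `bought_virtual_good` is a hypothetical attribute name, and log base 2 is an assumption. The sketch scores categorical attributes only and does not build the full recursive tree.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy in bits of a non-empty list of labels, from relative frequencies."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, label_key="payer"):
    """IG = E(original table) - sum_i (n_i / n) * E(split table i)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[label_key])
    original = entropy([row[label_key] for row in rows])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return original - weighted


# Invented table: three payers and three nonpayers.
rows = [
    {"bought_virtual_good": True, "facebook_connected": True, "payer": True},
    {"bought_virtual_good": True, "facebook_connected": False, "payer": True},
    {"bought_virtual_good": True, "facebook_connected": True, "payer": True},
    {"bought_virtual_good": False, "facebook_connected": False, "payer": False},
    {"bought_virtual_good": False, "facebook_connected": True, "payer": False},
    {"bought_virtual_good": False, "facebook_connected": False, "payer": False},
]

attributes = ("bought_virtual_good", "facebook_connected")
gains = {a: information_gain(rows, a) for a in attributes}
best = max(gains, key=gains.get)   # step 3: attribute with the highest gain
print(gains, "-> split on", best)
```

Sorting `gains` instead of taking only the maximum gives a rough ranking of the attributes by how well each one separates the labels.
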
- 44. Setting Up a Decision Tree Using RapidMiner<br />(http://archive.ics.uci.edu/ml/datasets/Wine)<br />Merlot<br />Shiraz<br />Merlot<br />Cabernet<br />Merlot<br />
- 45. some features<br /># friends<br /># plays<br />Win percentages<br />coins earned<br />Photos Uploaded<br />Coins spent<br />Purchases of different virtual items<br /># Plays for each game<br />Fill rate for each game<br />Game Lengths<br />Facebook / Myspace / aim<br />Gifts sent / received<br />Location<br />etc<br />
- 46. Can use impurity for a quick ‘approximate’ ranking of feature importance for segmenting<br />The higher the information gain of a split, the more relevant the feature is for segmenting<br />
- 47. Ideas for which features separated nonpaying users from paying users?<br />
- 48. Results showed four different ‘groups’ of users<br />1. People who hadn’t interacted with goods or virtual currency<br />2. People who got just a free virtual good<br />3. People who bought 1-3 virtual goods and spent at least 1 unit of virtual currency<br />4. People who bought 7.5+ virtual items<br />Group 1 had almost no people who spent real $$; group 4 had the most. Intuitive! The goal is then to take a smaller step and get people to interact with the virtual goods and currency from day one.<br />
- 49. total_coins_spent<br />coin_balance<br />gameplays<br />Balloonogameplays<br />number_virtual_goods_purchased<br />SVM Weights<br />type: C-SVC (LIBSVM)<br />kernel: linear<br />
- 50. Modeling<br />Features<br />Result<br />Data<br />
- 51.
- 52. “So now what do I do? How do I take action?”<br />
- 53. Payers and Nonpayers are having different experiences!<br />Most people purchase at the START of their experience (seen in distribution of payers)<br />People who spend $$ are those who spend their virtual currency buying virtual goods<br />Extracting Insights From The Model<br />
- 54. We want the nonpaying user to have the same experience as the paying user<br />
- 55. Press the Button<br />
- 56. On a website…<br />So you can’t click for the user, but...<br />You control the flows, i.e., you ‘direct’ people where to click, and have HUGE INFLUENCE over what users do on your site<br />Only one link? Nowhere else to go.<br />
- 57. We want people to buy more virtual items and spend more coins at the start of their experience…and we can direct where people click…<br />So?<br />
- 58. On the first login, FORCE (rather than merely direct) all new users to spend coins buying virtual items!<br />
- 59. Theory<br />Someone can easily go through the site without EVER having spent a single coin.<br />Lubricate the purchasing process<br />Habituate users to spend / buy early and often<br />Getting people to spend more coins will increase conversion<br />Forcing NOT the same as User Elected Action, But it Likens their Experiences!<br />
- 60. A Quick Look Back<br />Extract ‘Insights’ from the model<br />See most relevant separating features<br />Bridge the gap between separating features<br />
- 61. Data at the Helm<br />Hard to make ‘Guiding Decisions’<br />Data inspires confidence in decisions moving forward<br />Worst case: learn how the data changed in response to the change on the website. New insight!<br />
- 62. “So I implemented this change, how can I tell if it’s working?”<br />
- 63. A/B Testing <br />Show some users layout ‘A’<br />Show other users layout ‘B’<br />Measure how many people who saw layout ‘A’ performed some action versus people who saw layout ‘B’<br />Choose the layout that had the highest conversion to action<br />Google Website Optimizer for HTML A/B Testing<br />
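
A minimal Python sketch of comparing two layouts by conversion. The visitor and action counts are hypothetical. The two-proportion z statistic is an addition that is not on the slide; it is included only as one common way to check that the gap between A and B is larger than noise.

```python
import math


def conversion_rate(actions, visitors):
    """Fraction of visitors who performed the action."""
    return actions / visitors


def two_proportion_z(actions_a, visitors_a, actions_b, visitors_b):
    """z statistic for (rate B - rate A), using the pooled standard error."""
    p_a = conversion_rate(actions_a, visitors_a)
    p_b = conversion_rate(actions_b, visitors_b)
    pooled = (actions_a + actions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (p_b - p_a) / se


# Hypothetical results: 5000 users saw each layout.
rate_a = conversion_rate(100, 5000)
rate_b = conversion_rate(150, 5000)
z = two_proportion_z(100, 5000, 150, 5000)

winner = "B" if rate_b > rate_a else "A"
print(f"A={rate_a:.3f} B={rate_b:.3f} z={z:.2f} -> layout {winner}")
```

As a rule of thumb, an absolute z value above roughly 1.96 corresponds to a difference that is significant at the 5% level in a two-sided test.
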
- 64. Implementing an A/B Test<br />GROUP A (control group - no changes)<br />GROUP B (Force Buy)<br />Signup<br />Play<br />Leaves<br />Put User Into Shop, Give Coins, Popup to Buy a Virtual item<br />Signup<br />Play<br />Leaves<br />
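
The slides do not say how users were assigned to groups. One common approach, sketched here in Python as an assumption, is to hash the user ID so that each user always lands in the same group. The experiment name, the step names, and the user IDs are hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str = "force_buy") -> str:
    """Deterministically bucket a user into 'A' (control) or 'B' (force buy)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def first_login_flow(user_id: str) -> list:
    """Steps shown on first login; group B gets the shop detour before playing."""
    if assign_group(user_id) == "B":
        return ["signup", "shop", "give_coins", "buy_popup", "play"]
    return ["signup", "play"]


# Hypothetical user IDs, used only to show the split is roughly even.
groups = [assign_group(f"user-{i}") for i in range(1000)]
print("A:", groups.count("A"), "B:", groups.count("B"))
print(first_login_flow("user-1"))
```

Hashing on the experiment name plus the user ID keeps assignments stable across sessions and independent between experiments.
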
- 65. Now Click ‘Buy’ to buy this cool armor for your character!<br />
- 66. Which Measurement Tells Me that My Change is Successful?<br />Choose the test group that spent more time in the Virtual item shop?<br />Choose the test group that bought more virtual items and spent more coins?<br />Choose the test group with the highest LTV!<br />
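
A small Python sketch of why the slide picks LTV over intermediate measurements. All revenue figures are invented. In this made-up example group B has more payers, yet group A earns more per user.

```python
def ltv_per_user(revenue_by_payer, group_size):
    """Average revenue per user in the group; nonpayers count as zero."""
    return sum(revenue_by_payer.values()) / group_size


# Hypothetical results for two test groups of 100 users each.
group_a = {"u1": 5.0, "u2": 20.0}               # 2 payers
group_b = {"u7": 5.0, "u8": 5.0, "u9": 10.0}    # 3 payers

ltv = {
    "A": ltv_per_user(group_a, 100),
    "B": ltv_per_user(group_b, 100),
}
winner = max(ltv, key=ltv.get)
print(ltv, "-> keep group", winner)
```

Dividing by the full group size, not by the number of payers, keeps the comparison fair when the groups convert at different rates.
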
- 67.
- 68. Thanks!<br />
