Meaningful Insights From Raw Metrics: Virtual Worlds and Other Business Applications

Presentation at O'Reilly Strata Conference: http://strataconf.com/strata2011/public/schedule/detail/17009

Virtual worlds are a goldmine of untapped behavioral data with insights that can be applied to many online social systems, as well as to the physical world.

But unlike the physical world (where it is obtrusive and cost-prohibitive to follow distributed users around with video cameras and sensors), virtual worlds come readily instrumented. Anything a user says or does – including how often and in what ways they interact with other users and objects in their virtual environment – can be tracked over an extended period of time.

In this presentation, PARC social scientists will share findings and methods (e.g., customized scripts) they developed to extract behavioral data from online games. While we will use the example of World of Warcraft, a massively popular online multiplayer game that appeals to a broad demographic (and has an average user age of 30), our data collection/analysis methods have been applied to other virtual environments as well.

More importantly, we will discuss how we converted and processed raw behavioral metrics into meaningful psychological variables that can be applied to a broad spectrum of business applications and segments. Other questions we will address include: What are some of the unique data collection challenges in virtual environments? What are the pitfalls and advantages of large-scale data sets, real-time data monitoring, and more? How can one extrapolate insights from low-incidence events to broader samples or domains?

Our other goals in this work (which is partially funded by the U.S. government) were to examine whether behaviors in virtual worlds can be used to predict a user’s demographics and personality, which cues are most predictive, and how well these variables predict real-world behaviors. The process we developed and our findings can be used to create practical, actionable tools for automated segmentation, targeted marketing, and other business intelligence.



  1. Meaningful Insights From Raw Metrics: Virtual Worlds and Other Business Applications
  2. Extracting Behavioral Insights from Virtual Worlds
      • Social science backgrounds
      • Online surveys of online gamers
          • > 50,000 respondents
          • Factor analytic model of player motivations
      • Psychological experiments in Immersive Virtual Reality
          • Do people in taller avatars negotiate more aggressively?
      • Behavioral data in Virtual Worlds
          • Collecting the data
          • Analytic obstacles
          • Some findings
      PARC
  3. World of Warcraft
  4. Life in WoW
      • Basic Game-Play
          • Kill Monsters > Get Loot > Buy Better Weapons > Kill Bigger Monsters
      • Collaborative
          • Tank / Healer / Damage (DPS)
          • Game Progression: Pairs > Teams > 10-man > 25-man
      • Demographics
          • Average Age: 30
          • Average Hours/Week: 20
          • Gender Ratio: 80-20
  5. Digital Panopticons
      • Engaged players make Character-Revealing Decisions
          • Time investment: 20 hours/week
          • Emotional investment: 27% of respondents said most satisfying thing in past 7 days happened in game. 33% in terms of most infuriating.
      • Embedded with Pervasive Sensors
          • WoW example: # of hugs your character has given out.
      • 24/7 Continuous Tracking
      • Unobtrusive Tracking
          • Unlike standard personality surveys
  6. “In those days a decree went out from Caesar Augustus that all the world should be registered…”
      • PlayOn 1.0: the Census Bots
      • Only 6 variables… plus the “large” variable of time
      • Hundreds of guild social networks tracked over several years
  7. Mapping Social Networks
      • Simple heuristic
          • Same zone, same time, same guild
      • Descriptive stats
          • Size, distribution of class / race / level, etc.
      • Modeling survival
          • Two samples at 6-month interval: who died?
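The "same zone, same time, same guild" heuristic from slide 7 can be sketched as a co-presence count. This is only an illustration under stated assumptions: the snapshot tuple layout, the sample rows, and the function name `co_presence_edges` are hypothetical and not taken from the PlayOn code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical census snapshots: (timestamp, zone, guild, character).
observations = [
    ("2006-01-01T20:00", "Stranglethorn", "Knights", "Ayla"),
    ("2006-01-01T20:00", "Stranglethorn", "Knights", "Brom"),
    ("2006-01-01T20:00", "Stranglethorn", "Rogues", "Cale"),
    ("2006-01-01T20:00", "Orgrimmar", "Knights", "Dara"),
    ("2006-01-01T21:00", "Stranglethorn", "Knights", "Ayla"),
    ("2006-01-01T21:00", "Stranglethorn", "Knights", "Brom"),
]


def co_presence_edges(rows):
    """Count how often two characters share a zone, a snapshot time, and a guild."""
    groups = {}
    for timestamp, zone, guild, character in rows:
        groups.setdefault((timestamp, zone, guild), set()).add(character)

    edges = Counter()
    for members in groups.values():
        # Every pair seen together in one group gains one unit of edge weight.
        for pair in combinations(sorted(members), 2):
            edges[pair] += 1
    return edges


edges = co_presence_edges(observations)
for (a, b), weight in edges.items():
    print(a, b, weight)
```

Edge weights accumulate over snapshots, so pairs who repeatedly play together end up with heavier ties than pairs who met once.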
  8. Small is Beautiful
      • Limits to collective action online
          • New “online Dunbar number” of 35?
      • Fundamental properties of groups
          • Churn, entropy
          • Organic, team-based structure more likely to survive (pre-raid)
          • Altruism
          • Command-and-control for raiding
  9. From Analysis to Action: The Social Dashboard
      • Tools to help scientists, game producers, and players better understand these communities
      • Assess the social health of a community at a glance
  10. From Analysis to Action: The Social Dashboard
      • Tools to help scientists, game producers, and players better understand these communities
      • Drill down to identify endangered groups
  11. From Analysis to Action: The Social Dashboard
      • Tools to help scientists, game producers, and players better understand these communities
      • Observe the evolution of key players over time
  12. On the Internet, Nobody Knows You’re a Dog… Or Do They?
      • PlayOn 2.0: exploring real world and virtual world linkages
      • Widespread belief that “identity play” is prevalent and links between virtual and real are tenuous at best
      • Important business implications: predicting socio-demographics from online behavior
  13. Our Approach
      • Start with survey component
          • Ask for in-game characters
          • Self-report of RW characteristics
      • Cross-Cultural Analysis
          • Collaborators in Taiwan and Hong Kong
      • Run Data Scrapers
          • Collect longitudinal data from in-game and WoW Armory
  14. Reliable, Large Scale In-Game Data Collection
                       | PlayOn 1.0                            | PlayOn 2.0
      Coverage         | Entire population on 5 US WoW servers | 100s of characters spread across 280 servers in US, TW, HK
      Infrastructure   | 5 old PCs                             | 10 virtual machines per 2.26 GHz Quad Core Mac Pro
      Code             | Static scripts and configuration      | Data collection code dynamically generated at runtime
      Data             | Local CSV files                       | Stateless VMs with central SQL database
  15. A New Data Source: The Armory
      • Interactive Web app displaying character stats
          • Range from amount of damage done to number of hugs given
      • Daily jobs to ‘stitch’ multiple XML files per character
          • Python/Django on two Mac Minis
          • Extract to SQL using XPath queries
          • Biggest issue: unknown rate limits
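The "extract to SQL using XPath queries" step from slide 15 might look roughly like the sketch below. The XML layout, the `stats` table, and the function name `load_statistics` are assumptions made for illustration; the real Armory schema is not described in the slides. The sketch uses the Python standard library (ElementTree's limited XPath support and an in-memory SQLite database) rather than Django.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the actual Armory format is not shown in the slides.
SAMPLE_XML = """
<character name="Ayla" level="80">
  <statistics>
    <statistic name="hugs" value="41"/>
    <statistic name="damage_done" value="1200000"/>
  </statistics>
</character>
"""


def load_statistics(xml_text, conn):
    """Pull every <statistic> element out of one character file and store it."""
    root = ET.fromstring(xml_text)
    name = root.get("name")
    rows = [
        (name, stat.get("name"), int(stat.get("value")))
        # ElementTree supports a limited XPath subset, enough for this path.
        for stat in root.findall("./statistics/statistic")
    ]
    conn.executemany(
        "INSERT INTO stats (character, stat, value) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (character TEXT, stat TEXT, value INTEGER)")
inserted = load_statistics(SAMPLE_XML, conn)
hugs = conn.execute(
    "SELECT value FROM stats WHERE character = ? AND stat = ?", ("Ayla", "hugs")
).fetchone()[0]
print(inserted, hugs)
```

Storing one row per (character, statistic) keeps the schema stable when new statistics appear in the source files.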
  16. RW + VW Variables
      • PlayOn 1.0: VW -> VW
          • Social Network Metrics -> Guild Survival
      • RW -> VW
          • Gender differences in game-play
          • Cultural / Personality differences
      • VW -> RW
          • Can we use virtual behaviors to predict RW gender or personality?
  17. Deluge of Variables
      • “All the variables” is a dangerous request
          • Opposite situation of PlayOn 1.0
      • Extract or generate high-level variables
          • Instead of specific zones, focus on percentage of all zones visited
      • Aggregates reduce statistical variance
          • Aggregates are more stable
          • Leads to more robust statistical tests
      • Map to more meaningful psychological concepts
          • Visiting any one zone is difficult to interpret
          • Percentage of all zones visited maps more cleanly to a concept of geographical exploration
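As a small illustration of the slide 17 idea, the sketch below collapses per-zone visit flags into one "percentage of all zones visited" variable. The zone list and the function name `exploration_pct` are hypothetical.

```python
# Hypothetical zone catalogue; the real game has many more zones.
ALL_ZONES = {"Elwynn Forest", "Durotar", "Stranglethorn", "Tanaris", "Winterspring"}


def exploration_pct(visited, all_zones=ALL_ZONES):
    """Percentage of known zones a character has visited at least once."""
    return 100.0 * len(set(visited) & all_zones) / len(all_zones)


# Repeat visits to the same zone do not inflate the score.
pct = exploration_pct(["Durotar", "Tanaris", "Durotar"])
print(pct)
```

One aggregate of this kind replaces a separate binary variable per zone, which is the variance-reduction argument the slide makes.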
  18. The Alts Problem
      • Players have main characters and alternate characters
          • # of alternate characters varies
      • Aggregating characters
          • Most metrics don’t scale linearly
          • Can’t simply average lvl 60 and lvl 80
      • Same problem comparing between players
          • If Player A has lvl 60 and Player B has lvl 80
          • Have to filter out that noise before analysis
  19. Normalization Strategies
      • Static character attributes (e.g., Character Gender)
          • Normalize against total # of characters
          • Male Ratio = # of Male Characters / # of Total Characters
      • Variable character attributes (e.g., Combat Role)
          • Normalize against time played
          • Tank Ratio = Time Tanking / Total Playing Time
      • Partitioned Variables (e.g., Healing Done)
          • Normalize against internal aggregate variable
          • Exploration Ratio = Exploration Achievements / Total Achievements
          • Healing Ratio = Total Healing Done / Total Damage Done
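The three ratios on slide 19 can be computed per player by pooling that player's characters before dividing. The `Character` record and its field names below are assumptions for illustration; only the ratio definitions come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Character:
    # Hypothetical per-character record; field names are illustrative.
    gender: str
    hours_played: float
    hours_tanking: float
    healing_done: int
    damage_done: int


def male_ratio(chars):
    """Static attribute: normalize against the number of characters."""
    return sum(c.gender == "male" for c in chars) / len(chars)


def tank_ratio(chars):
    """Variable attribute: normalize against total time played."""
    total = sum(c.hours_played for c in chars)
    return sum(c.hours_tanking for c in chars) / total if total else 0.0


def healing_ratio(chars):
    """Partitioned variable: normalize against an internal aggregate."""
    damage = sum(c.damage_done for c in chars)
    return sum(c.healing_done for c in chars) / damage if damage else 0.0


player = [
    Character("male", 100.0, 50.0, 1000, 9000),
    Character("female", 300.0, 50.0, 4000, 1000),
]
print(male_ratio(player), tank_ratio(player), healing_ratio(player))
```

Summing numerators and denominators across characters before dividing weights each character by how much it was actually played, which averaging per-character ratios would not do.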
  20. Normalization Strategies
      • Not possible to normalize and highly dependent on character level
          • E.g., # of vanity pets / mounts
          • Extract the maximum
          • 1 Character with 80 mounts vs. 4 Characters with 20 mounts each
      • Not possible to normalize but not dependent on character level
          • E.g., # of /hugs
          • Any character of any level can hug as often as they’d like
          • Calculate the sum across all their characters
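The two fallback rules on slide 20 (take the maximum for level-dependent counts, take the sum for level-independent ones) can be sketched as follows. The dictionary keys and the function name `aggregate_player` are hypothetical.

```python
def aggregate_player(characters):
    """Collapse a player's characters into one record.

    Mounts depend heavily on character level, so keep the maximum.
    Hugs do not depend on level, so add them up across characters.
    """
    return {
        "mounts": max(c["mounts"] for c in characters),
        "hugs": sum(c["hugs"] for c in characters),
    }


# The slide's example: one character with 80 mounts vs. four with 20 each.
one_main = [{"mounts": 80, "hugs": 10}]
four_alts = [{"mounts": 20, "hugs": 5} for _ in range(4)]

print(aggregate_player(one_main))
print(aggregate_player(four_alts))
```

Using the maximum keeps the four-alt player at 20 mounts instead of crediting them with 80, which is the distinction the slide's example draws.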
  21. Gender-Bending
  22. Hugs
  23. Role Preference
  24. Association Rule Mining & HotSpot
      • Association Rule Mining
          • Find item sets that occur with high frequency
          • Originally used for basket analysis (e.g., supermarkets)
          • {milk, bread} -> {butter}
      • HotSpot
          • Special implementation in Weka
          • Capable of handling numeric attributes
          • Allows specification of target class
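To make the {milk, bread} -> {butter} example on slide 24 concrete, the sketch below computes the two standard association-rule measures, support and confidence, on a toy basket list. The transactions are invented, and this is plain Python for illustration, not Weka's HotSpot implementation.

```python
# Invented supermarket baskets for illustration.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "bread", "butter", "eggs"},
    {"bread", "eggs"},
    {"milk", "butter"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also hold the consequent."""
    base = support(antecedent, baskets)
    return support(antecedent | consequent, baskets) / base if base else 0.0


rule_support = support({"milk", "bread", "butter"}, transactions)
rule_confidence = confidence({"milk", "bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence)
```

Here three of the five baskets contain milk and bread, and two of those also contain butter, so the rule holds in two out of three qualifying baskets.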
  25. HotSpot Rules (Gender)
      Male
        Condition 1           | Condition 2 | Precision | Recall
        % of Male Chars > 30% |             | 94%       | 81%
        % of PvP Achs > 7%    |             | 81%       | 82%
        Sum Hugs < 41         |             | 80%       | 82%
      Female
        Condition 1         | Condition 2            | Condition 3      | Precision | Recall
        Male Chars = 0      | % Profession Achs > 9% | % Tank <= 37%    | 81%       | 57%
        % Male Chars <= 13% | % Profession Achs > 9% | % PvP Achs < 13% | 80%       | 57%
  26. Rules Classifier
      • Predicting the entire class
          • JRip algorithm in Weka (rule learner)
          • 10-fold cross-validation
      Class   | Precision | Recall | F-Measure
      Male    | .92       | .91    | .91
      Female  | .76       | .77    | .76
      Overall | .88       | .87    | .88
      Confusion matrix (rows: actual class, columns: classified as)
              | Male | Female
      Male    | 695  | 67
      Female  | 63   | 210
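The per-class figures on slide 26 can be re-derived from the confusion matrix. The sketch below assumes rows are the actual class and columns the predicted class, a reading that reproduces the reported precision and recall; the function name `class_metrics` is hypothetical.

```python
# Confusion matrix from slide 26, read as confusion[actual][predicted].
confusion = {
    "Male": {"Male": 695, "Female": 67},
    "Female": {"Male": 63, "Female": 210},
}


def class_metrics(matrix, label):
    """Return (precision, recall, F-measure) for one class."""
    true_pos = matrix[label][label]
    predicted = sum(row[label] for row in matrix.values())  # column total
    actual = sum(matrix[label].values())                    # row total
    precision = true_pos / predicted
    recall = true_pos / actual
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


for label in confusion:
    p, r, f = class_metrics(confusion, label)
    print(label, round(p, 2), round(r, 2), round(f, 2))

total = sum(sum(row.values()) for row in confusion.values())
accuracy = sum(confusion[label][label] for label in confusion) / total
print("accuracy", round(accuracy, 2))
```

The gap between the Male and Female rows reflects the roughly 80-20 class imbalance noted earlier in the deck: the minority class is harder to predict well.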
  27. Region/Personality Findings
      • Agreeableness (Trusting / Friendly)
          • High: More hugs, more cheers, more crafting professions, less PvP
          • Low: More dungeons, more PvP, fewer hugs
      • Openness (New Ideas / Abstract Thinking)
          • High: More exploration, more non-combat achievements
          • Low: Less exploration, more combat/dungeon achievements
      • Region
          • HK+TW: More need rolls, fewer cheers and waves, more dungeons
          • US: Fewer need rolls, more cheers and waves, fewer dungeons and raids
  28. Business Application: Predictive Social Analytics (PSA)
      • The problem: effective online marketing depends on a broad range of offline variables
      • Predictive Social Analytics (PSA): infer socio-psychological traits purely from online behaviors
      • Predicted variables: from basic (age, gender) to more complex (personality, education, ethnicity)
      • Broad range of data sources: virtual worlds, Web 2.0 sites, enterprise networks…
  29. From Vision to Reality
      • Validation: is it even feasible to infer socio-psychological traits from online behaviors?
      • Specification: which particular set of traits can be reliably inferred? How?
      • Generalization: are the predictive variables from virtual worlds transferable to other online spaces?
      • Implementation: actionable tools for automated segmentation, targeted marketing, and other business intelligence
  30. From Vision to Reality
