AWS Game Analytics - GDC 2014

Use AWS to learn how much players love your game by analyzing in-game metrics to measure engagement and retention. Start simple by uploading data to S3 and analyzing it with Redshift. Then add more game data sources and dive deeper with cohort analysis. Finally, I cover real-time analytics with Kinesis and Spark.

Published in: Technology

  1. AWS Gaming Solutions | GDC 2014. Game Analytics with AWS. Or, How to learn what your players love so they will love your game. Nate Wiger @nateware | Principal Gaming Solutions Architect
  2. Mobile Game Landscape
     • Free To Play
     • In-App Purchases
     • Long-Tail
     • Cross-Platform
     • Go Global
     • User Retention = Revenue
  3. Projected Mobile App Revenue [chart: revenue by year, 2011 to 2017, split into Ads, IAP, and Paid; y-axis 0 to 90000]. Source: Gartner
  4. Winning at Free to Play
     • Phase 1: Collect Data
     • Phase 2: Analyze
     • Phase 3: Profit
  5. Analyze What?
     • Emotions
       – Enjoying game
       – Engaged
       – Like/dislike new content
       – Stuck on a level
       – Bored
       – Abandonment
     • Behaviors
       – Hours played day/week
       – Number of sessions/day
       – Level progression
       – Friend invites/referrals
       – Response to mobile push
       – Money spent/week
  6. Example: Level Progression (One Metric) [chart "Tries / Level": # of Tries for levels L1 to L10; y-axis 0 to 10]
  7. Example: Level Progression (Two Metrics) [chart "Tries / Level": # of Tries (axis 0 to 10) and % Highest Level (axis 0 to 60) for levels L1 to L10]
  8. Key Takeaways
     • Multiple data sources
     • Correlate variables
     • Deltas vs absolutes
     • Settle on terminology (game vs level)
     • Time matters
  9. [No text on slide]
  10. Events & Metrics
      • Event = Moment in Time
        – Login/quit
        – Game start/end
        – Level up
        – In-app purchase
      • Metrics = What to Measure
        – KISS
        – Numbers
        – Booleans
        – Strings (Enums)
      • Always Include (ALWAYS)
        – User
        – Action
        – Session (context-dependent)
        – Timestamp in ISO8601: 2014-03-16T16:28:26
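The "Always Include" fields above can be sketched as a small event formatter. This is an illustration only: the function name `format_event`, the `ALLOWED_ACTIONS` set, and the field order (timestamp, user, session, action) are assumptions, chosen to match the sample event lines that appear later in the deck.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical enum of actions; the slide only says "Strings (Enums)".
ALLOWED_ACTIONS = {"login", "quit", "gamestart", "gameend", "levelup", "purchase"}


def format_event(user, session, action, when=None):
    """Return one CSV line: timestamp,user,session,action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    when = when or datetime.now(timezone.utc)
    # ISO 8601 with an explicit UTC offset, truncated to whole seconds.
    timestamp = when.replace(microsecond=0).isoformat()
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow([timestamp, user, session, action])
    return buf.getvalue()


line = format_event(
    "nateware", "e4df", "login",
    datetime(2014, 3, 16, 16, 28, 26, tzinfo=timezone.utc),
)
print(line)
```

Rejecting unknown actions at write time keeps the action column a true enum, which makes later `group by` queries less noisy.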
  11. Off The Shelf Analytics
      • Easy To Integrate
      • Pre-Baked Reports
      • Rate Limits
      • Retention Windows
      • Data Lock-In
  12. Ok, A Real Business Plan
      • Ingest
      • Store
      • Process
      • Analyze
  13. Ok, A Real Business Plan
      • Ingest: HTTP PUT, Kafka, Kinesis, Scribe
      • Store: S3, DynamoDB, HDFS, Redshift
      • Process: EMR (Hadoop), Spark, Storm
      • Analyze: Tableau, Pentaho, Jaspersoft
  14. Start Simple
      • Write Events File on Device
      • Periodically Upload to S3
      • Process into Redshift
      • Point GUI Tool to Redshift
      • Profit!
      Sample events file:
          2014-01-24,nateware,e4df,login
          2014-01-24,nateware,e4df,gamestart
          2014-01-24,nateware,e4df,gameend
          2014-01-25,nateware,a88c,login
          2014-01-25,nateware,a88c,friendlist
          2014-01-25,nateware,a88c,gamestart
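The first two steps above (write an events file on the device, upload it periodically) might look roughly like the sketch below. The `EventBuffer` class and its `upload` callback are hypothetical; in a real client the callback would perform the S3 upload, which is stubbed out here so the example runs without credentials.

```python
import os
import tempfile


class EventBuffer:
    """Append events to a local file; hand the file to an uploader once it is big enough."""

    def __init__(self, path, upload, max_lines=100):
        self.path = path
        self.upload = upload        # hypothetical callable: upload(path) -> None
        self.max_lines = max_lines
        self.pending = 0

    def record(self, line):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        self.pending += 1
        if self.pending >= self.max_lines:
            self.flush()

    def flush(self):
        if self.pending == 0:
            return
        self.upload(self.path)      # if this raises, the file is kept for a retry
        os.remove(self.path)        # start a fresh file after a successful upload
        self.pending = 0


uploaded = []


def fake_upload(path):
    """Stand-in for an S3 upload: just remember what would have been sent."""
    with open(path, encoding="utf-8") as f:
        uploaded.append(f.read().splitlines())


path = os.path.join(tempfile.mkdtemp(), "events.csv")
buf = EventBuffer(path, fake_upload, max_lines=3)
for action in ["login", "gamestart", "gameend", "login"]:
    buf.record(f"2014-01-24,nateware,e4df,{action}")
buf.flush()  # e.g. when the app goes to the background
print(uploaded)
```

Batching by line count is only one option; a timer or a file-size threshold would work the same way.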
  15. Redshift at a Glance
      [diagram: SQL Clients/BI Tools connect over JDBC/ODBC to the Leader Node; Compute Nodes (128GB RAM, 16TB disk, 16 cores each) are linked by 10 GigE (HPC); Ingestion, Backup, and Restore go through Amazon S3/DynamoDB]
      • Leader Node
        – SQL endpoint
        – Stores metadata
        – Coordinates query execution
      • Compute Nodes
        – Columnar table storage
        – Load, backup, restore via Amazon S3
        – Parallel load from Amazon DynamoDB
      • Single node version available
  16. Tableau + Redshift
  17. Plumbing
      ① Create S3 bucket ("mygame-analytics-events")
      ② Request a security token for your mobile app: http://docs.aws.amazon.com/STS/latest/UsingSTS/Welcome.html
      ③ Upload data from your users' devices
      ④ Run a scheduled copy to Redshift
      ⑤ Set up Tableau to access Redshift
      ⑥ Go to the Beach
  18. Loading Redshift from S3
          copy events
          from 's3://mygame-analytics-events'
          credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
          delimiter ',';
      Scheduled Redshift Load using Data Pipeline: http://aws.amazon.com/articles/1143507459230804
  19. More Data Sources
      • Also Collect Server Logs
      • Periodically Upload to S3
      • Stuff into Redshift
      • External Analytics Data Too
  20. Logrotate to S3
          /var/log/apache2/*.log {
              sharedscripts
              postrotate
                  sudo /usr/sbin/apache2ctl graceful
                  s3cmd sync /var/log/apache2/*.gz s3://mygame-logs/
              endscript
          }
      Blog Entry on Log Rotation: http://www.dowdandassociates.com/blog/content/howto-rotate-logs-to-s3/
      And/or, Use ELB Access Logs: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/access-log-collection.html
  21. Dealing With Messy Data
      • Different File Formats
      • Device vs Apache vs CDN
      • Cleanup with EMR Job
      • Output to Clean Bucket
      • Load into Redshift
  22. Redshift vs Elastic MapReduce
      • Redshift
        – Columnar DB
        – Familiar SQL
        – Structured Data
        – Batch Load
        – Faster to Query
        – Long-term Storage
      • Elastic MapReduce
        – Hadoop
        – Hive/Pig are SQL-like
        – Unstructured Data
        – Streaming Loop
        – Scales > PB's
        – Transient
  23. Direct From DynamoDB
      • Integrate Game DB
      • Load Directly into Redshift
      • Redshift does Intelligent Merge
      • Tracks Hash Keys, Columns
  24. Direct From DynamoDB
      • Integrate Game DB
      • Load Directly into Redshift
      • Redshift does Intelligent Merge
      • Tracks Hash Keys, Columns
      • Or Stream into EMR
  25. Loading Redshift from DynamoDB
          -- readratio is required for DynamoDB loads; 50 is only an example value
          copy games
          from 'dynamodb://games'
          credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
          readratio 50;

          copy events
          from 's3://mygame-analytics-events'
          credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
          delimiter ',';
  26. [No text on slide]
  27. Funnel Cake
  28. Back To Basics
          2014-01-24,nateware,e4df,login
          2014-01-24,nateware,e4df,gamestart
          2014-01-24,nateware,e4df,gameend
          2014-01-25,nateware,a88c,login
          2014-01-25,nateware,a88c,friendlist
          2014-01-25,nateware,a88c,gamestart
  29. Measure Retention: Repeated Plays
          create view events_by_user_by_month as
          select user_id,
                 date_trunc('month', event_date) as month_active,
                 count(*) as total_events
          from events
          group by user_id, month_active;
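To try the retention view above without a Redshift cluster, the sketch below runs an equivalent query in SQLite from Python. Several things here are assumptions rather than the talk's setup: SQLite has no `date_trunc`, so `strftime('%Y-%m-01', ...)` stands in for it; the `session_id` column name is invented; and the last sample row is hypothetical, added so that two months show up.

```python
import sqlite3

events = [
    ("2014-01-24", "nateware", "e4df", "login"),
    ("2014-01-24", "nateware", "e4df", "gamestart"),
    ("2014-01-24", "nateware", "e4df", "gameend"),
    ("2014-01-25", "nateware", "a88c", "login"),
    ("2014-01-25", "nateware", "a88c", "friendlist"),
    ("2014-01-25", "nateware", "a88c", "gamestart"),
    ("2014-02-02", "nateware", "b1c2", "login"),  # hypothetical extra row
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table events (event_date text, user_id text, session_id text, action text)"
)
conn.executemany("insert into events values (?, ?, ?, ?)", events)

# Same shape as the slide's view; strftime replaces Redshift's date_trunc.
conn.execute("""
    create view events_by_user_by_month as
    select user_id,
           strftime('%Y-%m-01', event_date) as month_active,
           count(*) as total_events
    from events
    group by user_id, month_active
""")

rows = conn.execute(
    "select user_id, month_active, total_events "
    "from events_by_user_by_month order by user_id, month_active"
).fetchall()
print(rows)
```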
  30. First-Pass Retention – Too Noisy [chart: # Play Sessions / Month for users nateware, Lazyd0g, AK187, 3strikes; y-axis 0 to 40]
  31. Cohorts & Cambria
      • Enables calculating relative metrics
      • Group users by a common attribute
        – Month game installed
        – Demographics
      • Run analysis by cohort
        – Join with metrics
      • Use Redshift, since it speaks SQL
        – Example of where SQL is a good fit
  32. Creating Cohorts with Redshift
          create view cohort_by_first_event_date as
          select user_id,
                 date_trunc('month', min(event_date)) as first_month
          from events
          group by user_id;
      http://snowplowanalytics.com/analytics/customer-analytics/cohort-analysis.html
  33. Retention by Cohort – Join Events with Cohort [chart: # Sessions / Week for Week 1, Week 2, Week 3, Week 5, Week 6, Week 7, one series per cohort: 2013-11, 2013-12, 2014-01, 2014-02, 2014-03, 2014-04; y-axis 0 to 25]
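The join named in the slide title above can be sketched in SQLite from Python. This is a sketch under assumptions: the users, dates, and session ids are made up; `strftime` stands in for Redshift's `date_trunc`; and sessions are counted per month rather than per week to keep the sample data small.

```python
import sqlite3

# Hypothetical sample data: (event_date, user_id, session_id)
events = [
    ("2013-11-05", "alice", "s1"),
    ("2013-12-10", "alice", "s2"),
    ("2014-01-03", "alice", "s3"),
    ("2013-11-20", "bob", "s4"),
    ("2013-12-01", "bob", "s5"),
    ("2013-12-15", "carol", "s6"),
    ("2014-01-09", "carol", "s7"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table events (event_date text, user_id text, session_id text)")
conn.executemany("insert into events values (?, ?, ?)", events)

# Cohort = month of each user's first event.
conn.execute("""
    create view cohort_by_first_event_date as
    select user_id,
           strftime('%Y-%m-01', min(event_date)) as first_month
    from events
    group by user_id
""")

# Join events with the cohort view and count sessions per cohort per month.
retention = conn.execute("""
    select c.first_month,
           strftime('%Y-%m-01', e.event_date) as month_active,
           count(distinct e.session_id) as sessions
    from events e
    join cohort_by_first_event_date c on e.user_id = c.user_id
    group by c.first_month, month_active
    order by c.first_month, month_active
""").fetchall()

for row in retention:
    print(row)
```

Reading one cohort's rows left to right gives its retention curve, which is what makes cohorts comparable even though they started in different months.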
  34. Moar Cohorts
      • Define multiple cohorts
        – By activity, time, demographics
        – As many as you like
      • Change cohort depending on analysis
      • Join same metrics with different cohorts
        – Retention by date
        – Retention by demographic
        – Retention by average plays/month quartile
  35. Example Event Stream
          2014-03-17T09:52:08-07:00,nateware,e4b5,login
          2014-03-17T09:52:54-07:00,nateware,e4b5,gamestart
          2014-03-17T09:53:15-07:00,nateware,e4b5,levelup
          2014-03-17T09:54:06-07:00,nateware,e4b5,gameend
          2014-03-17T09:54:23-07:00,nateware,30a4,gamestart
          2014-03-17T09:55:14-07:00,nateware,30a4,gameend
          2014-03-17T09:55:41-07:00,nateware,30a4,gamestart
          2014-03-17T09:57:12-07:00,nateware,6ebd,levelup
          2014-03-17T09:58:50-07:00,nateware,6ebd,levelup
          2014-03-17T09:59:52-07:00,nateware,6ebd,gameend
  36. Example Event Stream (repeats the events from slide 35)
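As one example of what can be derived from an event stream like the one above, the sketch below groups events by session and measures how long each session lasted. The function name `session_spans` is hypothetical, and treating "first event to last event" as the session length is an assumption, not something the slides define.

```python
from datetime import datetime

STREAM = """\
2014-03-17T09:52:08-07:00,nateware,e4b5,login
2014-03-17T09:52:54-07:00,nateware,e4b5,gamestart
2014-03-17T09:53:15-07:00,nateware,e4b5,levelup
2014-03-17T09:54:06-07:00,nateware,e4b5,gameend
2014-03-17T09:54:23-07:00,nateware,30a4,gamestart
2014-03-17T09:55:14-07:00,nateware,30a4,gameend
2014-03-17T09:55:41-07:00,nateware,30a4,gamestart
2014-03-17T09:57:12-07:00,nateware,6ebd,levelup
2014-03-17T09:58:50-07:00,nateware,6ebd,levelup
2014-03-17T09:59:52-07:00,nateware,6ebd,gameend
"""


def session_spans(lines):
    """Map (user, session) to seconds between its first and last event."""
    spans = {}
    for line in lines:
        ts, user, session, _action = line.split(",")
        when = datetime.fromisoformat(ts)  # handles the -07:00 offset
        first, last = spans.get((user, session), (when, when))
        spans[(user, session)] = (min(first, when), max(last, when))
    return {key: (last - first).total_seconds() for key, (first, last) in spans.items()}


durations = session_spans(STREAM.splitlines())
for (user, session), seconds in durations.items():
    print(user, session, seconds)
```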
  37. Cohorts by Type of Activity
          create view cohort_by_first_play_date as
          select user_id,
                 date_trunc('month', min(event_date)) as first_month
          from events
          where action = 'gamestart'
          group by user_id;
  38. [No text on slide]
  39. Post-Match Heatmaps
  40. Real-Time Analytics
      • Batch
        – What game modes do people like best?
        – How many people have downloaded DLC pack 2?
        – Where do most people die on map 4?
        – How many daily players are there on average?
      • Real-Time
        – What game modes are people playing now?
        – Are more or fewer people downloading DLC today?
        – Are people dying in the same places? Different?
        – How many people are playing today? Variance?
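A real-time question such as "how many people are playing now?" is usually answered over a sliding time window. The sketch below is one minimal way to do that in memory; the `ActivePlayers` class is hypothetical, timestamps are plain seconds, and events are assumed to arrive in non-decreasing time order.

```python
from collections import deque


class ActivePlayers:
    """Count distinct users seen within the last `window` seconds."""

    def __init__(self, window=300):
        self.window = window
        self.events = deque()  # (timestamp, user) in arrival order
        self.counts = {}       # user -> number of events still inside the window

    def add(self, timestamp, user):
        self.events.append((timestamp, user))
        self.counts[user] = self.counts.get(user, 0) + 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old_user = self.events.popleft()
            self.counts[old_user] -= 1
            if self.counts[old_user] == 0:
                del self.counts[old_user]

    def active(self, now):
        self._expire(now)
        return len(self.counts)


tracker = ActivePlayers(window=60)
tracker.add(0, "nateware")
tracker.add(10, "Lazyd0g")
tracker.add(50, "nateware")
print(tracker.active(55))   # both users are inside the 60-second window
print(tracker.active(75))   # only the event at t=50 is still inside
print(tracker.active(200))  # everything has expired
```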
  41. Why Real-Time Analytics?
      • 30x in 24 hours
      • What if you ran a promo?
  42. Real-Time Tools
      • Spark
        – High-Performance Hadoop Alternative
        – Berkeley.edu
        – Compatible with HiveQL
        – Up to 100x faster than Hadoop MapReduce (in-memory workloads)
        – Runs on EMR
      • Kinesis
        – Amazon fully-managed streaming data layer
        – Similar to Kafka
        – Streams contain Shards
        – Each Shard ingests data up to 1MB/sec, 1000 TPS
        – Data stored for 24 hours
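The per-shard limits quoted above translate into a quick sizing calculation. The sketch below is only arithmetic on the slide's numbers: `shards_needed` is a hypothetical helper, and reading "1MB" as 10^6 bytes is an assumption (the conservative choice, since it never underestimates the shard count).

```python
import math

SHARD_BYTES_PER_SEC = 1_000_000   # "1MB/sec" from the slide, read as 10^6 bytes (assumption)
SHARD_RECORDS_PER_SEC = 1000      # "1000 TPS" from the slide


def shards_needed(records_per_sec, avg_record_bytes):
    """Smallest shard count that satisfies both the byte and the record limit."""
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / SHARD_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


# Many small events: the record-count limit dominates.
print(shards_needed(records_per_sec=5000, avg_record_bytes=100))
# Fewer, larger events: the byte limit dominates.
print(shards_needed(records_per_sec=500, avg_record_bytes=10_000))
```

Game events tend to be small, so the record-count limit is often the one that decides the shard count.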
  43. Back To Basics [Dubstep Remix]
      • Always Batch Due to S3
  44. Need Data Faster!
      • Stream Data With Kinesis
      • Multiple Writers and Readers
      • Still Output to Redshift
  45. Lots of Ins and Outs
      • Stream Data With Kinesis
      • Multiple Writers and Readers
      • Still Output to Redshift
      • Stream to Spark on EMR
      • Storm via Kinesis Spout
      • Custom EC2 Workers
  46. Introducing Amazon Kinesis: Service for Real-Time Big Data Ingestion
      [diagram: Data Sources send to an AWS Endpoint; the stream's shards (Shard 1, Shard 2, ... Shard N) span three Availability Zones; App.1 [Aggregate & De-Duplicate], App.2 [Metric Extraction], App.3 [Sliding Window Analysis], and App.4 [Machine Learning] consume the stream; outputs go to S3, DynamoDB, Redshift]
  47. Putting Data into Kinesis
      • Producers use PUT to send data to a Stream
      • PutRecord {Data, PartitionKey, StreamName}
      • Partition Key distributes PUTs across Shards
      • Unique Sequence # returned on PUT call
      • Documentation: http://docs.aws.amazon.com/kinesis/latest/dev/introduction.html
      [diagram: many Producers writing into Kinesis, which holds Shard 1, Shard 2, Shard 3, Shard 4, ... Shard n]
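How a partition key spreads PUTs across shards can be illustrated with a short sketch. Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The sketch assumes the hash space is split into equal ranges, which is a simplification; a real stream reports each shard's actual range, and `shard_index` is a hypothetical helper, not an AWS API.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_index(partition_key, shard_count):
    """Return the 0-based shard a key would land on, assuming equal hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


# The same key always lands on the same shard, so per-key ordering is preserved.
for key in ["e4b5", "30a4", "6ebd"]:
    print(key, shard_index(key, shard_count=4))
```

Choosing a high-cardinality key (a session or user id rather than, say, a map name) is what keeps the load spread evenly.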
  48. Writing to a Kinesis Stream
          POST / HTTP/1.1
          Host: kinesis.<region>.<domain>
          x-amz-Date: <Date>
          Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=content-type;date;host;user-agent;x-amz-date;x-amz-target;x-amzn-requestid, Signature=<Signature>
          User-Agent: <UserAgentString>
          Content-Type: application/x-amz-json-1.1
          Content-Length: <PayloadSizeBytes>
          Connection: Keep-Alive
          X-Amz-Target: Kinesis_20131202.PutRecord

          {
            "StreamName": "exampleStreamName",
            "Data": "XzxkYXRhPl8x",
            "PartitionKey": "partitionKey"
          }
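The JSON body of the request above can be built with the standard library alone. The sketch covers only the payload: the `Data` field is the base64 encoding of the raw record bytes, and the helper name `put_record_body` is hypothetical. Signing the request (the `Authorization` header) is deliberately left out; an AWS SDK would normally handle that.

```python
import base64
import json


def put_record_body(stream_name, data, partition_key):
    """Build the JSON payload for a PutRecord call; `data` is raw bytes."""
    return json.dumps({
        "StreamName": stream_name,
        "Data": base64.b64encode(data).decode("ascii"),
        "PartitionKey": partition_key,
    })


body = put_record_body("exampleStreamName", b"_<data>_1", "partitionKey")
print(body)
```

The sample bytes `_<data>_1` were chosen because they encode to the same `Data` string shown in the request above.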
  49. Kinesis + Spark: http://aws.amazon.com/articles/4926593393724923
  50. Death in Real-Time
          PUT "kills" {"game_id":"e4b5","map":"Boston","killer":38,"victim":39,"coord":"274,591,48"}
          PUT "kills" {"game_id":"e4b5","map":"Boston","killer":13,"victim":27,"coord":"101,206,35"}
          PUT "kills" {"game_id":"e4b5","map":"Boston","killer":38,"victim":39,"coord":"165,609,17"}
          PUT "kills" {"game_id":"e4b5","map":"Boston","killer":6,"victim":29,"coord":"120,422,26"}
          PUT "kills" {"game_id":"30a4","map":"Los Angeles","killer":34,"victim":18,"coord":"163,677,18"}
          PUT "kills" {"game_id":"30a4","map":"Los Angeles","killer":20,"victim":37,"coord":"71,473,20"}
          PUT "kills" {"game_id":"30a4","map":"Los Angeles","killer":21,"victim":19,"coord":"332,381,17"}
          PUT "kills" {"game_id":"30a4","map":"Los Angeles","killer":0,"victim":10,"coord":"14,108,25"}
          PUT "kills" {"game_id":"6ebd","map":"Seattle","killer":32,"victim":18,"coord":"13,685,32"}
          PUT "kills" {"game_id":"6ebd","map":"Seattle","killer":7,"victim":14,"coord":"16,233,16"}
          PUT "kills" {"game_id":"6ebd","map":"Seattle","killer":27,"victim":19,"coord":"16,498,29"}
          PUT "kills" {"game_id":"6ebd","map":"Seattle","killer":1,"victim":38,"coord":"138,732,21"}
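Kill records like the ones above become a heatmap once their coordinates are binned into grid cells and counted. The sketch below shows that aggregation step only; it assumes `coord` is "x,y,z" in map units, and the `heatmap` helper and the cell sizes are hypothetical. In the streaming setup, the same counting would run inside a consumer reading from the stream.

```python
import json
from collections import Counter


def heatmap(records, cell=100):
    """Count kills per (map, x cell, y cell); `cell` is the grid size in map units."""
    counts = Counter()
    for raw in records:
        kill = json.loads(raw)
        x, y, _z = (int(v) for v in kill["coord"].split(","))
        counts[(kill["map"], x // cell, y // cell)] += 1
    return counts


boston = [
    '{"game_id":"e4b5","map":"Boston","killer":38,"victim":39,"coord":"274,591,48"}',
    '{"game_id":"e4b5","map":"Boston","killer":13,"victim":27,"coord":"101,206,35"}',
    '{"game_id":"e4b5","map":"Boston","killer":38,"victim":39,"coord":"165,609,17"}',
    '{"game_id":"e4b5","map":"Boston","killer":6,"victim":29,"coord":"120,422,26"}',
]

fine = heatmap(boston)              # 100-unit cells
coarse = heatmap(boston, cell=400)  # 400-unit cells merge nearby kills
print(fine)
print(coarse)
```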
  51. Real-Time Heatmaps
  52. Put A Bow On It
      • Collect data from the start
      • Store it even if you can't process it (yet)
      • Start simple – S3 + Redshift
      • Add data sources – process with EMR
      • Real-time – Kinesis + Spark
      • Tons of untapped potential for gaming
  53. Fallback Plan. Cheers – Nate Wiger @nateware
