The document discusses creating a gaming analytics platform using Google Cloud Platform. It describes collecting diverse data from sources like user acquisition campaigns, app stores, and custom game events. The data is analyzed using standard metrics like DAU, MAU, and retention as well as custom metrics specific to each game. It recommends using BigQuery for batch processing and Cloud Dataflow for real-time stream processing. Cloud Dataflow allows processing data from batch and streaming sources together and offers features like autoscaling and liquid sharding. The document provides examples of using Cloud Dataflow to calculate real-time user scores and team scores from game data streamed through Pub/Sub.
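The real-time user-score and team-score computation described above can be sketched in plain Python — a conceptual stand-in for the Beam/Dataflow pipeline reading game events from Pub/Sub (the event shape and function names here are illustrative assumptions, not from the talk):

```python
from collections import defaultdict

# Each event as parsed from a Pub/Sub message: (user, team, score).
events = [
    ("alice", "red", 10),
    ("bob", "blue", 5),
    ("alice", "red", 7),
    ("carol", "blue", 3),
]

def user_scores(events):
    """Sum scores per user, as the per-user branch of the pipeline would."""
    totals = defaultdict(int)
    for user, _team, score in events:
        totals[user] += score
    return dict(totals)

def team_scores(events):
    """Sum scores per team, as the per-team branch would."""
    totals = defaultdict(int)
    for _user, team, score in events:
        totals[team] += score
    return dict(totals)

print(user_scores(events))  # {'alice': 17, 'bob': 5, 'carol': 3}
print(team_scores(events))  # {'red': 17, 'blue': 8}
```

In the actual Dataflow version, these two aggregations would be windowed `GroupByKey`/`Sum` transforms fed by an unbounded Pub/Sub source rather than an in-memory list.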
SEC302: Twitter's GCP Architecture for its petabyte-scale data storage in GCS (Vrushali Channapattan)
Twitter collects petabytes of data every day and empowers its engineers and data scientists to process it at scale with a hybrid on-premises and cloud model. In this talk, we will look at its GCP architecture and resource hierarchy. We will deep dive into the storage design that uses Google Cloud Storage to organize petabytes of data replicated from on-premises HDFS clusters. We will take a look at how the user-management tooling has been designed to create and manage access for thousands of accounts (human and service accounts) at Twitter. We will talk about how the design deals with security measures for accounts and tooling systems running in GCP and the complexities of dataset permissions. We will share the challenges we faced as we tried to design our system at scale, along with our learnings and solutions.
Twitter's Data Platform is built from multiple complex open source and in-house projects to support data analytics on hundreds of petabytes of data. Our platform supports storage, compute, data ingestion, discovery and management, and various tools and libraries that help users with both batch and realtime analytics. Our Data Platform operates on multiple clusters across different data centers to help thousands of users discover valuable insights. As we scaled our Data Platform to multiple clusters, we also evaluated various cloud vendors to support use cases outside of our data centers. In this talk we share our architecture and how we extend our data platform to use the cloud as another data center. We walk through our evaluation process and the challenges we faced supporting data analytics at Twitter scale in the cloud, and present our current solution. Extending Twitter's data platform to the cloud was a complex task, which we deep dive into in this presentation.
Google Cloud Dataflow: Two Worlds Become a Much Better One (DataWorks Summit)
Google Cloud Dataflow is a fully managed service that allows users to build batch or streaming parallel data processing pipelines. It provides a unified programming model and SDKs in Java and Python to process data across Google Cloud Platform services like Pub/Sub, BigQuery, and Cloud Storage. The Cloud Dataflow service automatically optimizes and runs data pipelines at scale in a reliable, cost-effective manner without requiring operational management by the user.
Serverless Data Architecture at scale on Google Cloud Platform - Lorenzo Ridi (Codemotion)
This document discusses processing tweets about Black Friday using serverless data architecture on Google Cloud Platform. It describes:
1) Using Google Cloud Pub/Sub to ingest tweets in real-time and guarantee delivery at scale.
2) Running a Python application that filters tweets and publishes them to a Pub/Sub topic using containers and Kubernetes for scalability.
3) Building a Cloud Dataflow pipeline that reads from Pub/Sub, formats tweets, analyzes sentiment with Natural Language API, and writes results to BigQuery for querying and visualization.
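The three steps above can be mocked end-to-end in plain Python as a conceptual sketch — the keyword filter and the toy sentiment scorer are assumptions standing in for the real Pub/Sub filter and the Natural Language API:

```python
tweets = [
    {"text": "Black Friday deals are amazing", "user": "a"},
    {"text": "I love my cat", "user": "b"},
    {"text": "black friday queues are terrible", "user": "c"},
]

def filter_tweets(tweets, keyword="black friday"):
    # Steps 1-2: keep only tweets mentioning the keyword before publishing.
    return [t for t in tweets if keyword in t["text"].lower()]

def toy_sentiment(text):
    # Stand-in for the Natural Language API: +1 / -1 on tiny word lists.
    positive, negative = {"amazing", "love"}, {"terrible"}
    words = set(text.lower().split())
    return (1 if words & positive else 0) - (1 if words & negative else 0)

def to_bigquery_rows(tweets):
    # Step 3: shape each tweet into a flat row for a BigQuery table.
    return [{"user": t["user"], "text": t["text"],
             "sentiment": toy_sentiment(t["text"])}
            for t in tweets]

rows = to_bigquery_rows(filter_tweets(tweets))
print(rows)  # two rows, sentiments 1 and -1
```

In the real pipeline each of these functions would be a `DoFn` in a Dataflow job, with `ReadFromPubSub` upstream and `WriteToBigQuery` downstream.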
OSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels (NETWAYS)
As service providers, one of our responsibilities is helping clients understand which causes contributed to a production downtime incident, and how to prevent them (as much as possible) from happening again. We do this with Incident Reports, and one common recommendation we make is to have a historical monitoring system in place. All our clients have point-in-time monitoring solutions in place, solutions that can alert them when a system is down or behaving in unacceptable ways. But historical monitoring is still not common, and we believe a lot of companies can benefit from deploying one of these systems. In most cases, we have recommended Percona Monitoring and Management (PMM) as a good, open source solution for this problem. In this session, we will talk about the reasons why we recommend PMM as a way to prevent incidents, and also to investigate their possible causes when one has happened.
The document discusses using interactive event graphs and Spark to scale security investigations. It describes how Graphistry uses event graphs visualized through GPUs to provide scalable views of relationships and patterns across billions of events. An example is given of using this approach for incident response by constructing an event graph to analyze the spread of a botnet outbreak.
OSMC 2018 | Logging is coming to Grafana by David Kaltschmidt (NETWAYS)
Grafana is an OSS dashboarding platform with a focus on visualising time-series data as beautiful graphs. Now we’re adding support to show your logs inside Grafana as well. Adding support for log aggregation makes Grafana an even better tool for incident response: first, the metric graphs help in visually zoning in on the issue. Then you can seamlessly switch over to view and search related log files, allowing you to better understand what your software was doing while the issue was occurring. The main part of this talk shows how to deploy the necessary parts for this integrated experience. In addition I’ll show the latest features of Grafana, both for creating dashboards and maintaining their configuration. The last 10-15 minutes will be reserved for a Q&A.
There is growing interest in running Apache Spark natively on Kubernetes. Ilan Filonenko explains the design idioms, architecture, and internal mechanics of Spark orchestrations over Kubernetes. Since data for Spark analytics is often stored in HDFS, Ilan will also explain how to make Spark on Kubernetes work seamlessly with HDFS by addressing challenges such as data locality and security through the use of Kubernetes constructs such as secrets and RBAC rules.
GCP Gaming 2016 Seoul, Korea: Gaming Analytics (Chris Jang)
The document discusses creating a gaming analytics platform using Google Cloud Platform. It describes collecting diverse data from sources like user acquisition campaigns, app stores, and custom game events. This data can then be analyzed using standard metrics, key game indicators, and custom questions. BigQuery is recommended for batch processing while Dataflow (Apache Beam) enables real-time streaming analytics. Dataflow provides autoscaling, fully managed processing, and allows batch and streaming in one framework. This speeds up development time compared to typical big data architectures.
GCP Big Data Summit LA - 10-20-2015 (Raj Babu)
The Big Data Summit agenda included presentations on Google Cloud Platform (GCP) products and services for big data. Rohit Khare from Google was scheduled to give a presentation on GCP for big data from 2:30-3:30pm, followed by customer story presentations from BlueCava and Pixalate. There would also be a panel discussion and partner presentation, followed by a reception from 5-6pm. Logistics details were provided for parking, badges, facilities, and wireless access.
This document discusses using Kubernetes as a data platform. It describes using use case driven development to build the initial platform, focusing on simple use cases that provide value. It also covers onboarding new data sources, an overview of the data platform architecture including data lakes and batch/online services, deployment approaches both on-premise and cloud native, and addressing challenges like GDPR compliance and autoscaling. Lessons learned include selecting cloud infrastructure based on data locations and using Kubernetes for its support and to avoid maintaining separate clusters.
DevOpsDays Riga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data platform (DevOpsDays Riga)
This document discusses using Kubernetes as a data platform. It describes using use case driven development to build the initial platform, focusing on simple use cases that provide value. The platform is designed to facilitate collaboration and democratize data access. Pipelines are used to process and transform data in the data lake. The platform supports features like continuous deployment, autoscaling, and compliance with GDPR for data retention and deletion. Lessons learned include selecting cloud infrastructure based on data locations and using Kubernetes for its support and to avoid managing separate clusters.
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataStax)
During this session Ben Lackey (DataStax) and Ravi Madasu (Google) will cover best practices for quickly setting up a cluster on Google Cloud Platform (GCP) using both Google Compute Engine (GCE) and Google Container Engine (GKE) which is based on Kubernetes and Docker.
About the Speakers
Ben Lackey Partner Architect, DataStax
I work in the Cloud Strategy group at DataStax where I concentrate on improving the integration between DataStax Enterprise and cloud platforms including Azure, GCP and Pivotal.
Ravi Madasu
Ravi Madasu is a program manager at Google, primarily focused on Google Cloud Launcher. He works closely with ISV partners to make their products and services available on the Google Cloud Platform, providing a developer-friendly deployment experience. He has 15+ years of experience, working in a variety of roles such as software engineer, project manager, and product manager. Ravi received a Master's degree in Information Systems from Northeastern University and an MBA from Carnegie Mellon University.
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery (Márton Kodok)
Teaser: provide developers a new way of understanding advanced analytics and choosing the right cloud architecture
The new buzzword is #serverless, as there are many great services that help us abstract away the complexity associated with managing servers. In this session we will see how serverless helps with large data analytics backends.
We will see how to architect for the cloud and add to an existing project the components that take us into a #serverless architecture — one that ingests our streaming data and runs advanced analytics on petabytes of data using BigQuery on Google Cloud Platform, all this next to an existing stack, without being forced to reengineer our app.
BigQuery enables super-fast SQL/JavaScript queries against petabytes of data using the processing power of Google’s infrastructure. We will cover its core features, the SQL:2011 standard, working with streaming inserts, User Defined Functions written in JavaScript, referencing external JS libraries, and several use cases for the everyday backend developer: funnel analytics, email heatmaps, custom data processing, building dashboards, extracting data using JS functions, and emitting rows based on business logic.
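A BigQuery JavaScript UDF looks roughly like the sketch below, here just assembled as a standard-SQL query string in Python (the table name and `urlDomain` function are invented for illustration; running it would require a configured `google-cloud-bigquery` client):

```python
# A temporary JavaScript UDF plus the query that uses it.
udf_query = """
CREATE TEMP FUNCTION urlDomain(url STRING)
RETURNS STRING
LANGUAGE js AS '''
  return url.split("/")[2] || "";
''';

SELECT urlDomain(page_url) AS domain, COUNT(*) AS hits
FROM `my_project.analytics.events`  -- hypothetical table
GROUP BY domain
ORDER BY hits DESC;
"""

# With credentials in place this would run as, roughly:
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   for row in client.query(udf_query).result():
#       print(row.domain, row.hits)
print(udf_query.strip().splitlines()[0])
```

The `CREATE TEMP FUNCTION ... LANGUAGE js` form scopes the UDF to a single query, which is the usual pattern for the kind of custom per-row processing the abstract mentions.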
This document provides an overview of various Google Cloud Platform services including Compute Engine, Networking, Load Balancing, Cloud Launcher, Cloud Storage, Cloud SQL, Cloud Monitoring, Cloud DNS, and Deployment Manager. It includes descriptions of the basic concepts and functionality for each service. It also outlines several hands-on labs demonstrating how to use specific GCP services like backing up instances to Cloud Storage snapshots, exporting Cloud SQL databases to Cloud Storage, enabling Cloud Logging, and deploying a VM instance using Deployment Manager.
MongoDB World 2016: Lunch & Learn: Google Cloud for the Enterprise (MongoDB)
The document summarizes the evolution of cloud computing and Google Cloud Platform's offerings. It discusses how cloud infrastructure has moved from colocated data centers (1st wave) to virtualized infrastructure (2nd wave) to automated services and scalable data (3rd wave). It then provides an overview of Google Cloud Platform's compute, storage, database, analytics and machine learning services and how they make complex data analysis simpler. The document positions Google Cloud Platform as building on Google's expertise in infrastructure and data to provide customers an advantage.
You may know Google for search, YouTube, Android, Chrome, and Gmail, but that's only as an end-user of OUR apps. Did you know you can also integrate Google technologies into YOUR apps? We have many APIs and open source libraries that help you do that! If you have tried and found it challenging, didn't find enough examples, ran into roadblocks, got confused, or are just curious about what Google APIs can offer, join us to resolve any blockers. Code samples will be in Python and/or Node.js/JavaScript. This session focuses on showing you how to access Google Cloud APIs from one of Google Cloud's compute platforms, whether serverless or otherwise.
Gimel at Dataworks Summit San Jose 2018 (Romit Mehta)
Gimel is PayPal's data platform that provides a unified interface for accessing and analyzing data across different data stores and processing engines. The presentation provides an overview of Gimel, including PayPal's analytics ecosystem, the challenges Gimel addresses around data access and application lifecycle, and a demo of how Gimel simplifies a flights cancelled use case. It also discusses Gimel's open source journey and integration with ecosystems like Spark and Jupyter notebooks.
Gimel Data Platform is an analytics platform developed by PayPal that aims to simplify data access and analysis. The presentation provides an overview of Gimel, including PayPal's analytics ecosystem, the challenges Gimel addresses in data access and application lifecycle management, a demo of a sample flights cancelled use case using Gimel, and PayPal's plans to open source Gimel.
Pivotal Greenplum provides fast, secure cloud deployments of its data warehouse platform with the same experience across AWS, Azure, and GCP. Deployments are optimized for speed through performance tuning of virtual machines, disks, and networks. Key goals include leveraging cloud features like on-demand provisioning, node replacement, disk snapshots, upgrades, and optional installations through a web interface. Deployments are similar across clouds with comparable parameters, tools, and software versions. Security is ensured through vendor-reviewed templates, password encryption, and network isolation.
"Critical Breakthroughs and Technical Challenges in Big Data Driven Innovation" discusses four key breakthroughs in Google Cloud Platform's approach to big data:
1. Batch and streaming data processing can be combined using Cloud Dataflow.
2. Real-time data ingestion at massive scales is enabled through technologies like Cloud Bigtable which can process billions of events per hour.
3. Analytics can be done at the speed of thought through BigQuery which allows complex queries on petabytes of data to return results in seconds.
4. Machine learning is made available to everyone through services that offer pre-trained models via APIs and allow custom modeling using TensorFlow on Google Cloud.
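The "one framework for batch and streaming" idea in point 1 can be illustrated in plain Python: the same transform runs unchanged over a bounded list and a generator standing in for an unbounded stream (a conceptual sketch, not the Dataflow API):

```python
def count_by_key(records):
    """One transform, reused for both batch and streaming input."""
    counts = {}
    for key in records:
        counts[key] = counts.get(key, 0) + 1
    return counts

# Batch source: a bounded collection (e.g. files in Cloud Storage).
batch = ["login", "click", "login"]

# Streaming source: a generator standing in for a Pub/Sub subscription.
def stream():
    for event in ["click", "click", "buy"]:
        yield event

batch_counts = count_by_key(batch)      # {'login': 2, 'click': 1}
stream_counts = count_by_key(stream())  # {'click': 2, 'buy': 1}
```

In Cloud Dataflow the same pipeline code similarly targets bounded and unbounded `PCollection`s, with windowing handling the fact that a stream never "finishes".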
A 30-45-minute tech talk given at user groups or technical conferences introducing developers to integrating with Google APIs from Python.
ABSTRACT
Want to integrate Google technologies into the web+mobile apps that you build? Google has various open source libraries & developer tools that help you do exactly that. Users who have run into roadblocks like authentication or found our APIs confusing/challenging, are welcome to come and make these non-issues moving forward. Learn how to leverage the power of Google technologies in the next apps you build!!
[Study Guide] Google Professional Cloud Architect (GCP-PCA) Certification (Amaaira Johns)
Start here: https://bit.ly/3bGEd9l - Get complete detail on the GCP-PCA exam guide to crack the Professional Cloud Architect certification. You can collect all information on the GCP-PCA tutorial, practice tests, books, study material, exam questions, and syllabus. Firm up your knowledge of the Professional Cloud Architect role and get ready to crack the GCP-PCA certification. Explore all information on the GCP-PCA exam, including the number of questions, passing percentage, and time allotted to complete the test.
Image archive, analysis & report generation with Google Cloud (Wesley Chun)
Google Cloud provides a diverse array of services to realize the ambition of solving real business problems, like constrained resources. An image archive & analysis plus report generation use-case can be realized with just Google Workspace & GCP APIs. The principle of mixing-and-matching Google technologies is applicable to many other challenges faced by you, your organization, or your customers. These slides are from a half- to 1-hour presentation about this case study.
Session 8 - Creating Data Processing Services | Train the Trainers Program (FIWARE)
This technical session for Local Experts in Data Sharing (LEBDs) will explain how to create data processing services that are key to i4Trust.
This document discusses Google Cloud Platform's Internet of Things (IoT) architecture and services. It describes how IoT data can be captured using protocols and streaming into Google Cloud Pub/Sub. Machine learning algorithms can then detect patterns in real-time streams. Data is also archived in Cloud Storage. Google Cloud Dataflow is highlighted for processing both batch and stream workloads, with features like autoscaling, intuitive programming model, and unified processing of data.
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop (huguk)
At Google Cloud Platform, we're combining the Apache Spark and Hadoop ecosystem with our software and hardware innovations. We want to make these awesome tools easier, faster, and more cost-effective, from 3 to 30,000 cores. This presentation will showcase how Google Cloud Platform is innovating with the goal of bringing the Hadoop ecosystem to everyone.
Bio: "I love data because it surrounds us - everything is data. I also love open source software, because it shows what is possible when people come together to solve common problems with technology. While they are awesome on their own, I am passionate about combining the power of open source software with the potentially unlimited uses of data. That's why I joined Google. I am a product manager for Google Cloud Platform and manage Cloud Dataproc and Apache Beam (incubating). I've previously spent time hanging out at Disney and Amazon. Beyond Google, I love amateur radio, Disneyland, photography, running, and Legos."
This document describes a serverless data architecture for processing tweets about Black Friday and performing sentiment analysis using Google Cloud Platform services. It involves collecting tweets from Twitter using Pub/Sub, running containers on Kubernetes, processing the data with Dataflow pipelines that write to BigQuery tables, and using the Natural Language API for sentiment analysis. The full pipeline is demonstrated in a live demo.
GCP Gaming 2016 Seoul, Korea Gaming AnalyticsChris Jang
The document discusses creating a gaming analytics platform using Google Cloud Platform. It describes collecting diverse data from sources like user acquisition campaigns, app stores, and custom game events. This data can then be analyzed using standard metrics, key game indicators, and custom questions. BigQuery is recommended for batch processing while Dataflow (Apache Beam) enables real-time streaming analytics. Dataflow provides autoscaling, fully managed processing, and allows batch and streaming in one framework. This speeds up development time compared to typical big data architectures.
Google cloud big data summit master gcp big data summit la - 10-20-2015Raj Babu
The Big Data Summit agenda included presentations on Google Cloud Platform (GCP) products and services for big data. Rohit Khare from Google was scheduled to give a presentation on GCP for big data from 2:30-3:30pm, followed by customer story presentations from BlueCava and Pixalate. There would also be a panel discussion and partner presentation, followed by a reception from 5-6pm. Logistics details were provided for parking, badges, facilities, and wireless access.
This document discusses using Kubernetes as a data platform. It describes using use case driven development to build the initial platform, focusing on simple use cases that provide value. It also covers onboarding new data sources, an overview of the data platform architecture including data lakes and batch/online services, deployment approaches both on-premise and cloud native, and addressing challenges like GDPR compliance and autoscaling. Lessons learned include selecting cloud infrastructure based on data locations and using Kubernetes for its support and to avoid maintaining separate clusters.
DevOpsDaysRiga 2018: Eric Skoglund, Lars Albertsson - Kubernetes as data plat...DevOpsDays Riga
This document discusses using Kubernetes as a data platform. It describes using use case driven development to build the initial platform, focusing on simple use cases that provide value. The platform is designed to facilitate collaboration and democratize data access. Pipelines are used to process and transform data in the data lake. The platform supports features like continuous deployment, autoscaling, and compliance with GDPR for data retention and deletion. Lessons learned include selecting cloud infrastructure based on data locations and using Kubernetes for its support and to avoid managing separate clusters.
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...DataStax
During this session Ben Lackey (DataStax) and Ravi Madasu (Google) will cover best practices for quickly setting up a cluster on Google Cloud Platform (GCP) using both Google Compute Engine (GCE) and Google Container Engine (GKE) which is based on Kubernetes and Docker.
About the Speakers
Ben Lackey Partner Architect, DataStax
I work in the Cloud Strategy group at DataStax where I concentrate on improving the integration between DataStax Enterprise and cloud platforms including Azure, GCP and Pivotal.
Ravi Madasu
Ravi Madasu is a program manager at Google, primarily focused on Google Cloud Launcher. He works closely with ISV partners to make their products and services available on the Google Cloud Platform providing a developer friendly deployment experience. He has 15+ years of experience, working in variety of roles such as software engineer, project manager and product manager. Ravi received a Masters degree in Information Systems from Northeastern University and an MBA from Carnegie Mellon University.
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
Teaser: provide developers a new way of understanding advanced analytics and choosing the right cloud architecture
The new buzzword is #serverless, as there are many great services that helps us abstract away the complexity associated with managing servers. In this session we will see how serverless helps on large data analytics backends.
We will see how to architect for Cloud and implement into an existing project components that will take us into the #serverless architecture that will ingest our streaming data, run advanced analytics on petabytes of data using BigQuery on Google Cloud Platform - all this next to an existing stack, without being forced to reengineer our app.
BigQuery enables super-fast, SQL/Javascript queries against petabytes of data using the processing power of Google’s infrastructure. We will cover its core features, SQL 2011 standard, working with streaming inserts, User Defined Functions written in Javascript, reference external JS libraries, and several use cases for everyday backend developer: funnel analytics, email heatmap, custom data processing, building dashboards, extracting data using JS functions, emitting rows based on business logic.
This document provides an overview of various Google Cloud Platform services including Compute Engine, Networking, Load Balancing, Cloud Launcher, Cloud Storage, Cloud SQL, Cloud Monitoring, Cloud DNS, and Deployment Manager. It includes descriptions of the basic concepts and functionality for each service. It also outlines several hands-on labs demonstrating how to use specific GCP services like backing up instances to Cloud Storage snapshots, exporting Cloud SQL databases to Cloud Storage, enabling Cloud Logging, and deploying a VM instance using Deployment Manager.
MongoDB World 2016: Lunch & Learn: Google Cloud for the EnterpriseMongoDB
The document summarizes the evolution of cloud computing and Google Cloud Platform's offerings. It discusses how cloud infrastructure has moved from colocated data centers (1st wave) to virtualized infrastructure (2nd wave) to automated services and scalable data (3rd wave). It then provides an overview of Google Cloud Platform's compute, storage, database, analytics and machine learning services and how they make complex data analysis simpler. The document positions Google Cloud Platform as building on Google's expertise in infrastructure and data to provide customers an advantage.
You may know Google for search, YouTube, Android, Chrome, and Gmail, but that's only as an end-user of OUR apps. Did you know you can also integrate Google technologies into YOUR apps? We have many APIs and open source libraries that help you do that! If you have tried and found it challenging, didn't find not enough examples, run into roadblocks, got confused, or just curious about what Google APIs can offer, join us to resolve any blockers. Code samples will be in Python and/or Node.js/JavaScript. This session focuses on showing you how to access Google Cloud APIs from one of Google Cloud's compute platforms, whether serverless or otherwise.
Gimel at Dataworks Summit San Jose 2018Romit Mehta
Gimel is PayPal's data platform that provides a unified interface for accessing and analyzing data across different data stores and processing engines. The presentation provides an overview of Gimel, including PayPal's analytics ecosystem, the challenges Gimel addresses around data access and application lifecycle, and a demo of how Gimel simplifies a flights cancelled use case. It also discusses Gimel's open source journey and integration with ecosystems like Spark and Jupyter notebooks.
Gimel Data Platform is an analytics platform developed by PayPal that aims to simplify data access and analysis. The presentation provides an overview of Gimel, including PayPal's analytics ecosystem, the challenges Gimel addresses in data access and application lifecycle management, a demo of a sample flights cancelled use case using Gimel, and PayPal's plans to open source Gimel.
Pivotal Greenplum provides fast, secure cloud deployments of its data warehouse platform with the same experience across AWS, Azure, and GCP. Deployments are optimized for speed through performance tuning of virtual machines, disks, and networks. Key goals include leveraging cloud features like on-demand provisioning, node replacement, disk snapshots, upgrades, and optional installations through a web interface. Deployments are similar across clouds with comparable parameters, tools, and software versions. Security is ensured through vendor-reviewed templates, password encryption, and network isolation.
"Critical Breakthroughs and Technical Challenges in Big Data Driven Innovation" discusses four key breakthroughs in Google Cloud Platform's approach to big data:
1. Batch and streaming data processing can be combined using Cloud Dataflow.
2. Real-time data ingestion at massive scale is enabled through technologies like Cloud Bigtable, which can process billions of events per hour.
3. Analytics can be done at the speed of thought through BigQuery, which allows complex queries on petabytes of data to return results in seconds.
4. Machine learning is made available to everyone through services that offer pre-trained models via APIs and allow custom modeling using TensorFlow on Google Cloud.
A 30-45 minute tech talk given at user groups and technical conferences, introducing developers to integrating with Google APIs from Python.
ABSTRACT
Want to integrate Google technologies into the web and mobile apps that you build? Google has various open source libraries and developer tools that help you do exactly that. Users who have run into roadblocks like authentication, or found our APIs confusing or challenging, are welcome to come and turn these into non-issues moving forward. Learn how to leverage the power of Google technologies in the next apps you build!
[Study Guide] Google Professional Cloud Architect (GCP-PCA) Certification (Amaaira Johns)
Start here: https://bit.ly/3bGEd9l. Get complete details on the GCP-PCA exam guide to crack the Professional Cloud Architect certification. You can collect all information on the GCP-PCA tutorial, practice tests, books, study material, exam questions, and syllabus. Firm up your knowledge of Professional Cloud Architect and get ready to crack the GCP-PCA certification. Explore all information on the GCP-PCA exam, including the number of questions, passing percentage, and time allowed to complete the test.
Image archive, analysis & report generation with Google Cloud (wesley chun)
Google Cloud provides a diverse array of services for solving real business problems, such as working within constrained resources. An image archive and analysis plus report generation use case can be realized with just Google Workspace and GCP APIs. The principle of mixing and matching Google technologies is applicable to many other challenges faced by you, your organization, or your customers. These slides are from a half- to one-hour presentation about this case study.
Session 8 - Creating Data Processing Services | Train the Trainers Program (FIWARE)
This technical session for Local Experts in Data Sharing (LEBDs) explains how to create data processing services that are key to i4Trust.
This document discusses Google Cloud Platform's Internet of Things (IoT) architecture and services. It describes how IoT data can be captured using standard protocols and streamed into Google Cloud Pub/Sub. Machine learning algorithms can then detect patterns in real-time streams. Data is also archived in Cloud Storage. Google Cloud Dataflow is highlighted for processing both batch and stream workloads, with features like autoscaling, an intuitive programming model, and unified processing of data.
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop (huguk)
At Google Cloud Platform, we're combining the Apache Spark and Hadoop ecosystem with our software and hardware innovations. We want to make these awesome tools easier, faster, and more cost-effective, from 3 to 30,000 cores. This presentation will showcase how Google Cloud Platform is innovating with the goal of bringing the Hadoop ecosystem to everyone.
Bio: "I love data because it surrounds us - everything is data. I also love open source software, because it shows what is possible when people come together to solve common problems with technology. While they are awesome on their own, I am passionate about combining the power of open source software with the potentially unlimited uses of data. That's why I joined Google. I am a product manager for Google Cloud Platform and manage Cloud Dataproc and Apache Beam (incubating). I've previously spent time hanging out at Disney and Amazon. Beyond Google, I love data, amateur radio, Disneyland, photography, running, and Legos."
This document describes a serverless data architecture for processing tweets about Black Friday and performing sentiment analysis using Google Cloud Platform services. It involves collecting tweets from Twitter using Pub/Sub, running containers on Kubernetes, processing the data with Dataflow pipelines that write to BigQuery tables, and using the Natural Language API for sentiment analysis. The full pipeline is demonstrated in a live demo.
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"sameer shah
Embark on a captivating financial journey with "Financial Odyssey," our hackathon project. Delve deep into the past performance of two companies as we employ an array of financial statement analysis techniques. From ratio analysis to trend analysis, uncover insights crucial for informed decision-making in the dynamic world of finance.
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long does it take for all downstream pipelines to be adapted to an upstream change?", the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data (Kiwi Creative)
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake (Walaa Eldin Moustafa)
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
5. Confidential & Proprietary | Google Cloud Platform

Diverse Data Sources
● Data from user acquisition campaigns
● Data from Google Play and the App Store
● Turnkey gaming metrics (e.g. player churn and spend predictions from Play Games Services)
7. Diverse Data Sources
● Custom game events
● Custom logs
● Custom player telemetry specific to your games
8. Continuum of Gaming Analytics

Turnkey (standard metrics):
● DAU, MAU, ARPPU
● Player Progression
● Feature Engagement
● Spend
● Retention / Churn
● Daily revenue targets
● Fraud and cheating

Custom (key indicators specific to your game):
● Activity in communities, joining guilds, number of friends in-game
● Reached a meaningful milestone or achievement
● Time to first meaningful transaction
● Player response to specific A/B tests
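Turnkey metrics like DAU and day-1 retention reduce to simple aggregations over login events. A minimal sketch in plain Python, assuming an illustrative `(player_id, login_date)` event shape (not from the deck; in production this would be a query over exported event tables):

```python
from datetime import date

# Illustrative login events: (player_id, login_date)
events = [
    ("p1", date(2016, 3, 1)), ("p2", date(2016, 3, 1)),
    ("p1", date(2016, 3, 2)), ("p3", date(2016, 3, 2)),
]

def dau(events, day):
    """Daily active users: distinct players seen on `day`."""
    return len({pid for pid, d in events if d == day})

def d1_retention(events, day):
    """Share of players active on `day` who return the next day."""
    cohort = {pid for pid, d in events if d == day}
    next_day = {pid for pid, d in events if (d - day).days == 1}
    return len(cohort & next_day) / len(cohort) if cohort else 0.0
```

The same shape extends to MAU (distinct players in a 30-day range) and D7/D30 retention by changing the day offset.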
9. Ask Custom Questions
● How many players made it to stage 12?
● What path did they take through the stage?
● Health and other key stats at this point in time?
● Of the players who took the same route where a certain condition was true, how many made an in-app purchase?
● What are the characteristics of the player segment who didn't make the purchase vs. those who did?
● Why was this custom event so successful in driving in-app purchases compared to others?
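Questions like "of the players who reached stage 12, how many made an in-app purchase?" are funnel queries over custom events. A hedged sketch in plain Python over an in-memory event log; the event shape and field names are assumptions for illustration, and at scale this would run as a BigQuery query instead:

```python
# Illustrative custom events: (player_id, event_type, payload)
events = [
    ("p1", "stage_complete", {"stage": 12}),
    ("p2", "stage_complete", {"stage": 11}),
    ("p3", "stage_complete", {"stage": 12}),
    ("p1", "iap_purchase", {"sku": "gems_100"}),
]

def players_with(events, event_type, pred=lambda payload: True):
    """Distinct players who emitted `event_type` matching `pred`."""
    return {pid for pid, et, payload in events
            if et == event_type and pred(payload)}

reached_12 = players_with(events, "stage_complete", lambda p: p["stage"] >= 12)
purchased = players_with(events, "iap_purchase")

converted = reached_12 & purchased   # reached stage 12 AND purchased
holdouts = reached_12 - purchased    # the segment to compare against
```

Comparing attributes of `converted` against `holdouts` is the starting point for the segment-characteristics question above.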
10. 3 Things to Remember
1. Speed up Batch Processing
2. Speed up from Batch to Real-Time
3. Speed up Development Time
16. Some of DeNA's Hadoop+Hive woes:
● Many bottlenecks and failure points
● 3-hour data ingestion lag
● Too many analysts at peak time
● Slow queries
● ...
46. Leaderboard Example

Reads game data published in near real-time and uses that data to perform two separate processing tasks:
● Calculates the total score for every unique user and publishes speculative results for every ten minutes of processing time.
● Calculates the team scores for each hour that the pipeline runs, using fixed-time windowing.
● In addition, the team score calculation uses Dataflow's trigger mechanisms to provide speculative results for each hour (updated every five minutes until the hour is up), and to capture any late data and add it to the specific hour-long window to which it belongs.
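The hourly team-score logic above (fixed windows, with late data routed back into the window it belongs to) can be sketched without the Dataflow SDK. This is only the windowing arithmetic, with an assumed `(team, score, event_time)` event shape; triggers and speculative firings are left out:

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # fixed one-hour windows

def window_start(event_time):
    """Assign an event timestamp (epoch seconds) to its hour-long window."""
    return event_time - (event_time % WINDOW_SECONDS)

def team_scores(events):
    """Sum scores per (team, window). A late event lands in its own
    event-time window, not whichever window is open when it arrives."""
    totals = defaultdict(int)
    for team, score, event_time in events:
        totals[(team, window_start(event_time))] += score
    return dict(totals)

events = [
    ("red", 5, 3600), ("blue", 3, 3700),
    ("red", 2, 7300),   # falls in the next hour's window
    ("red", 4, 3900),   # "late" arrival, still belongs to hour 3600-7199
]
```

In the real pipeline, Dataflow's fixed windows and triggers do this assignment automatically over the unbounded Pub/Sub stream.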
48. Sample Code on GitHub

Cloud Dataflow and Spark examples: http://goo.gl/vz1Cj5
● UserScore: Basic Score Processing in Batch
● HourlyTeamScore: Advanced Processing in Batch with Windowing
● LeaderBoard: Streaming Processing with Real-Time Game Data
● GameStats: Abuse Detection and Usage Analysis
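The core step of the UserScore example (sum events per user in batch) is simple enough to sketch in plain Python. The `user,team,score,timestamp` line format is an assumption here, loosely modeled on the samples' CSV-style records:

```python
def parse_event(line):
    """Parse a 'user,team,score,timestamp' record into (user, score).
    Returns None for malformed lines instead of failing the batch."""
    parts = line.strip().split(",")
    if len(parts) != 4:
        return None
    try:
        return parts[0], int(parts[2])
    except ValueError:
        return None

def user_scores(lines):
    """Total score per user, skipping unparseable records."""
    totals = {}
    for line in lines:
        parsed = parse_event(line)
        if parsed is None:
            continue
        user, score = parsed
        totals[user] = totals.get(user, 0) + score
    return totals
```

In the published samples the same parse-then-sum-per-key shape is expressed as pipeline transforms so it can run in parallel over large inputs.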
49. US Mobile Game Company goes Real-time Streaming

Streaming pipeline:
1. Real-time events from iOS clients
2. Cloud Pub/Sub: asynchronous messaging
3. Cloud Dataflow: parallel data processing
4. BigQuery: analytics engine
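The first hop of that pipeline, a client publishing real-time events, boils down to serializing an event and handing it to Pub/Sub. A sketch of just the serialization step using only the standard library; the field names are illustrative, and the actual publish call would use the Cloud Pub/Sub client library:

```python
import json
import time

def make_event_message(player_id, event_type, attributes=None):
    """Build the JSON payload a game client might publish to a
    Pub/Sub topic; downstream Dataflow parses the same shape."""
    return json.dumps({
        "player_id": player_id,
        "event_type": event_type,
        "event_time": int(time.time()),
        "attributes": attributes or {},
    }, sort_keys=True)

msg = make_event_message("p1", "level_up", {"level": 3})
decoded = json.loads(msg)
```

Keeping the event-time inside the payload is what lets the streaming pipeline window by when events happened rather than when they arrived.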
51. Building what's next

Time to Understanding
Typical big data processing involves:
● Programming
● Resource provisioning
● Performance tuning
● Monitoring
● Reliability
● Deployment & configuration
● Handling growing scale
● Utilization improvements
52. Time to Understanding

Big Data with Google: focus on insight, not infrastructure.
● Programming
54. TabTale on Google Cloud Platform
● Speed: provisions new services in seconds instead of days
● 10B logs: Google App Engine syncs with BigQuery to automatically store tens of billions of application logs, so TabTale can analyze issues on a moment's notice
● TBs of info: runs queries on terabytes of information in a few seconds
● 10x faster: can now deliver new backend features 10 times faster without dealing with infrastructure maintenance

"Our ability to provision new services in seconds saves us a lot of time, since it used to take days. The gaming industry is characterized by short-term projects, so it's important for us to have a backend that is flexible and works fast."
59. Machine Learning
● TensorFlow: open source manifestation of Google's ML capability
● Vision API: label/entity detection, facial detection, OCR, logo detection, Safe Search
● Cloud Dataproc: managed Hadoop, Hive, and Spark; about 90 seconds to start a cluster
60. Like you, Google is committed to gaming. Use Google's latest technologies to build, distribute, and monetize your games.