Dataiku, 5/22/2013
Collocation
Big Apple
Big Mama
Big Data
Games Analytics
Current Life:
CEO, Dataiku
Tweet about this
@dataiku
@capital_games
Past Life:
Criteo
IsCool Entertainment
Exalead
Hello, My Name is
Florian Douetteau
Available on:
http://www.slideshare.net/Dataiku
The Stakes - Summary
Million Events
Billion $
Billion Events
Million $
Classic Business
Social Gaming
Meet Hal Alowne
Big Guys
• 100M$+ Revenue
• 10M+ games
• 10+ Data Scientists
Hal Alowne
BI Manager
Dim’s Private Showroom
“Hey Hal! We need
a big data platform
like the big guys.
Let’s just do as they do!”
European Online Game Leader
• 10M$ Revenue
• 1 Million monthly games
• 1 Data Analyst (Hal Himself)
Wave Pox
CEO & Founder
W’ave G’ames
Big Data
Copy Cat
Project
MERIT = TIME + ROI
Targeted
Newsletter
For Newcomers
Facebook
Campaign
Optimization
Adapted Product
/ Promotions
TIME : 6 MONTHS ROI : APPS
 Build a lab in 6 months
(rather than 18 months)
Find the right
people
(6 months?)
Choose the
technology
(6 months?)
Make it work
(6 months?)
Build the lab
(6 months)
 Deploy apps
that actually deliver value
2013 2014
2013
• Train People
• Reuse working patterns
Our Goal
It’s utterly complex and
unreasonable
Our Goal:
Change his perspective
on data science projects
(sorry, we couldn’t
find a picture of Hal
Smiling)
 Do the Basics
 Understand Analytics
 What to expect out of analytics
Quick Agenda
 Do you track ?
◦ Customer Goals For
most important
features
◦ Time Spent
Level Progression
Money Spent
◦ Campaigns and
generated campaign
Value
Suggestion #1
Check The Basics
 Do A/B Tests
◦ Use Proven Solutions
◦ Start small (button size
and color)
◦ Check Impacts
◦ Treat new and existing
users differently
◦ Don’t give up after the
first A/B Test
Suggestion #2
DO A/B Tests (and not yourself)
 Register Now / Give
Email Graphics:
From 25% to 2X More
Clicks
http://bit.ly/VOruXt
 Changing button
from green to red:
Up to 21%
http://bit.ly/qFEBdK
Some Results
A/B Tests
Statistical Significance
http://visualwebsiteoptimizer.com/ab-split-significance-calculator/
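A minimal sketch of the kind of check such a significance calculator performs, assuming a pooled two-proportion z-test (the linked calculator may use a different method). The function name `ab_test_p_value` and its parameters are hypothetical, chosen for illustration only.

```python
import math


def ab_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test.

    conv_a / n_a: conversions and visitors for variant A (assumed inputs).
    conv_b / n_b: conversions and visitors for variant B (assumed inputs).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis "A and B perform the same".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, so nothing to distinguish
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    p = ab_test_p_value(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"p-value: {p:.4f}")
```

A small p-value (0.05 is a common threshold) suggests the difference between the two variants is unlikely to be noise. With only a handful of visitors per variant the test rarely reaches that threshold, which is one reason not to give up after the first A/B test.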
 Can be Built on top
of your production
systems
 Do you have
◦ Cohorts
◦ Daily $$ Reports
◦ Basic $$ Segments
Suggestion #3
Have the Basic BI
 Defined Customer Segments
◦ New Installs
◦ Engaged Users
◦ Engaged Paying Users
◦ …?
 Defined Customer Sources
◦ Social Ads / Social Posts / .. Top Charts
/ …
◦ Country Segments
 Do you have for each segment, every
day
◦ Rolling last 30 days ARPUU ?
◦ Rolling last 30 days DAU ?
 Do you follow every week
◦ The Segment Conversion Rate per
source ?
Sample Check list
(Gaming)
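A minimal sketch of one item on this checklist: a rolling 30-day revenue-per-user figure for each segment. It assumes events arrive as `(day, segment, user_id, revenue)` tuples and that the metric is revenue in the window divided by distinct active users in the window; both the event layout and the function name `rolling_arpu` are assumptions for illustration, not something the slides define.

```python
from collections import defaultdict
from datetime import date, timedelta


def rolling_arpu(events, as_of, window_days=30):
    """Revenue per active user, per segment, over the trailing window.

    events: iterable of (day, segment, user_id, revenue) tuples (assumed layout).
    as_of: last day of the window (inclusive).
    """
    start = as_of - timedelta(days=window_days - 1)
    revenue = defaultdict(float)
    users = defaultdict(set)
    for day, segment, user_id, amount in events:
        if start <= day <= as_of:
            revenue[segment] += amount
            users[segment].add(user_id)
    # Only segments with at least one active user in the window are reported.
    return {segment: revenue[segment] / len(users[segment]) for segment in users}


if __name__ == "__main__":
    sample = [
        (date(2013, 5, 20), "new_installs", "u1", 0.0),
        (date(2013, 5, 21), "new_installs", "u2", 3.0),
        (date(2013, 5, 1), "engaged_paying", "u3", 10.0),
        (date(2013, 5, 2), "engaged_paying", "u3", 5.0),
        (date(2013, 4, 1), "engaged_paying", "u4", 100.0),  # outside the window
    ]
    print(rolling_arpu(sample, as_of=date(2013, 5, 22)))
```

Running this once per day, per segment and per source, gives the daily series the checklist asks about. The same loop could count distinct users per day instead of summing revenue to produce an activity metric.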
Embodiment of Knowledge
 Product Success
driven by Quality
 Margin / Customer
Value / Traffic /
Acquisition
At the Beginning
 Margin for new
customers might
decline …
 Margin for new
features might
decline …
 Is your business
really scalable ?
But when you continue growing
 Existing Customers
 Existing Product Assets
 Existing Specific
Business Model
 And your KNOWLEDGE
of it
Where is your core business
advantage ?
Data Driven Business
What is your value ?
Number of
Customers
Customer Knowledge
Increase over time with:
- Time spent in your app
- User relationship (network effect)
- Partner / Other Apps Interactions
Your Value
To Apply It ?
Product Optimization
Customer Acquisition
Optimization
Recommender/
Targeting for
newsletters
 Dark Side
◦ Technology
 Bright Side
◦ Business
Apply It !!
The Dark Side
Technology is complex
Hadoop
Ceph
Sphere
Cassandra
Spark
Scikit-Learn
Mahout
WEKA
MLBase
RapidMiner
Pandas
D3
Crossfilter
InfiniDB
LucidDB
Impala
Elastic Search
SOLR
MongoDB
Riak
Membase
Pig
Hive
Cascading
Talend
Machine Learning
Mystery Land
Scalability Central
NoSQL-Slavia
SQL Columnar Republic
Visualization County
Data Clean Wasteland
Statistician Old
House
R
Machine learning is complex
 Find People that understand machine learning
and all this stuff
 Try to understand
myself
Plumbing is not complex
(but difficult)
Implicit User Data
(Views, Searches…)
Content Data
(Title, Categories, Price, …)
Explicit User Data
(Click, Buy, …)
User Information
(Location, Graph…)
500TB
50TB
1TB
200GB
Transformation
Matrix
Transformation
Predictor
Per User Stats
Per Content Stats
User Similarity
Rank Predictor
Content Similarity
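A minimal sketch of one box in this pipeline, "Content Similarity", computed from explicit user data. It assumes the input is a list of `(user_id, item_id)` purchase pairs and uses cosine similarity between the sets of buyers of each item; the slides do not name a similarity measure, so this choice and the name `item_similarity` are illustrative assumptions.

```python
import math
from collections import defaultdict


def item_similarity(purchases):
    """Cosine similarity between items, based on who bought them.

    purchases: iterable of (user_id, item_id) pairs (assumed layout).
    Returns {(item_a, item_b): similarity} for pairs sharing at least one buyer,
    with item_a < item_b in sort order.
    """
    buyers = defaultdict(set)
    for user_id, item_id in purchases:
        buyers[item_id].add(user_id)

    items = sorted(buyers)
    similarities = {}
    for index, item_a in enumerate(items):
        for item_b in items[index + 1:]:
            shared = len(buyers[item_a] & buyers[item_b])
            if shared:
                norm = math.sqrt(len(buyers[item_a]) * len(buyers[item_b]))
                similarities[(item_a, item_b)] = shared / norm
    return similarities


if __name__ == "__main__":
    sample = [
        ("u1", "sword"), ("u1", "shield"),
        ("u2", "sword"), ("u2", "shield"),
        ("u3", "potion"),
    ]
    print(item_similarity(sample))
```

The nested loop compares every pair of items, which is fine for a catalogue of a few thousand items held in memory. At the data volumes shown on the slide the same computation has to be distributed, which is where the plumbing becomes difficult.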
The Bright Side
 People  Microsoft Excel
How did you build your great
product ?
 Data Team  Data Tools
How will you continue growing your
great product(s) ?
The Business Guy
who knows maths
The Crazy Analyst
that reveals patterns
The Coding Guy That
is enthusiastic
 data lab, (n. m): a small group
with all the expertise, including
business minded people,
machine learning knowledge and
the right technology
 A proven organization used by
successful data-driven
companies over the past few
years (eBay, LinkedIn, Walmart…)
TEAM + TOOLS = LAB
                    Short Term Focus                     Long Term Drive
Business People     Optimize Margin, …                   Create new business revenue streams
Marketing People    Optimize click ratio                 Brand awareness and impact
IT People           Make IT work                         Clean and efficient Architecture
Data People         Get Stats Right, make predictions    Create Data Driven Features
It’s just a new team
Data
!
Product
Designer
Business
&
Marketing
Engineers
User
Voice
Data Innovation: fill the gap!
Targeted campaigns
Price optimization
A common ground to
federate your product teams
towards a common goal
Personalized
experience
Quality Assurance
Workload and yield
management
User Feedback (A/B Test)
Continuous improvement
 You can’t
« design »
insights, you
explore and
discover them…
 Iterate quickly
with constant
feedback
 Try a lot, don’t
be afraid to fail!
Free
but not as “free beer”
Function
Form
Experience
Emotion
Surprise
Culture
Explore
and Refine
Experiment
Generate
Ideas
Select &
Develop
Enhance
or
Discard
Gather
Feedback
Prepare for some Geeky Porn
Classic Columnar Architecture
Some data Some Place To
Pour It In
Some Tool To
Do Some Maths And Graphs
Classic Columnar Architecture
Lots of data Some Place To
Pour It In
Some Tool To
Do Some Maths And Graphs
Web Tracking Logs
Raw Server Logs
Order / Product / Customer
Facebook Info
Open Data (Weather, Currency …)
The Corinthian Architecture
Lots of data
Some Place
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
And Charts
Some Place To
Pour It In And
Clean / Prepare It
The Corinthian Architecture
Lots of data
Some Place
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
And Charts
Some Place To
Pour It In And
Clean / Prepare It
Statistics
Cohorts
Regressions
Bar Charts For Marketing
Nice Infographics for your Company Board
The Corinthian Architecture
Lots of data
Some Database
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
The “One Database Won’t
Do It All” Problem
Lots of data
Some Database
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
JOIN / Aggregate
Rapid Group By Computations
Direct Access to the computed Results
to production etc..
The Roman Social Forum
Lots of data
Some Database
To Perform
Rapid Calculations
And some database
for graphs
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
The Key Value Store
Lots of data
Some Database
To Perform
Rapid Calculations
And some database
for graphs And
Some Distributed Key
Value Store
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
Action requires Prediction
Lots of data
Some Database
To Perform
Rapid Calculations
And some database
for graphs And
Some Distributed Key
Value Store
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
Draw A Line  For the future
What are my real users groups ?
Should I launch a discount offering or not ?
To everybody or to specific users only ?
The Medieval Fairy Land
Lots of data
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts and some
MACHINE LEARNING
Some Place To
Pour It In And
Clean / Prepare It
Some Database
To Perform
Rapid Calculations
And some database
for graphs And
Some Distributed Key
Value Store
 Launch A Marketing
campaign
 After a few days
PREDICT based on
behaviours
◦ Total ARPU for users
after 3 months
◦ Efficiency of a campaign
◦ Continue or not ?
Example
Marketing Campaign Prediction
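A minimal sketch of the "predict after a few days, then continue or not" decision. It assumes past campaigns provide pairs of (ARPU after the first days, ARPU after 3 months), fits an ordinary least-squares line through them, and compares the prediction for the running campaign against its acquisition cost per user. The linear model, the cost threshold and the names `fit_line` and `should_continue` are all illustrative assumptions; the slides do not specify a model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def should_continue(early_arpu, slope, intercept, cost_per_user):
    """Predict 3-month ARPU from early ARPU and compare it with the cost."""
    predicted = slope * early_arpu + intercept
    return predicted, predicted > cost_per_user


if __name__ == "__main__":
    # Hypothetical history: early ARPU vs. ARPU after 3 months, per past campaign.
    early = [1.0, 2.0, 3.0]
    after_three_months = [3.0, 5.0, 7.0]
    slope, intercept = fit_line(early, after_three_months)
    print(should_continue(0.5, slope, intercept, cost_per_user=1.5))
```

In practice the early signal would combine several behaviours (sessions, level progression, first purchase) rather than a single number, and the model would be validated on campaigns it has not seen.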
A very large community
Some mid-size
communities
Lots of small clusters
(mostly 2 players)
 Correlation
◦ between community size
and engagement / virality
 Meaningful patterns
◦ 2 players patterns
◦ Family play
◦ Group Play
◦ Open Play (language
community)
Example
Social Gaming Communities
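A minimal sketch of how the community size distribution could be measured. It assumes the input is a list of `(player_a, player_b)` pairs who played together and treats each connected component of that graph as a community. Connected components are a deliberate simplification chosen for illustration (the slides do not say which community detection method was used), and the name `community_sizes` is hypothetical.

```python
from collections import defaultdict, deque


def community_sizes(games):
    """Sizes of the connected components of the "played together" graph.

    games: iterable of (player_a, player_b) pairs (assumed layout).
    Returns component sizes, largest first.
    """
    neighbours = defaultdict(set)
    for player_a, player_b in games:
        neighbours[player_a].add(player_b)
        neighbours[player_b].add(player_a)

    seen = set()
    sizes = []
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search over everyone reachable from this player.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            player = queue.popleft()
            size += 1
            for other in neighbours[player]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    sample = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f"), ("g", "h")]
    print(community_sizes(sample))
```

Joining these sizes with per-player engagement figures is one way to examine the correlation between community size and engagement that the slide mentions.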
 Two-Way Clustering
◦ Assess customer behaviours
◦ Assess item equivalence classes
 Modeling + Simulation
◦ Evaluate the free items / items bought
ratio per item kind
◦ Simulate future rules
◦ Evaluate price sensitivity
 Enhance customer buy
recurrence
Example
Freemium Model Optimization
Business
Model
User
Profiling
Simulation
Questions

Online Games Analytics - Data Science for Fun
