Big Data in Action
“Mining gold from terabytes of gaming data
using Spark & AWS EMR”
29th May 2019, Big Data Athens v 4.0
#AutomagicallyIncreasingRevenue
Theodoros Michalareas, wappier CTO, lover of technology
& all things geeky, startup advisor.
Working on state-of-the-art audience management &
marketing automation tools for mobile game publishers &
online businesses.
tm@wappier.com
https://www.linkedin.com/in/theodorosmichalareas/
• SaaS platform for Reward Programs
• Next Best Actions (NBA) offers Recommendation Engine
• Mobile game discovery & social networks
Build Loyalty for Games & Businesses
• Global Pricing optimization
• Understand gamers'/customers' utility
• Real-Time Bundling
Increase Monetization
• Predictive Analytics & ML Models
• Real-Time Consumer Clustering
• In-Depth Insights/Advanced Analytics
• Multivariate Testing & Counterfactual Analysis
Machine Learning / BI Analytics
• Audience Builder: Dynamic Customer Segmentation
• Predictive Consumer Attributes based on Real-Time Behavior Modeling & Forecasting
• 3D / VR Data Visualization & Manipulation
Visualizations & Audience Management
What We Do – Intelligent Revenue Management
Who We Are – 3.5-Year-Old Startup
Web-Based SaaS Platform
(MEAN+cloud+native SDKs/apps
for iOS/Android + Visualizations)
>1m lines of code
Big Data Infrastructure
(spark-based) manage TBs of data
per app/customer
Machine Learning Framework:
modeling, algorithm selection,
evaluation
Technology
Worked with 10s of game
publishers over the last 3.5 years
Experienced in mobile marketing,
building successful loyalty
programs, customer success
Skilled in Visual Design, Software
Engineering, Big Data Engineering,
BI, Data Science, Live
Ops/marketing
Know-How
3.5-year run, expanding from 40 to 60 headcount by the end of 2019
70% of the team in
Engineering & Data Science
Presence: US & Europe
Engineering 1st company,
running code ethos
Team
Our Mission
We are transforming the way
app developers and
marketers maximize
consumer revenue
by using powerful AI
that goes beyond
marketing automation.
Bring the
sophistication in
UA to Revenue
Management
Provide AI
Technology to
predict and
influence player
behavior
Improve
consumer LTV
outside of core
gameplay
Let you
focus on
building the
best app
out there
Some of our Customers
Maximize Gamers’ Lifetime Value
Segment business customers into
dynamic audiences based on
their probability to churn/buy or
their expected LTV
Acquire
Make your UA budget count
Convert
Retain
More users into engaged players
More players into payers
Extend players' lifetime
Extend players' monetary value
Increase by
50%
Increase by
30%
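The segmentation described on this slide (dynamic audiences based on probability to churn/buy or expected LTV) can be sketched roughly as follows. This is only an illustration, not wappier's implementation: the `Player` fields, the thresholds (0.6 churn probability, $5 expected LTV) and the audience names are all hypothetical, and the predictions are assumed to come from upstream models.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Player:
    player_id: str
    churn_probability: float  # model output in [0, 1] (assumed to exist upstream)
    expected_ltv: float       # predicted lifetime value in USD (assumed)


def assign_audience(player, churn_threshold=0.6, ltv_threshold=5.0):
    """Map one player to a hypothetical audience name."""
    at_risk = player.churn_probability >= churn_threshold
    high_value = player.expected_ltv >= ltv_threshold
    if at_risk and high_value:
        return "retain_high_value"
    if at_risk:
        return "retain_low_value"
    if high_value:
        return "grow_high_value"
    return "convert"


def build_audiences(players):
    """Group player ids by audience; rerun whenever predictions refresh."""
    audiences = defaultdict(list)
    for player in players:
        audiences[assign_audience(player)].append(player.player_id)
    return dict(audiences)


# Made-up players, for illustration only.
players = [
    Player("a", 0.9, 12.0),
    Player("b", 0.8, 0.5),
    Player("c", 0.1, 9.0),
    Player("d", 0.2, 0.0),
]
audiences = build_audiences(players)
```

Because membership is recomputed from the latest predictions on every run, the audiences stay "dynamic" in the sense the slide uses.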
Finding Gold – a typical description
of a game publisher request
It’s Really Gold
Game Title        | Installs at 6 months | Average Lifetime Value after 6 months | Estimated revenue at 6 months | Estimated incremental revenue with 5% additional LTV
Rules of Survival | 55,728,640           | $0.25                                 | $13,932,160                   | $696,608
Knives Out        | 46,598,787           | $1.66                                 | $77,353,986                   | $3,867,699
Fortnite          | 16,106,159           | $1.13                                 | $18,199,960                   | $909,998
Clash Royale      | 113,076,241          | $3.11                                 | $351,667,110                  | $17,583,355
Puzzle Dragon     | 145,219              | $6.78                                 | $984,585                      | $49,229
Game of War       | 5,661,266            | $5.51                                 | $31,193,576                   | $1,559,679
Source: SensorTower
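The arithmetic behind the table can be reproduced with a short sketch: estimated revenue is installs times average LTV, and incremental revenue is 5% of that. The install counts and LTV values are copied from the slide; the function name `estimate` and the half-up rounding to whole dollars are assumptions made for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# (title, installs at 6 months, average LTV after 6 months in USD), from the table
GAMES = [
    ("Rules of Survival", 55_728_640, Decimal("0.25")),
    ("Knives Out", 46_598_787, Decimal("1.66")),
    ("Fortnite", 16_106_159, Decimal("1.13")),
    ("Clash Royale", 113_076_241, Decimal("3.11")),
    ("Puzzle Dragon", 145_219, Decimal("6.78")),
    ("Game of War", 5_661_266, Decimal("5.51")),
]


def estimate(installs, ltv, uplift=Decimal("0.05")):
    """Return (estimated revenue, incremental revenue) in whole dollars."""
    revenue = installs * ltv
    incremental = revenue * uplift
    whole = Decimal("1")
    return (
        int(revenue.quantize(whole, rounding=ROUND_HALF_UP)),
        int(incremental.quantize(whole, rounding=ROUND_HALF_UP)),
    )


results = {title: estimate(installs, ltv) for title, installs, ltv in GAMES}

for title, (revenue, incremental) in results.items():
    print(f"{title:18s} ${revenue:>13,} ${incremental:>11,}")
```

Using `Decimal` rather than floats keeps the cent-level products exact before rounding.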
Macroeconomic
[GDP, Exchange Rate, Unemployment Rate, …]
Microeconomic
[Device Price, Housing/Rents, …]
Game Market Statistics
[Revenue, Growth, …]
Mobile Tech Statistics
[Smartphone Penetration, Android vs iOS, …]
Device Context
[Device, Device Price, Resolution, Platform, …]
Game Context
[Genre, Rating, DAUs/MAUs, F2B %, …]
Temporal Elements
[Seasonality, Trends, …]
Gameplay Context
[Level, Sessions per Day, Events per Day, Purchase History, …]
Other Game History
(same publisher) [Purchase History, Engagement, …]
Game Data
Small publisher: < 1GB daily, < 1 y to reach 1TB
Average publisher: < 10GB daily, < 4 m to reach 1TB
Large publisher: < 50GB daily, < 1 m to reach 1TB
BUT we need to mine TBs of data per game
Track/Collect
• User enters game
• Data start being tracked
• ML algorithms start being trained
Analyze
• Data are being analyzed
• User behavior is modeled:
  • retention curve, propensity to buy, probability to churn, LTV, …
Predict
• Expected user LTV is X
• Expected user next best tactic is Y (Loyalty AI Engine)
• Expected user next best price offer is Z (Pricing AI Engine)
Recommend
• Platform computes and recommends user's next best action:
  • Optimal Tactic
  • Optimal Channel
  • Optimal Timing
Assess
• User reacts to personalized recommendation, which results in 30-50% performance increase
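One of the behavior models listed under Analyze, the retention curve, can be sketched from raw tracking data as below. This is a minimal illustration under assumptions: the input shapes (`installs` as a dict of install dates, `sessions` as `(user_id, date)` pairs) and the function name are hypothetical, and the slides do not describe how wappier actually computes it.

```python
from collections import defaultdict
from datetime import date


def retention_curve(installs, sessions, max_day=7):
    """Fraction of installed users active N days after install, for N = 0..max_day.

    installs: {user_id: install date}
    sessions: iterable of (user_id, session date)
    """
    active = defaultdict(set)  # day offset since install -> user ids seen that day
    for user_id, day in sessions:
        if user_id in installs:
            offset = (day - installs[user_id]).days
            if 0 <= offset <= max_day:
                active[offset].add(user_id)
    total = len(installs)
    return [len(active[d]) / total if total else 0.0 for d in range(max_day + 1)]


# Made-up tracking data, for illustration only.
installs = {
    "u1": date(2019, 5, 1),
    "u2": date(2019, 5, 1),
    "u3": date(2019, 5, 2),
    "u4": date(2019, 5, 2),
}
sessions = [
    ("u1", date(2019, 5, 1)), ("u1", date(2019, 5, 2)), ("u1", date(2019, 5, 4)),
    ("u2", date(2019, 5, 1)),
    ("u3", date(2019, 5, 2)), ("u3", date(2019, 5, 3)),
    ("u4", date(2019, 5, 2)),
]
curve = retention_curve(installs, sessions, max_day=3)
```

Here `curve[1]` is the classic "day-1 retention": the share of installed users who came back the day after installing.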
Machine Learning
Models We Use
Finding Gold – Our Mining Methodology
Revenue Regression Models
Micro-Level Non-Linear Demand Estimation Models
Behavioral Economics Adjustments (Psychological Pricing)
Multi-Armed Bandit Optimization
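The slide names Multi-Armed Bandit Optimization without saying which algorithm is used. As a rough sketch of the idea applied to price offers, the example below uses epsilon-greedy, which is an assumption chosen for brevity; the price points and conversion rates in the simulation are invented.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: mostly exploit the best-known arm, sometimes explore."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}
        self._rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best mean so far.
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        # Incremental mean: no need to store the full reward history.
        self.counts[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.counts[arm]


# Simulation with invented numbers: price point (USD) -> true conversion rate.
TRUE_CONVERSION = {0.99: 0.10, 1.99: 0.06, 4.99: 0.01}

bandit = EpsilonGreedyBandit(TRUE_CONVERSION, epsilon=0.1, seed=42)
players = random.Random(7)  # stands in for real player responses
for _ in range(5000):
    price = bandit.select()
    converted = players.random() < TRUE_CONVERSION[price]
    bandit.update(price, price if converted else 0.0)
```

The reward here is revenue per offer (price if the player buys, zero otherwise), so the bandit optimizes expected revenue rather than conversion rate alone.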
Key
1. Access more secondary and tertiary data on which to base analyses. These data are mainly structured in nature
2. Data is constantly updating and streaming
3. New tools are available that integrate analytics and enable data exploration and correlations
4. Broaden access for non-experts to Data Engineering
5. Enable multiple modeling combinations & iterations
6. Run time wappier platform enabled tactics
7. Machines are learning, enabling the results to contribute to the source data and inform future decisions
Big Data Infrastructure: ML Workflow & Soft. Stack
Define the
problem
Analyze data,
synthesize
Does data confirm
hypotheses?
Act
Implement,
Measure
Review, Learn
Primary
data
Secondary
data
Yes / No
Cycle time reduced to
minutes
1
2
3
4
5
6
7
Staging
Area/Data
Lake
Transform
Extract/Load
Develop
multiple
hypotheses
Challenges in Mining Gold / Optimizing Games Revenue
Big Data Volumes per Customer - Volume
A typical game can range between 10s of MB and 10s of GB of data daily; data science teams need a platform that supports big data volumes
Variable Number of Projects - Variety
The number of projects and active publishers varies, from small to large; data need to be imported, staged and transformed as soon as we have access to them
Cost of Exploration – Velocity/Veracity
Initial exploratory phases involve processes that need to access and process big data; the infrastructure needs to be able to grow to support different workloads
Cooperation between Teams - Agility
Allow different data science teams to work on different data sets based on security and auditing rules
[Architecture diagram. Layers: Orchestration, Big Data Computing, Data Storage, Admin. Components shown: Python (boto), Auto Scaling based on memory consumption, three buckets with objects, encrypted data, AWS CloudTrail, CloudWatch alarm, IAM permissions and role.]
AWS EMR Data Lake Architecture
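The diagram mentions auto scaling based on memory consumption, driven by a CloudWatch alarm and orchestrated from Python (boto). A minimal sketch of what such a policy could look like with boto3 is shown below. The deck does not give the actual policy, so the metric choice (`YARNMemoryAvailablePercentage`), the thresholds, node limits, rule names and the default region are all assumptions; `put_auto_scaling_policy` is the boto3 EMR client call for attaching a policy to an instance group.

```python
def memory_based_scaling_policy(min_nodes=2, max_nodes=20,
                                scale_out_below_pct=15.0, scale_in_above_pct=75.0):
    """Build an EMR automatic scaling policy keyed on available YARN memory.

    All numeric defaults are illustrative assumptions, not values from the talk.
    """
    def rule(name, comparison, threshold, adjustment):
        return {
            "Name": name,
            "Action": {
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": adjustment,  # nodes to add (+) or remove (-)
                    "CoolDown": 300,                  # seconds between scaling actions
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": comparison,
                    "EvaluationPeriods": 1,
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": threshold,
                    "Unit": "PERCENT",
                    "Dimensions": [{"Key": "JobFlowId", "Value": "${emr.clusterId}"}],
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            rule("ScaleOutOnLowMemory", "LESS_THAN", scale_out_below_pct, 2),
            rule("ScaleInOnIdleMemory", "GREATER_THAN", scale_in_above_pct, -1),
        ],
    }


def apply_policy(cluster_id, instance_group_id, policy, region="eu-west-1"):
    """Attach the policy to an EMR instance group (requires AWS credentials)."""
    import boto3  # imported lazily so the policy builder works without AWS access

    emr = boto3.client("emr", region_name=region)
    return emr.put_auto_scaling_policy(
        ClusterId=cluster_id,
        InstanceGroupId=instance_group_id,
        AutoScalingPolicy=policy,
    )


policy = memory_based_scaling_policy()
```

Keeping the policy as a plain dictionary makes it easy to version and to try several threshold combinations against the same Spark workload before settling on one.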
SPARK Rules
Lessons Learned
1. Unless you have <1TB to manage, a cloud-based solution is a must
2. AWS EMR has a flexible deployment model that can be cost constrained
3. You need to experiment to find the best policy to use EMR autoscaling for your SPARK workload
PS: We Are Hiring!
https://wappier.com/join-us/
The team is growing!
Thank you!
#AutomagicallyIncreasingRevenue
info@wappier.com +1 877 WAPPIER
+44 20 7100 1736
www.wappier.com

More Related Content

Similar to Big Data Athens 2019 v 4.0 I “Mining gold from terabytes of gaming data using Spark & AWS EMR" - Theodoros Michalareas

How cloud is fueling growth for online gaming
How cloud is fueling growth for online gamingHow cloud is fueling growth for online gaming
How cloud is fueling growth for online gaming
Blazeclan Technologies Private Limited
 
Accenture-Strategy-Future-of-Analytics-in-Devices-and-Gaming
Accenture-Strategy-Future-of-Analytics-in-Devices-and-GamingAccenture-Strategy-Future-of-Analytics-in-Devices-and-Gaming
Accenture-Strategy-Future-of-Analytics-in-Devices-and-GamingDylan Hoffman
 
Introducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOpsIntroducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOps
James Gwertzman
 
danmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdfdanmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdf
ssuser3ee399
 
Why Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product ManagerWhy Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product Manager
Product School
 
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
DataScienceConferenc1
 
Ways Artificial Intelligence Can Improve Your Business with IBM Watson
Ways Artificial Intelligence Can Improve Your Business with IBM WatsonWays Artificial Intelligence Can Improve Your Business with IBM Watson
Ways Artificial Intelligence Can Improve Your Business with IBM Watson
Markus Van Kempen
 
The Future is Operations: Why Mobile Games Need Backends
The Future is Operations: Why Mobile Games Need BackendsThe Future is Operations: Why Mobile Games Need Backends
The Future is Operations: Why Mobile Games Need Backends
James Gwertzman
 
Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
 Inside the mind of Sports and Energy Industry through Machine Learning - Igo... Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
Institute of Contemporary Sciences
 
Catapult Advisors: Predictive & Advanced Analytics Market Overview
Catapult Advisors: Predictive & Advanced Analytics Market OverviewCatapult Advisors: Predictive & Advanced Analytics Market Overview
Catapult Advisors: Predictive & Advanced Analytics Market Overview
Catapult Advisors
 
Career as a Product Manager / Data Analyst in the Games Industry
Career as a Product Manager / Data Analyst in the Games IndustryCareer as a Product Manager / Data Analyst in the Games Industry
Career as a Product Manager / Data Analyst in the Games Industry
Thomas Hulvershorn
 
Big data in marketing at harvard business club nick1 june 15 2013
Big data in marketing at harvard business club nick1 june 15 2013Big data in marketing at harvard business club nick1 june 15 2013
Big data in marketing at harvard business club nick1 june 15 2013
nkabra
 
uae views on big data
  uae views on  big data  uae views on  big data
uae views on big data
Aravindharamanan S
 
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS WorldSuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
Simo Ahava
 
Massively multiplayer data challenges in mobile game analytics
Massively multiplayer data  challenges in mobile game analyticsMassively multiplayer data  challenges in mobile game analytics
Massively multiplayer data challenges in mobile game analytics
Jak Marshall
 
Massively multiplayer data challenges in mobile game analytics
Massively multiplayer data  challenges in mobile game analyticsMassively multiplayer data  challenges in mobile game analytics
Massively multiplayer data challenges in mobile game analytics
Jak Marshall
 
Game Analytics: A Practitioner’s Perspective
Game Analytics: A Practitioner’s PerspectiveGame Analytics: A Practitioner’s Perspective
Game Analytics: A Practitioner’s Perspective
Decimus
 
The Big Picture: Real-time Data is Defining Intelligent Offers
The Big Picture: Real-time Data is Defining Intelligent OffersThe Big Picture: Real-time Data is Defining Intelligent Offers
The Big Picture: Real-time Data is Defining Intelligent Offers
Cloudera, Inc.
 
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For EcommerceDeep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.BI
 
Frontiers in Alternative Data : Techniques and Use Cases
Frontiers in Alternative Data : Techniques and Use CasesFrontiers in Alternative Data : Techniques and Use Cases
Frontiers in Alternative Data : Techniques and Use Cases
QuantUniversity
 

Similar to Big Data Athens 2019 v 4.0 I “Mining gold from terabytes of gaming data using Spark & AWS EMR" - Theodoros Michalareas (20)

How cloud is fueling growth for online gaming
How cloud is fueling growth for online gamingHow cloud is fueling growth for online gaming
How cloud is fueling growth for online gaming
 
Accenture-Strategy-Future-of-Analytics-in-Devices-and-Gaming
Accenture-Strategy-Future-of-Analytics-in-Devices-and-GamingAccenture-Strategy-Future-of-Analytics-in-Devices-and-Gaming
Accenture-Strategy-Future-of-Analytics-in-Devices-and-Gaming
 
Introducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOpsIntroducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOps
 
danmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdfdanmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdf
 
Why Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product ManagerWhy Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product Manager
 
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
[DSC Europe 23][Pandora] Siyu SUN Data Science Enter The Game.pptx
 
Ways Artificial Intelligence Can Improve Your Business with IBM Watson
Ways Artificial Intelligence Can Improve Your Business with IBM WatsonWays Artificial Intelligence Can Improve Your Business with IBM Watson
Ways Artificial Intelligence Can Improve Your Business with IBM Watson
 
The Future is Operations: Why Mobile Games Need Backends
The Future is Operations: Why Mobile Games Need BackendsThe Future is Operations: Why Mobile Games Need Backends
The Future is Operations: Why Mobile Games Need Backends
 
Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
 Inside the mind of Sports and Energy Industry through Machine Learning - Igo... Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
Inside the mind of Sports and Energy Industry through Machine Learning - Igo...
 
Catapult Advisors: Predictive & Advanced Analytics Market Overview
Catapult Advisors: Predictive & Advanced Analytics Market OverviewCatapult Advisors: Predictive & Advanced Analytics Market Overview
Catapult Advisors: Predictive & Advanced Analytics Market Overview
 
Career as a Product Manager / Data Analyst in the Games Industry
Career as a Product Manager / Data Analyst in the Games IndustryCareer as a Product Manager / Data Analyst in the Games Industry
Career as a Product Manager / Data Analyst in the Games Industry
 
Big data in marketing at harvard business club nick1 june 15 2013
Big data in marketing at harvard business club nick1 june 15 2013Big data in marketing at harvard business club nick1 june 15 2013
Big data in marketing at harvard business club nick1 june 15 2013
 
uae views on big data
  uae views on  big data  uae views on  big data
uae views on big data
 
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS WorldSuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
SuperWeek 2016 - Garbage In Garbage Out: Data Quality in a TMS World
 
Massively multiplayer data challenges in mobile game analytics
Massively multiplayer data  challenges in mobile game analyticsMassively multiplayer data  challenges in mobile game analytics
Massively multiplayer data challenges in mobile game analytics
 
Massively multiplayer data challenges in mobile game analytics
Massively multiplayer data  challenges in mobile game analyticsMassively multiplayer data  challenges in mobile game analytics
Massively multiplayer data challenges in mobile game analytics
 
Game Analytics: A Practitioner’s Perspective
Game Analytics: A Practitioner’s PerspectiveGame Analytics: A Practitioner’s Perspective
Game Analytics: A Practitioner’s Perspective
 
The Big Picture: Real-time Data is Defining Intelligent Offers
The Big Picture: Real-time Data is Defining Intelligent OffersThe Big Picture: Real-time Data is Defining Intelligent Offers
The Big Picture: Real-time Data is Defining Intelligent Offers
 
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For EcommerceDeep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce
 
Frontiers in Alternative Data : Techniques and Use Cases
Frontiers in Alternative Data : Techniques and Use CasesFrontiers in Alternative Data : Techniques and Use Cases
Frontiers in Alternative Data : Techniques and Use Cases
 

More from Dataconomy Media

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Dataconomy Media
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Dataconomy Media
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Dataconomy Media
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Dataconomy Media
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Dataconomy Media
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Dataconomy Media
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Dataconomy Media
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Dataconomy Media
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Dataconomy Media
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Dataconomy Media
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Dataconomy Media
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Dataconomy Media
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Dataconomy Media
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Dataconomy Media
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Dataconomy Media
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Dataconomy Media
 

More from Dataconomy Media (20)

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 

Big Data Athens 2019 v 4.0 | “Mining gold from terabytes of gaming data using Spark & AWS EMR” - Theodoros Michalareas

  • 1. Big Data in Action: “Mining gold from terabytes of gaming data using Spark & AWS EMR”. 29th May 2019, Big Data Athens v 4.0. #AutomagicallyIncreasingRevenue
  • 2. Theodoros Michalareas, wappier CTO, lover of technology & all things geeky, startup advisor. Working on state-of-the-art audience management & marketing automation tools for mobile game publishers & online businesses. tm@wappier.com https://www.linkedin.com/in/theodorosmichalareas/
  • 3. What We Do – Intelligent Revenue Management
      Build Loyalty for Games & Businesses
        • SaaS platform for Reward Programs
        • Next Best Actions (NBA) offers Recommendation Engine
        • Mobile game discovery & social networks
      Increase Monetization
        • Global Pricing optimization
        • Understand gamers/customers utility
        • Real-Time Bundling
      Machine Learning / BI Analytics
        • Predictive Analytics & ML Models
        • Real-Time Consumer Clustering
        • In-Depth Insights/Advanced Analytics
        • Multivariate Testing & Counterfactual Analysis
      Visualizations & Audience Management
        • Audience Builder: Dynamic Customers Segmentation
        • Predictive Consumer Attributes based on Real-Time Behavior Modeling & Forecasting
        • 3D / VR Data Visualization & Manipulation
  • 4. Who We Are – 3.5 Years Startup
      Technology
        • Web-Based SaaS Platform (MEAN + cloud + native SDKs/apps for iOS/Android + Visualizations), >1m lines of code
        • Big Data Infrastructure (Spark-based), managing TBs of data per app/customer
        • Machine Learning Framework: modeling, algorithm selection, evaluation
      Know-How
        • Worked with 10s of game publishers over the last 3.5 years
        • Experienced in mobile marketing, building successful loyalty programs, customer success
        • Skilled in Visual Design, Software Engineering, Big Data Engineering, BI, Data Science, Live Ops/marketing
      Team
        • 3.5-year run, expanding from 40 to 60 headcount by the end of 2019
        • 70% of the team in Engineering & Data Science
        • Presence: US & Europe
        • Engineering-first company, running code ethos
  • 5. Our Mission: We are transforming the way app developers and marketers maximize consumer revenue by using powerful AI that goes beyond marketing automation.
        • Bring the sophistication in UA to Revenue Management
        • Provide AI Technology to predict and influence player behavior
        • Improve consumer LTV outside of core gameplay
        • Let you focus on building the best app out there
  • 6. Some of our Customers
  • 7. Finding Gold – a typical description of a game publisher request. Maximize Gamers’ Lifetime Value: segment business customers into dynamic audiences based on their probability to churn/buy or their expected LTV. Acquire: make your UA budget count. Convert / Retain: more users into engaged players; more players into payers; extend players’ lifetime; extend players’ monetary value. Increase by 50%; increase by 30%.
  • 8. It’s Really Gold (Source: SensorTower)

      Game Title          Installs at 6 months   Average LTV after 6 months   Estimated revenue at 6 months   Estimated incremental revenue with 5% additional LTV
      Rules of survival   55,728,640             $0.25                        $13,932,160                     $696,608
      Knives Out          46,598,787             $1.66                        $77,353,986                     $3,867,699
      Fortnite            16,106,159             $1.13                        $18,199,960                     $909,998
      Clash Royale        113,076,241            $3.11                        $351,667,110                    $17,583,355
      Puzzle Dragon       145,219                $6.78                        $984,585                        $49,229
      Game of war         5,661,266              $5.51                        $31,193,576                     $1,559,679
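The revenue columns on this slide appear to follow from the first two: estimated revenue is installs multiplied by average LTV, and the incremental figure is 5% of that. A minimal sketch of the arithmetic follows; the `GameStats` class and the function names are illustrative and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameStats:
    """One row of the slide's table (names here are illustrative)."""
    title: str
    installs: int          # installs at 6 months
    avg_ltv_usd: float     # average lifetime value after 6 months, in USD


def estimated_revenue(game: GameStats) -> float:
    """Installs multiplied by average LTV."""
    return game.installs * game.avg_ltv_usd


def incremental_revenue(game: GameStats, ltv_uplift: float = 0.05) -> float:
    """Extra revenue if average LTV rises by `ltv_uplift` (5% on the slide)."""
    return estimated_revenue(game) * ltv_uplift


# Figures copied from the slide (source: SensorTower).
GAMES = [
    GameStats("Rules of survival", 55_728_640, 0.25),
    GameStats("Knives Out", 46_598_787, 1.66),
    GameStats("Fortnite", 16_106_159, 1.13),
    GameStats("Clash Royale", 113_076_241, 3.11),
    GameStats("Puzzle Dragon", 145_219, 6.78),
    GameStats("Game of war", 5_661_266, 5.51),
]

if __name__ == "__main__":
    for game in GAMES:
        print(f"{game.title}: ${estimated_revenue(game):,.0f} "
              f"(+${incremental_revenue(game):,.0f} with 5% additional LTV)")
```

Rounded to whole dollars, these products reproduce the slide's last two columns.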
  • 9. BUT we need to mine TBs of data per game
      Data context
        • Macroeconomic [GDP, Exchange Rate, Unemployment Rate, …]
        • Microeconomic [Device Price, Housing/Rents, …]
        • Game Market Statistics [Revenue, Growth, …]
        • Mobile Tech Statistics [Smartphone Penetration, Android vs iOS, …]
        • Device Context [Device, Device Price, Resolution, Platform, …]
        • Game Context [Genre, Rating, DAUs/MAUs, F2B %, …]
        • Temporal Elements [Seasonality, Trends, …]
        • Gameplay Context [Level, Sessions per Day, Events per Day, Purchase History, …]
        • Other Game History (same publisher) [Purchase History, Engagement, …]
      Game Data
        • Small publisher: < 1GB daily, < 1 y to reach 1TB
        • Average publisher: < 10GB daily, < 4 m to reach 1TB
        • Large publisher: < 50GB daily, < 1 m to reach 1TB
  • 10. Finding Gold – Our Mining Methodology
      Track/Collect
        • User enters game
        • Data start being tracked
        • ML algorithms start being trained
      Analyze
        • Data are being analyzed
        • User behavior is modeled: retention curve, propensity to buy, probability to churn, LTV, …
      Predict
        • Expected user LTV is X
        • Expected user next best tactic is Y (Loyalty AI Engine)
        • Expected user next best price offer is Z (Pricing AI Engine)
      Recommend
        • Platform computes and recommends user’s next best action: Optimal Tactic, Optimal Channel, Optimal Timing
      Assess
        • User reacts to personalized recommendation, which results in a 30-50% performance increase
      Machine Learning Models We Use
        • Revenue Regression Models
        • Micro-Level Non-Linear Demand Estimation Models
        • Behavioral Economics Adjustments (Psychological Pricing)
        • Multi-Armed Bandit Optimization
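The slide names Multi-Armed Bandit Optimization without saying which algorithm is used. As a rough illustration only, the sketch below uses epsilon-greedy, one of the simplest bandit strategies, to choose among a few candidate offers; the class name, the `epsilon` value and the idea of treating each price offer as an arm are assumptions, not details from the talk.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: mostly exploit the best-known arm, sometimes explore.

    Illustrative only; the talk does not specify the bandit algorithm used.
    """

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # number of times each arm was played
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self) -> int:
        # Play every arm once before trusting the estimates.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


if __name__ == "__main__":
    # Hypothetical example: three price offers with made-up purchase probabilities.
    purchase_prob = [0.02, 0.05, 0.03]
    bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.1, seed=42)
    env = random.Random(7)
    for _ in range(5000):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < purchase_prob[arm] else 0.0
        bandit.update(arm, reward)
    print(bandit.counts, [round(v, 3) for v in bandit.values])
```

In a pricing setting the reward would more likely be revenue per impression than a 0/1 purchase flag, and a production system would probably prefer a contextual or Bayesian variant; epsilon-greedy is shown only because it is short.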
  • 11. Key Big Data Infrastructure: ML Workflow & Soft. Stack
      Diagram labels: Define the problem; Develop multiple hypotheses; Analyze data, synthesize; Does data confirm hypotheses? (Yes / No); Act: Implement, Measure; Review, Learn; Primary data; Secondary data; Extract/Load; Staging Area/Data Lake; Transform; Cycle time reduced to minutes
      1. Access more secondary and tertiary data on which to base analyses. These data are mainly structured in nature
      2. Data is constantly updating and streaming
      3. New tools are available that integrate analytics and enable data exploration and correlations
      4. Broaden access for non-experts to Data Engineering
      5. Enable multiple modeling combinations & iterations
      6. Run time wappier platform enabled tactics
      7. Machines are learning, enabling the results to contribute to the source data and inform future decisions
  • 12. Challenges in Mining Gold / Optimizing Games Revenue
        • Big Data Volumes per Customer (Volume): A typical game can range from 10s of MB to 10s of GB of data daily; data science teams need a platform that supports big data volumes
        • Variable Number of Projects (Variety): The number of projects/active publishers varies from small to large; data need to be imported/staged and transformed as soon as we have access to them
        • Cost of Exploration (Velocity/Veracity): Initial exploratory phases involve processes that need to access/process big data; infrastructure needs to be able to grow to support different workloads
        • Cooperation between Teams (Agility): Allow different data science teams to work on different data sets based on security and auditing rules
  • 13. AWS EMR Data Lake Architecture (diagram). Layers: Orchestration; Big Data Computing; Data Storage; Admin. Labels: Python (boto); Auto Scaling based on memory consumption; CloudWatch alarm; AWS CloudTrail; IAM role, permissions, encrypted data; buckets with objects
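The diagram mentions Python (boto), a CloudWatch alarm and auto scaling driven by memory consumption. One way such a setup could look is sketched below, assuming boto3 (the slide only says "boto") and an EMR automatic scaling policy keyed on the `YARNMemoryAvailablePercentage` CloudWatch metric. The thresholds, capacities, rule names and cluster identifiers are hypothetical; the talk does not give the actual policy.

```python
def memory_scaling_policy(min_nodes: int, max_nodes: int,
                          low_memory_pct: float = 15.0,
                          high_memory_pct: float = 75.0) -> dict:
    """Build an EMR automatic scaling policy driven by available YARN memory.

    All thresholds and capacities are hypothetical defaults, not values from the talk.
    """
    def rule(name: str, comparison: str, threshold: float, adjustment: int) -> dict:
        return {
            "Name": name,
            "Action": {
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": adjustment,  # nodes to add (+) or remove (-)
                    "CoolDown": 300,                  # seconds between scaling actions
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": comparison,
                    "EvaluationPeriods": 1,
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": threshold,
                    "Unit": "PERCENT",
                    "Dimensions": [{"Key": "JobFlowId", "Value": "${emr.clusterId}"}],
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            # Little free memory: add a node.
            rule("ScaleOutOnLowMemory", "LESS_THAN", low_memory_pct, 1),
            # Plenty of free memory: remove a node.
            rule("ScaleInOnHighMemory", "GREATER_THAN", high_memory_pct, -1),
        ],
    }


def apply_policy(emr_client, cluster_id: str, instance_group_id: str, policy: dict):
    """Attach the policy to an instance group.

    `emr_client` is expected to be a boto3 EMR client, e.g. boto3.client("emr").
    """
    return emr_client.put_auto_scaling_policy(
        ClusterId=cluster_id,
        InstanceGroupId=instance_group_id,
        AutoScalingPolicy=policy,
    )
```

Keeping the policy builder separate from the API call makes it easy to try different thresholds and cooldowns against a Spark workload before applying them to a live cluster.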
  • 15. Lessons Learned
      1. Unless you have <1TB to manage, a cloud-based solution is a must
      2. AWS EMR has a flexible deployment model that can be cost constrained
      3. You need to experiment to find the best policy to use EMR autoscaling for your Spark workload
  • 16. PS: We Are Hiring! https://wappier.com/join-us/ The team is growing!
  • 17. Thank you! #AutomagicallyIncreasingRevenue info@wappier.com +1 877 WAPPIER +44 20 7100 1736 www.wappier.com