SlideShare a Scribd company logo
1 of 20
Building Real-time
Analytics Engine with Kafka
and Spark for Mobile
Advertising
Mobile Advertising? - Social & Game
Authentic to Consumers Authentic to Entertainment Authentic to Engagement
Mobile Games
eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015.
People Spend A Lot of Time Gaming
3
Over 55 minutes a day on
average is spent playing mobile
games
Minutes Spent in Mobile
eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015.
Innovate Advertising as Reward Ads
● Free-to-Play (Freemium) App
● Only 2~5% users In-app-purchase
● Publisher can give “reward” on users who engaged to Ads
● Video + Game Economics + Reward
Mobile Video App Advertising
Advertiser
Pay on Video-View
Pub Paid
Tapjoy Profit
User Earn
Reward
Video to Install
Video Install
reward No reward
Video to Install to Event
Video Install
reward
No reward
Event
- Level N
- Registration
- In-app-purchase
- First Booking
Mobile Video App Advertising - Data Science
Video Views
Installs
Early
Retention
Life Time
Value
“Event”
Look-alike
Model
Real-time
Bidding
Engine
Advertiser’s Return
“Investment”
Building a Data Science Platform
Bigger in Scale
Faster
Serving
Smart and
Smarter !
Data Product
Tapjoy’s Data Platform
Algo Serving InfrastructureDatawarehousing
300,000 RPM throughput
Bidding & Targeting &
Personalization
<10 ms response time
20 TB daily addition
2.3 PB DUM
Cloud & On-Premise
In-house & SaaS
Batch based & real-time
The Logic Stack
Data
warehousing
HDFS / S3 / GS
Reporting
MPPs (BigQuery)
Algo Service
Batch + Streaming
Hadoop / Spark
• Collect data, set rules
• Reduce data friction
• Improve signal-to-noise ratio
• Model training & iteration
• Deliver business insights
• Driving data awareness
• Apply ideas to product (online)
• Serve model output
• Drive revenue
DataViz
A/B Testing
Data Viz
The Data Flow
Tapjoy’s Algo Service Engine (SOA)
● SOA (algo service) in Natty
● 320, 000 lines of Java
● 99% response time < 20 ms @ 200k - 400k RPM
Ad Request
A/B test classification
Main Algo & pre-filters
Apply Logic Pipe
Response (offer list)
Video Bidding
Targeting
Persona
Lookalike
...
Biz logic filters
Algo Service’s Data Components
Component What’s in there Purpose
Kafka Raw activity logs Everything starts here
Spark Streaming ETL ETL & Algo feature updates
Aerospike User Big Table (User DNA) Real-time k-v lookups. I.e LookALike
MemSQL Striped down raw user activity
data!!
● Device level real time
aggregations
● Hot data sink
● Real time reporting
Elasticsearch Aggregates or
Unstructured logs
Cube aggregates or fulltext search
Mobile Video App Advertising - Data Science
Video Views
Installs
Early
Retention
Life Time
Value
“event”
Look-alike
Model
Real-time
Bidding
Engine
Advertiser’s Return
“Investment”
Big Table / MemCache
Use Case 1 - Ad-Request Level Decision
Video Bid
# CVR
Spending History
max(views) > T(n)
...
User app usages
Kafka
+
Spark
Streaming
S - App 1
S - App 2
S - App 3
S - App ..
S - App N
Lamda Batch
Use Case 1 - Ad-Request Level Decision
Video Bid
Kafka
OR
Spark
Streaming
S - App
RAW
DATA
Use Case 1 - Ad-Request Level Decision
High throughput low latency queries querying 30 days device
level data which are streamed into MemSQL.
Does the calculations on the fly and serving as decision features
Reference Join Subquery
Reference Join
In Fact - One Fits All
Algo
Serving
Kafka
OR
Spark
Streaming
Real-Time Dashboard
Data Warehouse Hot Batch
Data
Sink
Hot
Batch
Realtime
Query
Realtime
Query
eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015.
Conclusion
2
❖ Mobile Advertising is all about knowing your audience
❖ Fast & Accurate data is key to Data Science as Service
❖ But, “Realtime” is a relative word
❖ Try to simplify moving parts when it come to streaming
➢ Difficult to debug
➢ Hard to backfill
❖ Generalized hot-data sink for stability and multi-purpose data storage
yohan.chin@tapjoy.com
robin.li@tapjoy.com

More Related Content

What's hot

Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkSingleStore
 
Machines and the Magic of Fast Learning
Machines and the Magic of Fast LearningMachines and the Magic of Fast Learning
Machines and the Magic of Fast LearningSingleStore
 
Building an IoT Kafka Pipeline in Under 5 Minutes
Building an IoT Kafka Pipeline in Under 5 MinutesBuilding an IoT Kafka Pipeline in Under 5 Minutes
Building an IoT Kafka Pipeline in Under 5 MinutesSingleStore
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017SingleStore
 
Driving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsDriving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsSingleStore
 
Driving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsDriving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsSingleStore
 
Real-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil DahlkeReal-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil DahlkeSingleStore
 
Winning the On-Demand Economy with Spark and Predictive Analytics
Winning the On-Demand Economy with Spark and Predictive AnalyticsWinning the On-Demand Economy with Spark and Predictive Analytics
Winning the On-Demand Economy with Spark and Predictive AnalyticsSingleStore
 
Scaling Production Machine Learning Pipelines with Databricks
Scaling Production Machine Learning Pipelines with DatabricksScaling Production Machine Learning Pipelines with Databricks
Scaling Production Machine Learning Pipelines with DatabricksDatabricks
 
Life is but a Stream
Life is but a StreamLife is but a Stream
Life is but a StreamDatabricks
 
Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksDatabricks
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesSingleStore
 
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQL
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQLBuilding Real-Time Data Pipelines with Kafka, Spark, and MemSQL
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQLSingleStore
 
The Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and AnalysisThe Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and AnalysisSingleStore
 
Operationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at StarbucksOperationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at StarbucksDatabricks
 
Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale SingleStore
 
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...Databricks
 
Spark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark Summit
 
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...Databricks
 

What's hot (20)

Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and Spark
 
Machines and the Magic of Fast Learning
Machines and the Magic of Fast LearningMachines and the Magic of Fast Learning
Machines and the Magic of Fast Learning
 
Building an IoT Kafka Pipeline in Under 5 Minutes
Building an IoT Kafka Pipeline in Under 5 MinutesBuilding an IoT Kafka Pipeline in Under 5 Minutes
Building an IoT Kafka Pipeline in Under 5 Minutes
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
 
Driving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsDriving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive Analytics
 
Driving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive AnalyticsDriving the On-Demand Economy with Spark and Predictive Analytics
Driving the On-Demand Economy with Spark and Predictive Analytics
 
Real-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil DahlkeReal-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil Dahlke
 
Winning the On-Demand Economy with Spark and Predictive Analytics
Winning the On-Demand Economy with Spark and Predictive AnalyticsWinning the On-Demand Economy with Spark and Predictive Analytics
Winning the On-Demand Economy with Spark and Predictive Analytics
 
Scaling Production Machine Learning Pipelines with Databricks
Scaling Production Machine Learning Pipelines with DatabricksScaling Production Machine Learning Pipelines with Databricks
Scaling Production Machine Learning Pipelines with Databricks
 
Life is but a Stream
Life is but a StreamLife is but a Stream
Life is but a Stream
 
Zero Downtime App Deployment using Hadoop
Zero Downtime App Deployment using HadoopZero Downtime App Deployment using Hadoop
Zero Downtime App Deployment using Hadoop
 
Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber Attacks
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming Architectures
 
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQL
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQLBuilding Real-Time Data Pipelines with Kafka, Spark, and MemSQL
Building Real-Time Data Pipelines with Kafka, Spark, and MemSQL
 
The Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and AnalysisThe Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and Analysis
 
Operationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at StarbucksOperationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at Starbucks
 
Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale
 
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...
Building Identity Graph at Scale for Programmatic Media Buying Using Apache S...
 
Spark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony Baer
 
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...
How a Media Data Platform Drives Real-time Insights & Analytics using Apache ...
 

Viewers also liked

JustGiving – Serverless Data Pipelines, API, Messaging and Stream Processing
JustGiving – Serverless Data Pipelines,  API, Messaging and Stream ProcessingJustGiving – Serverless Data Pipelines,  API, Messaging and Stream Processing
JustGiving – Serverless Data Pipelines, API, Messaging and Stream ProcessingLuis Gonzalez
 
Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms Timothy Spann
 
A primer on building real time data-driven products
A primer on building real time data-driven productsA primer on building real time data-driven products
A primer on building real time data-driven productsLars Albertsson
 
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...confluent
 
Data Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache KafkaData Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache Kafkaconfluent
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLSingleStore
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Gwen (Chen) Shapira
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317Nan Zhu
 
Spark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleSpark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleIan Downard
 
INTRODUCING: CREATE PIPELINE
INTRODUCING: CREATE PIPELINEINTRODUCING: CREATE PIPELINE
INTRODUCING: CREATE PIPELINESingleStore
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthSingleStore
 
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionParis Carbone
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Timothy Spann
 

Viewers also liked (14)

JustGiving – Serverless Data Pipelines, API, Messaging and Stream Processing
JustGiving – Serverless Data Pipelines,  API, Messaging and Stream ProcessingJustGiving – Serverless Data Pipelines,  API, Messaging and Stream Processing
JustGiving – Serverless Data Pipelines, API, Messaging and Stream Processing
 
Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms Ingesting Drone Data into Big Data Platforms
Ingesting Drone Data into Big Data Platforms
 
A primer on building real time data-driven products
A primer on building real time data-driven productsA primer on building real time data-driven products
A primer on building real time data-driven products
 
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...
Strata+Hadoop 2017 San Jose - The Rise of Real Time: Apache Kafka and the Str...
 
Data Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache KafkaData Pipelines Made Simple with Apache Kafka
Data Pipelines Made Simple with Apache Kafka
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317
 
Spark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleSpark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating Example
 
OSGi with the Spring Framework
OSGi with the Spring FrameworkOSGi with the Spring Framework
OSGi with the Spring Framework
 
INTRODUCING: CREATE PIPELINE
INTRODUCING: CREATE PIPELINEINTRODUCING: CREATE PIPELINE
INTRODUCING: CREATE PIPELINE
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
 
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016
 

Similar to Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising

Mobile App Analytics. Why, How, What's new - Mar 2019
Mobile App Analytics. Why, How, What's new - Mar 2019Mobile App Analytics. Why, How, What's new - Mar 2019
Mobile App Analytics. Why, How, What's new - Mar 2019Dmitry Klymenko
 
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...Amazon Web Services
 
Rewarded Video: Benefits and Best Practices
Rewarded Video: Benefits and Best PracticesRewarded Video: Benefits and Best Practices
Rewarded Video: Benefits and Best PracticesironSource
 
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...AppDynamics
 
Palsoft company information
Palsoft  company informationPalsoft  company information
Palsoft company informationKamal Patel
 
Splunk live london_grs
Splunk live london_grsSplunk live london_grs
Splunk live london_grsjenny_splunk
 
WSO2Con EU 2015: Reference Architecture for EDA
WSO2Con EU 2015: Reference Architecture for EDAWSO2Con EU 2015: Reference Architecture for EDA
WSO2Con EU 2015: Reference Architecture for EDAWSO2
 
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...Amazon Web Services
 
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...Data Con LA
 
Tizen Web App Development webinar
Tizen Web App Development webinarTizen Web App Development webinar
Tizen Web App Development webinarTizenExperts
 
Delivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsDelivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsGabrielle Knowles
 
SplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunk
 
SplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunk
 
Gtl Iphone &amp;amp; I Pad Capability Latest
Gtl Iphone &amp;amp; I Pad Capability   LatestGtl Iphone &amp;amp; I Pad Capability   Latest
Gtl Iphone &amp;amp; I Pad Capability LatestDipankar_Das
 
Transforming IT for Digital : Agility for Digital world
Transforming IT for Digital : Agility for Digital worldTransforming IT for Digital : Agility for Digital world
Transforming IT for Digital : Agility for Digital worldEkhlaque Bari
 
Digital transformation and customer care
Digital transformation and customer careDigital transformation and customer care
Digital transformation and customer careMiguel Mello
 
I Love APIs Europe 2015: Technical Sessions
I Love APIs Europe 2015: Technical SessionsI Love APIs Europe 2015: Technical Sessions
I Love APIs Europe 2015: Technical SessionsApigee | Google Cloud
 
ShepHertz Cloud Ecosystem for Apps
ShepHertz Cloud Ecosystem for AppsShepHertz Cloud Ecosystem for Apps
ShepHertz Cloud Ecosystem for AppsShepHertz
 
Un backend: pour tous vos objets connectés
Un backend: pour tous vos objets connectésUn backend: pour tous vos objets connectés
Un backend: pour tous vos objets connectésAmazon Web Services
 

Similar to Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising (20)

Mobile App Analytics. Why, How, What's new - Mar 2019
Mobile App Analytics. Why, How, What's new - Mar 2019Mobile App Analytics. Why, How, What's new - Mar 2019
Mobile App Analytics. Why, How, What's new - Mar 2019
 
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...
Introduction to Real-time, Streaming Data and Amazon Kinesis: Streaming Data ...
 
Rewarded Video: Benefits and Best Practices
Rewarded Video: Benefits and Best PracticesRewarded Video: Benefits and Best Practices
Rewarded Video: Benefits and Best Practices
 
#TDXRecap India tour
#TDXRecap India tour#TDXRecap India tour
#TDXRecap India tour
 
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...
AppSphere 15 - Application Analytics helping DevOps with Data Driven Decision...
 
Palsoft company information
Palsoft  company informationPalsoft  company information
Palsoft company information
 
Splunk live london_grs
Splunk live london_grsSplunk live london_grs
Splunk live london_grs
 
WSO2Con EU 2015: Reference Architecture for EDA
WSO2Con EU 2015: Reference Architecture for EDAWSO2Con EU 2015: Reference Architecture for EDA
WSO2Con EU 2015: Reference Architecture for EDA
 
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...
ABD214_Real-time User Insights for Mobile and Web Applications with Amazon Pi...
 
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
Big Data Day LA 2015 - Event Driven Architecture for Web Analytics by Peyman ...
 
Tizen Web App Development webinar
Tizen Web App Development webinarTizen Web App Development webinar
Tizen Web App Development webinar
 
Delivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT OperationsDelivering New Visibility and Analytics for IT Operations
Delivering New Visibility and Analytics for IT Operations
 
SplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational IntelligenceSplunkLive Wellington 2015 - Operational Intelligence
SplunkLive Wellington 2015 - Operational Intelligence
 
SplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational IntelligenceSplunkLive Auckland - Operational Intelligence
SplunkLive Auckland - Operational Intelligence
 
Gtl Iphone &amp;amp; I Pad Capability Latest
Gtl Iphone &amp;amp; I Pad Capability   LatestGtl Iphone &amp;amp; I Pad Capability   Latest
Gtl Iphone &amp;amp; I Pad Capability Latest
 
Transforming IT for Digital : Agility for Digital world
Transforming IT for Digital : Agility for Digital worldTransforming IT for Digital : Agility for Digital world
Transforming IT for Digital : Agility for Digital world
 
Digital transformation and customer care
Digital transformation and customer careDigital transformation and customer care
Digital transformation and customer care
 
I Love APIs Europe 2015: Technical Sessions
I Love APIs Europe 2015: Technical SessionsI Love APIs Europe 2015: Technical Sessions
I Love APIs Europe 2015: Technical Sessions
 
ShepHertz Cloud Ecosystem for Apps
ShepHertz Cloud Ecosystem for AppsShepHertz Cloud Ecosystem for Apps
ShepHertz Cloud Ecosystem for Apps
 
Un backend: pour tous vos objets connectés
Un backend: pour tous vos objets connectésUn backend: pour tous vos objets connectés
Un backend: pour tous vos objets connectés
 

More from SingleStore

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeSingleStore
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsSingleStore
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemSingleStore
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeSingleStore
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics SingleStore
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLSingleStore
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastSingleStore
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQLSingleStore
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsSingleStore
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureSingleStore
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored ProceduresSingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017SingleStore
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming DataSingleStore
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSingleStore
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondSingleStore
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementSingleStore
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AISingleStore
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudSingleStore
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataSingleStore
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSingleStore
 

More from SingleStore (20)

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks Webcast
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database Evaluations
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed Architecture
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored Procedures
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming Data
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and Beyond
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data Management
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AI
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming Data
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
 

Recently uploaded

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 

Recently uploaded (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising

  • 1. Building Real-time Analytics Engine with Kafka and Spark for Mobile Advertising
  • 2. Mobile Advertising? - Social & Game Authentic to Consumers Authentic to Entertainment Authentic to Engagement Mobile Games
  • 3. eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015. People Spend A Lot of Time Gaming 3 Over 55 minutes a day on average is spent playing mobile games Minutes Spent in Mobile eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015.
  • 4. Innovate Advertising as Reward Ads ● Free-to-Play (Freemium) App ● Only 2~5% users In-app-purchase ● Publisher can give “reward” on users who engaged to Ads ● Video + Game Economics + Reward
  • 5. Mobile Video App Advertising Advertiser Pay on Video-View Pub Paid Tapjoy Profit User Earn Reward
  • 6. Video to Install Video Install reward No reward
  • 7. Video to Install to Event Video Install reward No reward Event - Level N - Registration - In-app-purchase - First Booking
  • 8. Mobile Video App Advertising - Data Science Video Views Installs Early Retention Life Time Value “Event” Look-alike Model Real-time Bidding Engine Advertiser’s Return “Investment”
  • 9. Building a Data Science Platform Bigger in Scale Faster Serving Smart and Smarter ! Data Product
  • 10. Tapjoy’s Data Platform Algo Serving InfrastructureDatawarehousing 300,000 RPM throughput Bidding & Targeting & Personalization <10 ms response time 20 TB daily addition 2.3 PB DUM Cloud & On-Premise In-house & SaaS Batch based & real-time
  • 11. The Logic Stack Data warehousing HDFS / S3 / GS Reporting MPPs (BigQuery) Algo Service Batch + Streaming Hadoop / Spark • Collect data, set rules • Reduce data friction • Improve signal-to-noise ratio • Model training & iteration • Deliver business insights • Driving data awareness • Apply ideas to product (online) • Serve model output • Drive revenue DataViz A/B Testing Data Viz
  • 13. Tapjoy’s Algo Service Engine (SOA) ● SOA (algo service) in Natty ● 320, 000 lines of Java ● 99% response time < 20 ms @ 200k - 400k RPM Ad Request A/B test classification Main Algo & pre-filters Apply Logic Pipe Response (offer list) Video Bidding Targeting Persona Lookalike ... Biz logic filters
  • 14. Algo Service’s Data Components Component What’s in there Purpose Kafka Raw activity logs Everything starts here Spark Streaming ETL ETL & Algo feature updates Aerospike User Big Table (User DNA) Real-time k-v lookups. I.e LookALike MemSQL Striped down raw user activity data!! ● Device level real time aggregations ● Hot data sink ● Real time reporting Elasticsearch Aggregates or Unstructured logs Cube aggregates or fulltext search
  • 15. Mobile Video App Advertising - Data Science Video Views Installs Early Retention Life Time Value “event” Look-alike Model Real-time Bidding Engine Advertiser’s Return “Investment”
  • 16. Big Table / MemCache Use Case 1 - Ad-Request Level Decision Video Bid # CVR Spending History max(views) > T(n) ... User app usages Kafka + Spark Streaming S - App 1 S - App 2 S - App 3 S - App .. S - App N Lamda Batch
  • 17. Use Case 1 - Ad-Request Level Decision Video Bid Kafka OR Spark Streaming S - App RAW DATA
  • 18. Use Case 1 - Ad-Request Level Decision High throughput low latency queries querying 30 days device level data which are streamed into MemSQL. Does the calculations on the fly and serving as decision features Reference Join Subquery Reference Join
  • 19. In Fact - One Fits All Algo Serving Kafka OR Spark Streaming Real-Time Dashboard Data Warehouse Hot Batch Data Sink Hot Batch Realtime Query Realtime Query
  • 20. eMarketer, “US Mobile Phone Content Usage Metrics, 2013-2019.” February, 2015. Conclusion 2 ❖ Mobile Advertising is all about knowing your audience ❖ Fast & Accurate data is key to Data Science as Service ❖ But, “Realtime” is a relative word ❖ Try to simplify moving parts when it come to streaming ➢ Difficult to debug ➢ Hard to backfill ❖ Generalized hot-data sink for stability and multi-purpose data storage yohan.chin@tapjoy.com robin.li@tapjoy.com