Online Learning
The Future of Audience Segmentation is Here
Kevin Lyons + Yakir Buskilla
Models that build profitable marketing audiences at scale...
Finding more of your best customers:
High-income business professional
The Modeling Process, simplified
2012
30 - 40 models
leveraging billions of events
Creating 100 million+ scores
2015
over 1000 models
‘leveraging’ trillions of events
Creating 150 billion+ scores / day
The Challenge
In other words, we simply need ….
A system that creates as many models as we want, when we want them, and that dynamically adapts in real time to changing conditions
○ Automatically creates, validates, ships, and monitors models, with a capacity that scales to tens of thousands of models
The Opportunity
What we really need:
Online models evolve &
adapt over time, in
reaction to a changing
environment with each
and every event
Given a complete
data set, a batch
model is created in
entirety all at once
Introducing Online Learning
Batch (Creation)
large-scale data storage
large-scale data schlepping
painful data aggregation
lots of manual everything
Harder to build models, but easier to evaluate
Online Learning (Evolution)
limited data storage, mostly for monitoring
event-level data streams
light data aggregation
lots of automatic everything
Easier to build, but harder to evaluate (& support)
Batch Models (Offline) vs. Online Learning
● Outperformed both L2 and Elastic Net
● Leverages small (‘micro’) batches
● Validates and monitors models in real time
● Alerts team when models are not behaving
Some Techno Mumbo Jumbo
Stochastic gradient descent with L1 regularization
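A minimal sketch of what "stochastic gradient descent with L1 regularization" on micro-batches can look like. This is not eXelate's implementation: the logistic loss, the soft-thresholding (proximal) form of the L1 step, the class name `OnlineL1Logistic`, the learning rate, the penalty, and the synthetic events are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal step for the L1 penalty: shrink toward zero, clip at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


class OnlineL1Logistic:
    """Logistic regression updated one micro-batch at a time (hypothetical)."""

    def __init__(self, n_features, lr=0.1, l1=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.l1 = l1

    def predict_proba(self, x):
        return sigmoid(self.b + sum(wi * xi for wi, xi in zip(self.w, x)))

    def partial_fit(self, batch):
        # batch: list of (features, label) pairs, label in {0, 1}.
        n = len(batch)
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, y in batch:
            err = self.predict_proba(x) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        # Plain gradient step on the bias (not penalized).
        self.b -= self.lr * grad_b / n
        # Gradient step followed by L1 shrinkage on each weight.
        for j in range(len(self.w)):
            stepped = self.w[j] - self.lr * grad_w[j] / n
            self.w[j] = soft_threshold(stepped, self.lr * self.l1)


def make_event(rng):
    # Synthetic event: feature 0 drives the label, feature 1 is pure noise.
    x = [rng.choice([0.0, 1.0]), rng.choice([0.0, 1.0])]
    p = 0.9 if x[0] == 1.0 else 0.1
    return x, 1 if rng.random() < p else 0


rng = random.Random(0)
model = OnlineL1Logistic(n_features=2)
for _ in range(2000):
    # Each call sees only a small micro-batch, never the full data set.
    model.partial_fit([make_event(rng) for _ in range(16)])
```

The model never stores past events; every micro-batch nudges the weights, and the L1 step keeps pulling uninformative weights back toward zero.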
eXelate.com @eXelate
Technical Solutions
How do we do it?
eXpresso Serving Cluster
10B events/day
260 nodes across
4 data centers
eXtream Modeling Cluster
160B models/day
85 nodes across
4 data centers
JGroups
Distributed
Messaging
Serving Layer
Batch Models (Offline)
● Batch
● Predefined ratio
● Predefined feature selection
● One-time validation
Online Learning
● Streaming
● Downsampling
● Automated feature selection
● Ongoing data cleaning
● Ongoing validation
The Online Learning Challenge
● All necessary data already exists in eXtream
● The cluster’s processing resources can be better utilized
● eXtream addresses most performance / scalability requirements
● Scoring mechanism already exists
eXtream as a Framework for Online Learning
Why it works...
Online Learning Flow
● Labeling mechanism - customer-defined target audience
Events Classification
● Downsampling mechanism
● Burst tolerance
● Duplicate entries
Dataset Preparation
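The three dataset-preparation concerns above (downsampling, burst tolerance, duplicate entries) could be combined in one streaming filter, roughly as follows. This is a sketch under assumptions: the class name `DatasetPreparer`, the per-second cap as the burst rule, the bounded "recently seen" set for duplicates, and all default values are hypothetical, not taken from the slides.

```python
import random
from collections import OrderedDict


class DatasetPreparer:
    """Decides, event by event, whether an event enters the training stream."""

    def __init__(self, negative_keep_rate=0.1, max_per_second=1000,
                 dedup_capacity=100_000, seed=0):
        self.negative_keep_rate = negative_keep_rate
        self.max_per_second = max_per_second
        self.dedup_capacity = dedup_capacity
        self._seen = OrderedDict()      # bounded memory of recent event ids
        self._current_second = None
        self._count = 0
        self._rng = random.Random(seed)

    def accept(self, event_id, label, timestamp):
        # 1) Duplicate entries: drop ids seen recently.
        if event_id in self._seen:
            return False
        self._seen[event_id] = None
        if len(self._seen) > self.dedup_capacity:
            self._seen.popitem(last=False)  # forget the oldest id

        # 2) Burst tolerance: cap accepted events per one-second bucket.
        second = int(timestamp)
        if second != self._current_second:
            self._current_second = second
            self._count = 0
        if self._count >= self.max_per_second:
            return False

        # 3) Downsampling: keep only a fraction of not-converted events.
        if label == 0 and self._rng.random() >= self.negative_keep_rate:
            return False

        self._count += 1
        return True


prep = DatasetPreparer(negative_keep_rate=0.0, max_per_second=2)
decisions = [
    prep.accept("a", 1, 0.0),   # first positive: kept
    prep.accept("a", 1, 0.1),   # same id again: duplicate
    prep.accept("b", 0, 0.2),   # negative with keep rate 0.0: dropped
    prep.accept("c", 1, 0.3),   # second positive in this second: kept
    prep.accept("d", 1, 0.4),   # third in the same second: over the cap
    prep.accept("e", 1, 1.0),   # new second: kept
]
```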
● Blacklist
● Whitelist
● Automatic Tuning
Features Selection
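One way to read the blacklist / whitelist / automatic tuning trio, sketched under assumptions: the function names `select_features` and `auto_blacklist`, the feature names, and the rule "blacklist features whose learned weight is effectively zero" are all hypothetical. The slides do not say how the automatic tuning works.

```python
def select_features(event_features, whitelist=None, blacklist=frozenset()):
    """Keep features that are whitelisted (if a whitelist is given) and not blacklisted."""
    selected = {}
    for name, value in event_features.items():
        if name in blacklist:
            continue
        if whitelist is not None and name not in whitelist:
            continue
        selected[name] = value
    return selected


def auto_blacklist(weights, eps=1e-6):
    """Assumed tuning rule: propose features whose weight has shrunk to ~0."""
    return {name for name, w in weights.items() if abs(w) <= eps}


event = {"geo": 1, "user_id": 7, "site": 3}
kept = select_features(event,
                       whitelist={"geo", "site", "user_id"},
                       blacklist={"user_id"})
proposed = auto_blacklist({"geo": 0.8, "site": 0.0})
```

With an L1-regularized model, weights that stay at zero are natural candidates for such a tuning rule, which is why the two mechanisms are shown together here.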
● Sliding window of recent events
● 60/40 not-converted/converted ratio
● Various accuracy metrics (lift, precision, recall, confusion matrix)
● Decide if the model is ready for making predictions
Model Validation
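The validation bullets above can be sketched as a sliding window that holds recent scored events at a 60/40 not-converted/converted split and derives the listed metrics from it. Assumptions are labeled here and in the comments: the class name `ValidationWindow`, the score threshold, lift defined as precision divided by the window's conversion rate, and the readiness thresholds are illustrative choices, not values from the talk.

```python
from collections import deque


class ValidationWindow:
    """Sliding window of recent (score, converted) pairs, split 60/40."""

    def __init__(self, size=1000, negative_share=0.6, threshold=0.5):
        n_neg = int(round(size * negative_share))
        # Separate bounded queues enforce the not-converted/converted ratio.
        self.neg = deque(maxlen=n_neg)
        self.pos = deque(maxlen=size - n_neg)
        self.threshold = threshold  # assumed score cutoff for "predicted converter"

    def add(self, score, converted):
        (self.pos if converted else self.neg).append(score)

    def confusion(self):
        tp = sum(s >= self.threshold for s in self.pos)
        fn = len(self.pos) - tp
        fp = sum(s >= self.threshold for s in self.neg)
        tn = len(self.neg) - fp
        return tp, fp, fn, tn

    def metrics(self):
        tp, fp, fn, tn = self.confusion()
        total = tp + fp + fn + tn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        base_rate = (tp + fn) / total if total else 0.0
        # Assumed definition: how much better than picking users at random.
        lift = precision / base_rate if base_rate else 0.0
        return {"precision": precision, "recall": recall, "lift": lift}

    def is_ready(self, min_lift=1.5, min_recall=0.5):
        # Hypothetical readiness rule: full window and metrics above thresholds.
        full = (len(self.neg) == self.neg.maxlen
                and len(self.pos) == self.pos.maxlen)
        m = self.metrics()
        return full and m["lift"] >= min_lift and m["recall"] >= min_recall


window = ValidationWindow(size=10)
for score in [0.1, 0.1, 0.1, 0.1, 0.1, 0.9]:
    window.add(score, converted=False)
for score in [0.8, 0.8, 0.8, 0.2]:
    window.add(score, converted=True)
result = window.metrics()
```

Because the queues are bounded, old events fall out automatically, so the metrics always describe the model's recent behaviour rather than its lifetime average.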
● Two phases (Scoring, Re-code)
● Scale vs Accuracy tradeoff
Predictions Mechanism
Scalability / Performance
Thousands of Concurrent Models: training, validation, scoring
High Throughput: billions of training events per day
Why do we need it?
● Store the models in one common place
● Persistency
● Built-in replication
● Aerospike has a built-in object size limit of 1MB
○ Developed sharding mechanism for storing models on Aerospike
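A sharding mechanism of this kind could look roughly like the sketch below. It deliberately does not use the Aerospike client: a plain `dict` stands in for the key-value store, and the key layout, the manifest record, the JSON serialization, and the names `put_model` / `get_model` are assumptions for illustration only.

```python
import hashlib
import json

# Assumed payload budget per record, kept under the 1MB object limit.
MAX_RECORD_BYTES = 1_000_000


def put_model(store, model_id, model, chunk_size=MAX_RECORD_BYTES):
    """Serialize a model and write it as several records plus a manifest."""
    blob = json.dumps(model, sort_keys=True).encode("utf-8")
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]
    for i, chunk in enumerate(chunks):
        store[f"{model_id}:shard:{i}"] = chunk
    # Manifest is written last so readers never see it before the shards exist.
    store[f"{model_id}:manifest"] = {
        "shards": len(chunks),
        "sha256": hashlib.sha256(blob).hexdigest(),
    }
    return len(chunks)


def get_model(store, model_id):
    """Reassemble a model from its shards and verify the checksum."""
    manifest = store[f"{model_id}:manifest"]
    blob = b"".join(store[f"{model_id}:shard:{i}"]
                    for i in range(manifest["shards"]))
    if hashlib.sha256(blob).hexdigest() != manifest["sha256"]:
        raise ValueError("model shards are inconsistent")
    return json.loads(blob.decode("utf-8"))


store = {}  # stand-in for the real key-value store
weights = {"w": [i * 0.001 for i in range(1000)]}
n_shards = put_model(store, "model-1", weights, chunk_size=256)
restored = get_model(store, "model-1")
```

The checksum guards against reading a mix of shards from two different versions of the same model; a production design would also need to decide how to overwrite a model atomically, which this sketch does not address.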
Scalability / Performance
Why do we need it?
Large object issue on Aerospike
The solution is Aerospike's fast built-in replication
Cross Data Center Learning
● Low Volume Models
● Traffic Redirection
Monitoring - Why do we need it?
thousands of models
automatically created by users
some models won’t converge
Monitoring - Real Time
Monitoring - Aggregation
Monitoring - DS Bot
Case study
Working in action
● The ideal candidate for digital media expands and even subtly shifts in real time
● Real-time modeling tracks and reacts to these changes as they happen, with a 2x CPA improvement over a batch model
The Times, They Are A-Changin’
Market: Downgrading a country’s credit ratings
● Holiday shopping is very different from the rest of the year, particularly Cyber Monday
● AM changes in Eastern US are applied to the Pacific coast before the madness begins
Audiences: Cyber Monday frenzies
● … after the campaign starts, affecting the ideal audience
● No need to panic; modeled audiences automagically adjust
Product: A product offering is revised
Scores of self-maintaining models that constantly adapt to our ever-changing conditions
Happiness Renewed...

Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
 

BDX 2016 - Kevin Lyons & Yakir Buskilla @ eXelate

  • 1. Online Learning: The Future of Audience Segmentation Is Here. Kevin Lyons + Yakir Buskilla
  • 2. Models that build profitable marketing audiences at scale... Finding more of your best customers: High-income business professional
  • 4. The Challenge. 2012: 30-40 models leveraging billions of events, creating 100 million+ scores. 2015: over 1000 models ‘leveraging’ trillions of events, creating 150 billion+ scores / day
  • 5. In other words, we simply need ….
  • 6. The Opportunity. What we really need: a system that creates as many models as we want, when we want them, and that dynamically adapts in real time to changing conditions ○ Automatically creates, validates, ships, and monitors models, with a capacity that scales to tens of thousands of models
  • 7. Introducing Online Learning. Batch (Creation): Given a complete data set, a batch model is created in its entirety, all at once. Online Learning (Evolution): Online models evolve & adapt over time, in reaction to a changing environment, with each and every event
  • 8. Batch Models (Offline) vs. Online Learning. Batch Models (Offline): large-scale data storage; large-scale data schlepping; painful data aggregation; lots of manual everything; harder to build models, but easier to evaluate. Online Learning: limited data storage, mostly for monitoring; event-level data streams; light data aggregation; lots of automatic everything; easier to build, but harder to evaluate (& support)
  • 9. Some Techno Mumbo Jumbo: Stochastic gradient descent with L1 regularization ● Outperformed both L2 and Elastic Net ● Leverages small (‘micro’) batches ● Validates and monitors models in real time ● Alerts team when models are not behaving
  • 11. eXpresso Serving Cluster: 10B events/day, 260 nodes across 4 data centers. eXtream Modeling Cluster: 160B models/day, 85 nodes across 4 data centers. JGroups Distributed Messaging. Serving Layer
  • 12. The Online Learning Challenge. Batch Models (Offline): Batch; Predefined ratio; Predefined feature selection; One time Validation. Online Learning: Streaming; Downsampling; Automated feature selection; Ongoing data cleaning; Ongoing validation
  • 13. eXtream as a Framework for Online Learning. Why it works... ● All necessary data already exists in eXtream ● The cluster’s processing resources can be better utilized ● eXtream addresses most performance / scalability requirements ● Scoring mechanism already exists
  • 15. Events Classification ● Labeling Mechanism - customer-defined target audience
  • 16. Dataset Preparation ● Downsampling mechanism ● Burst tolerance ● Duplicate entries
  • 17. Features Selection ● Blacklist ● Whitelist ● Automatic Tuning
  • 18. Model Validation ● Sliding window of recent events ● 60/40 not-converted/converted ratio ● Various accuracy metrics (lift, precision, recall, confusion matrix) ● Decide if the model is ready for making predictions
  • 19. Predictions Mechanism ● Two phases (Scoring, Re-code) ● Scale vs Accuracy tradeoff
  • 20. Scalability / Performance. Thousands of Concurrent Models: training, validation, scoring. High Throughput: billions of training events per day
  • 21. Scalability / Performance. Why do we need it? ● Store the models in one common place ● Persistence ● Built-in replication. Large object issue on Aerospike ● Aerospike has a built-in limit on object size: 1MB ○ Developed a sharding mechanism for storing models on Aerospike
  • 22. Cross Data Center Learning ● Low Volume Models ● Traffic Redirection. The solution is Aerospike’s fast built-in replication
  • 23. Monitoring - Why do we need it? ● Thousands of models are automatically created by users ● Some models won’t converge
  • 28. The Times, They Are A-Changin’. Market: Downgrading a country’s credit ratings ● The ideal candidate for digital media expands and even subtly shifts in real time ● Real-time modeling tracks and reacts to these changes as they happen, with 2x CPA improvement over a batch model. Audiences: Cyber Monday frenzies ● Holiday shopping is very different from the rest of the year, particularly Cyber Monday ● AM changes in Eastern US are applied to the Pacific coast before the madness begins. Product: A product offering is revised ● … after the campaign starts, affecting the ideal audience ● No need to panic; the modeled audience automagically adjusts
  • 29. Happiness Renewed... Scores of self-maintaining models that constantly adapt to our ever-changing conditions
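
Slide 9 names stochastic gradient descent with L1 regularization on small ("micro") batches but gives no implementation details. The sketch below is one possible reading, not eXelate's code: it assumes a logistic-regression model and applies the L1 penalty with a soft-threshold (proximal) step after each gradient update. The class name, the learning rate, the penalty strength, and the synthetic stream are all illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(w, t):
    """Shrink w toward zero by t; weights inside [-t, t] become exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


class OnlineL1Logistic:
    """Hypothetical model: logistic regression updated one micro-batch at a time."""

    def __init__(self, n_features, lr=0.1, l1=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr  # learning rate (illustrative value)
        self.l1 = l1  # L1 penalty strength (illustrative value)

    def predict_proba(self, x):
        return sigmoid(self.b + sum(wi * xi for wi, xi in zip(self.w, x)))

    def partial_fit(self, batch):
        """One SGD step on a micro-batch of (features, label) pairs."""
        if not batch:
            return
        n = len(batch)
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, y in batch:
            err = self.predict_proba(x) - y  # log-loss gradient w.r.t. the logit
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        self.b -= self.lr * grad_b / n  # the bias is not penalized
        for j in range(len(self.w)):
            stepped = self.w[j] - self.lr * grad_w[j] / n
            self.w[j] = soft_threshold(stepped, self.lr * self.l1)


def train_on_synthetic_stream(seed=0, n_batches=200, batch_size=20):
    """Feed a toy event stream in which only feature 0 predicts the label."""
    rng = random.Random(seed)
    model = OnlineL1Logistic(n_features=5)
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            x = [rng.uniform(-1.0, 1.0) for _ in range(5)]
            batch.append((x, 1 if x[0] > 0 else 0))
        model.partial_fit(batch)
    return model
```

Calling `train_on_synthetic_stream()` returns a model whose weight on feature 0 dominates the four noise features. The soft-threshold step is what lets an L1 penalty push uninformative weights to exactly zero, which may be one reason it suits automated feature selection.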
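
Slide 18 lists a sliding window of recent events, a set of accuracy metrics (lift, precision, recall, confusion matrix), and a readiness decision. A minimal sketch of how those pieces could fit together follows. The class name, the 0.5 score threshold, and the readiness cut-offs (`min_events`, `min_lift`) are assumptions made for illustration; the slides do not state the actual criteria.

```python
from collections import deque


class SlidingWindowValidator:
    """Hypothetical validator over the most recent scored events."""

    def __init__(self, window_size=1000, threshold=0.5):
        # deque(maxlen=...) silently drops the oldest event once full.
        self.events = deque(maxlen=window_size)
        self.threshold = threshold

    def add(self, score, converted):
        """Record one event: the model score and whether the user converted."""
        self.events.append((score >= self.threshold, bool(converted)))

    def confusion(self):
        """Return (tp, fp, tn, fn) for the current window."""
        tp = fp = tn = fn = 0
        for predicted, actual in self.events:
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
            else:
                tn += 1
        return tp, fp, tn, fn

    def metrics(self):
        tp, fp, tn, fn = self.confusion()
        total = tp + fp + tn + fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Lift = precision / base conversion rate, in a single division.
        denom = (tp + fp) * (tp + fn)
        lift = (tp * total) / denom if denom else 0.0
        return {"precision": precision, "recall": recall,
                "lift": lift, "events": total}

    def is_ready(self, min_events=100, min_lift=1.2):
        """Assumed rule: enough events seen and lift above a floor."""
        m = self.metrics()
        return m["events"] >= min_events and m["lift"] >= min_lift


def demo_window():
    """Build a 60/40 not-converted/converted window, as on the slide."""
    v = SlidingWindowValidator(window_size=100)
    for _ in range(30):
        v.add(0.9, True)    # converted, scored high
    for _ in range(10):
        v.add(0.2, True)    # converted, scored low
    for _ in range(10):
        v.add(0.8, False)   # not converted, scored high
    for _ in range(50):
        v.add(0.1, False)   # not converted, scored low
    return v
```

With the demo window, precision and recall are both 30/40 = 0.75, the base conversion rate is 0.4, and lift is 0.75 / 0.4 = 1.875, so `is_ready()` passes under the assumed thresholds.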
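
Slide 21 mentions a sharding mechanism built to fit models under Aerospike's 1MB object limit, without describing it. The sketch below shows the general idea only: a plain `dict` stands in for the key-value store, and no Aerospike client calls are used. The key scheme, the manifest record, the checksum, and the headroom reserved per shard are all assumptions.

```python
import hashlib

MAX_RECORD_BYTES = 1024 * 1024          # the 1MB limit cited on the slide
SHARD_BYTES = MAX_RECORD_BYTES - 1024   # headroom for record overhead (assumed)


def put_sharded(store, model_id, payload, shard_bytes=SHARD_BYTES):
    """Split a serialized model into shards and write them to the store."""
    shards = [payload[i:i + shard_bytes]
              for i in range(0, len(payload), shard_bytes)]
    for idx, shard in enumerate(shards):
        store[f"{model_id}:shard:{idx}"] = shard
    # Write the manifest last so a reader never sees it before its shards.
    store[f"{model_id}:manifest"] = {
        "shards": len(shards),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    return len(shards)


def get_sharded(store, model_id):
    """Reassemble a model from its shards and verify the checksum."""
    manifest = store[f"{model_id}:manifest"]
    payload = b"".join(store[f"{model_id}:shard:{i}"]
                       for i in range(manifest["shards"]))
    if hashlib.sha256(payload).hexdigest() != manifest["sha256"]:
        raise ValueError(f"checksum mismatch for model {model_id}")
    return payload


def demo_round_trip():
    """Store and reload a ~2.5MB fake model in an in-memory dict."""
    store = {}
    payload = bytes(range(256)) * 10000  # 2,560,000 bytes of stand-in data
    n_shards = put_sharded(store, "model-42", payload)
    return store, payload, n_shards
```

In the demo, a 2,560,000-byte payload is split into three shards, each below the 1MB limit, and `get_sharded` returns the original bytes. A real deployment would also need to handle partial writes and stale shards left over from a previous, larger version of the same model.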