© 2019 Valassis Digital | PUBLIC
VALASSISDIGITAL.COM
Stacking Audience Models
Using an Ensemble Approach for Predictive
Modeling
Name: Susan Xia
Position: Data Scientist
Email: xias@valassis.com
Date: September 20th, 2019
Stacking Audience Models
OVERVIEW
How can we combine our best models to predict the rare event of an internet user making a purchase online?
• The Business Problem
• Ensemble Learning Methods
• Stacking in Ensemble Learning
• Stacker Optimization
• Implementation and Deployment
THE BUSINESS
PROBLEM
Digital Advertising: How Does It Work?
INTRODUCTION
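[Diagram: how an ad is served and a conversion is tracked]
• A user browses a website.
• Ad Exchange ↔ Valassis Digital: 120 billion requests daily among 1 billion users; 99.9% of response times under 10 ms.
• Valassis Digital wins an auction, serves an ad, and drops a cookie.
• The user performs tracked activities.
• The user makes a purchase.
• The conversion signal comes back via a pixel.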
The Problem
BUSINESS CONTEXT
• In digital advertising, we run campaigns for our clients (usually brands and ad agencies).
• The client has certain KPIs to drive (e.g., increase the purchase rate of its products).
• We receive credit if we serve an ad to a user and the user later purchases our client’s product.
So we want to serve ads to people who are likely to make a purchase (or, more generally, convert) in the future. To achieve this, we need to:
Given the data we have about internet users, predict the likelihood of them converting.
Predict Target Outcome
BUSINESS PROBLEM
Models → Prediction: Will the user make a purchase?
Additional requirements:
• Allow multiple contributors over time.
• Can generalize to other use cases.
ENSEMBLE
LEARNING
METHODS
Ensemble Learning
THE TOOL
• Combine multiple models to give a single prediction.
• Increase the diversity of the models or algorithms used.
• Has been shown to improve predictive power of the model.
Strength of Many
Combine the predictive power of multiple learners to obtain better
predictions than with one learner alone
Ensemble Learning
ENSEMBLE EXAMPLE
Bagging
• Reduces variance of the base learners.
• Bootstraps (samples with replacement) the training data and trains learners in parallel.
• Each learner is often trained on a random subset of the training data.
• Learners vote on the outcome with weights.
Bagging Example: Random Forest
Image from: Isied, Anwar & Tamimi, Hashem. (2015). Using Random Forest (RF) as a transfer learning classifier for detecting Error-Related Potential (ErrP) within the context of P300-Speller.
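A minimal sketch of bagging, assuming a toy one-dimensional dataset and decision stumps as base learners (both are illustrative choices, not taken from the slides). Each stump is trained on a bootstrap sample, and the ensemble predicts by unweighted majority vote, the simplest form of voting.

```python
import random

def fit_stump(xs, ys):
    """Pick the threshold and orientation with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for above_label in (0, 1):
            errors = sum((above_label if x >= t else 1 - above_label) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above_label)
    _, threshold, above_label = best
    return threshold, above_label

def predict_stump(stump, x):
    threshold, above_label = stump
    return above_label if x >= threshold else 1 - above_label

def fit_bagging(xs, ys, n_learners=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_learners):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagging(stumps, x):
    # Unweighted majority vote over all base learners.
    votes = sum(predict_stump(s, x) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0

# Toy data: the label is mostly 1 for larger x, with one noisy point at x = 3.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
ensemble = fit_bagging(xs, ys)
```

Because every stump sees a different resample, individual stumps disagree near the noisy point, and averaging their votes is what reduces variance.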
Ensemble Learning
ENSEMBLE EXAMPLE
Boosting
• Reduces bias of the base learners.
• Build learners sequentially.
• Samples misclassified by the previous learner get weighted more in subsequent learners.
Image from https://blog.bigml.com/2017/03/14/introduction-to-boosted-trees/
BOOSTING
Boosting Examples
Adaptive Boosting
• Build learners sequentially.
• Samples misclassified by the previous learner get weighted more in subsequent learners.
Gradient Boosting
• Each model is fitted to predict the residual error of the previous model.
• Samples that earlier models predicted poorly have larger residuals, so subsequent learners focus on them.
Images from Marsh, Brendan. (2016). Multivariate Analysis of the Vector Boson Fusion Higgs Boson, and
https://www.quora.com/How-would-you-explain-gradient-boosting-machine-learning-technique-in-no-more-than-300-words-to-non-science-major-college-students
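The residual-fitting idea behind gradient boosting can be sketched as follows. This is a simplified illustration that assumes squared-error loss, two-leaf regression stumps, a one-dimensional toy dataset, and a hypothetical `learning_rate` of 0.5; none of these choices come from the slides.

```python
def fit_regression_stump(xs, residuals):
    """Find the split (x < t vs. x >= t) that minimizes squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the minimum so both leaves are non-empty
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x < t else right_mean

def fit_gradient_boosting(xs, ys, n_rounds=20, learning_rate=0.5):
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(n_rounds):
        # Each new learner is fitted to what the ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_regression_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
        mse_history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, mse_history

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
base, stumps, mse_history = fit_gradient_boosting(xs, ys)
```

The training error in `mse_history` never increases from one round to the next, which is the sequential bias reduction that boosting aims for.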
BOOSTING
Boosting Examples
XGBoost
• A full-featured, efficient implementation of gradient boosted trees.
• Supports fast learning through distributed and randomized computing.
• Uses an approximation algorithm to evaluate and find tree splits.
• Supports regularization and tree pruning.
• Can intelligently and efficiently handle missing values.
In practice, it has been shown to be:
• good at predicting rare events.
• good at distinguishing signal from noise.
STACKING IN
ENSEMBLE
LEARNING
Stacking
STACKING
Stacking is another ensemble method, with these properties:
• It has a layered structure.
• Predictions from the models in the previous layer are used as inputs to the next layer.
• New models train on these inputs.
• The last layer produces the final result.
Image from http://supunsetunga.blogspot.com/
Benefits of Stacking
STACKING
• Stacking increases the diversity of the algorithms and models used.
• Stacking can decrease bias: rather than winner-takes-all, it combines models built on different datasets.
• It enables “parallel development”: each individual base model can be developed and tuned by a different person.
• We can capture different “categories” of features with different base models.
• If we were to use a single big model instead, combining all the features could lead to very high dimensionality.
Stacker Details
EXAMPLE
Base classifiers:
• XGBoost Classifier: predicts conversion based on browsing history → probability of conversion
• Random Forest Classifier: predicts conversion based on response to previous ads → probability of conversion
• Logistic Regression: predicts conversion based on user’s activity level → probability of conversion
Stacker:
• XGBoost Classifier: predicts conversion based on all base classifiers → probability of conversion
STACKER
OPTIMIZATION
The Machine Learning Pipeline
MODEL IMPROVEMENT
• Train: train the model.
• Validate: select hyperparameters.
• Test: evaluate model performance.
How does the machine learning pipeline work in the case of a stacker?
Stacker Tuning
VALIDATION
We use K-fold cross validation.
[Figure: two 5-fold splits, each with one test fold and four train folds]
Tune base models
• Each base model fits on the train folds and predicts on the test fold.
Tune stacker model
• The predictions on each test fold become the new folds for the stacker.
• Recreate train and test folds for the stacker using the new folds.
Notes
• Each model is tuned using randomized search in the hyperparameter space and K-fold cross validation.
• The folds of the base models and the stacker are different.
• After tuning, the models are refitted on the entire dataset.
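The fold handling above can be sketched as out-of-fold prediction. The snippet below is an illustration only: the base learner `RateByThreshold`, the fold assignment by `i % k`, and the toy data are all hypothetical stand-ins, not the models or splits used in the talk.

```python
def kfold_indices(n_rows, k):
    """Assign row i to fold i % k and yield (train_idx, test_idx) per fold."""
    for fold in range(k):
        test_idx = [i for i in range(n_rows) if i % k == fold]
        train_idx = [i for i in range(n_rows) if i % k != fold]
        yield train_idx, test_idx

class RateByThreshold:
    """Toy base model: conversion rate above/below the training mean of one feature."""
    def __init__(self, feature):
        self.feature = feature

    def fit(self, rows, labels):
        values = [r[self.feature] for r in rows]
        self.cut = sum(values) / len(values)
        hi = [y for v, y in zip(values, labels) if v >= self.cut]
        lo = [y for v, y in zip(values, labels) if v < self.cut]
        overall = sum(labels) / len(labels)
        self.hi_rate = sum(hi) / len(hi) if hi else overall
        self.lo_rate = sum(lo) / len(lo) if lo else overall
        return self

    def predict_proba(self, row):
        return self.hi_rate if row[self.feature] >= self.cut else self.lo_rate

def out_of_fold_predictions(make_models, rows, labels, k=5):
    """Return one row of base-model predictions per input row.

    Every prediction comes from a model that did not train on that row.
    """
    oof = [[None] * len(make_models) for _ in rows]
    for train_idx, test_idx in kfold_indices(len(rows), k):
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for m, make_model in enumerate(make_models):
            model = make_model().fit(train_rows, train_labels)
            for i in test_idx:
                oof[i][m] = model.predict_proba(rows[i])
    return oof

rows = [(i % 7, (i * 3) % 5) for i in range(20)]
labels = [1 if a >= 4 else 0 for a, _ in rows]
make_models = [lambda: RateByThreshold(0), lambda: RateByThreshold(1)]
stacker_features = out_of_fold_predictions(make_models, rows, labels, k=5)
```

`stacker_features` has one column per base model and can then be split into fresh folds to tune the stacker, so the stacker never trains on predictions that a base model made for its own training rows.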
Stacker Pitfalls
PITFALLS
Stacking can lead to overfitting. We take precautions against it:
• Make sure there is enough data to support stacking.
• Use some form of regularization (cap complexity-related hyperparameters, use early stopping, etc.).
• Use “mutually orthogonal” base models: when we combine the predictions of different models using stacking, it is desirable that the predictions made by the base models have low correlation. This suggests that the models are skillful, but in different ways, which lets the stacker figure out how to get the best from each model for an improved score.
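One way to check the low-correlation property is to compute pairwise Pearson correlations between the base models' predictions on the same users. The model names and probabilities below are made up for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long lists of predictions."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(predictions):
    names = list(predictions)
    return {(p, q): pearson(predictions[p], predictions[q])
            for p in names for q in names}

# Hypothetical predicted probabilities from three base models on five users.
predictions = {
    "browsing": [0.10, 0.40, 0.35, 0.80, 0.05],
    "ad_response": [0.12, 0.38, 0.30, 0.85, 0.07],
    "activity": [0.50, 0.20, 0.60, 0.30, 0.40],
}
corr = correlation_matrix(predictions)
```

In this made-up data, "browsing" and "ad_response" are almost perfectly correlated, so the second adds little for the stacker, while "activity" carries a different signal.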
METRICS
Evaluation Metrics
There are many metrics for evaluating the performance and effectiveness of a model. The choice of evaluation metrics should be based on the problem the model is trying to solve.
Before we decide on the appropriate metrics to use, let us revisit the
problem…
Audience Overview
THE GOALS
Each ad campaign has an associated ranked list of users that we want to serve ads to. We call this ranked list the “audience” of the campaign.
The probability of converting predicted by the stacking model provides a natural ranking of users.
For each audience, our goals are:
Goal 1: Put as many future converters as possible in the audience
• Serve ads to people who are likely to make a purchase.
Goal 2: Rank future converters higher than non-converters so that:
• People who are likely to make a purchase will be served first
• People who are likely to make a purchase will be priced higher
Based on the two goals, we chose our evaluation metrics to be: 1. Recall, 2. NDCG
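As an illustration of Goal 1, recall can be measured on an audience of fixed size. The slides do not spell out the exact definition used, so this sketch assumes the audience is the top `k` users by predicted probability; the scores and labels are made up.

```python
def recall_at_k(scores, converted, k):
    """Fraction of all converters that land in the top-k users by score."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    audience = ranked[:k]
    total_converters = sum(converted)
    if total_converters == 0:
        return 0.0
    return sum(converted[i] for i in audience) / total_converters

# Hypothetical stacker scores and conversion labels (1 = converted) for six users.
scores = [0.9, 0.1, 0.7, 0.3, 0.8, 0.2]
converted = [1, 0, 0, 1, 1, 0]
audience_recall = recall_at_k(scores, converted, k=3)
```

Here the top three users contain two of the three converters, so `audience_recall` is 2/3. Recall ignores the order inside the audience, which is why a ranking metric is needed for Goal 2.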
What is NDCG?
NDCG
Normalized Discounted Cumulative Gain
Evaluates ability of ranked list to achieve desired result: relevant items (future converters)
at the top, irrelevant ones (non-converters) at the bottom
• Cumulative Gain: we get points for putting each relevant item in the list
• Discounted: we get fewer points for putting relevant items lower in the list
• Normalized: divide by the discounted cumulative gain of a perfect ranking so that we
can compare amongst lists of different lengths
Due to normalization, ranges from 0 (no converters in the list) to 1 (all the converters at the
top of the list).
What is NDCG?
THE MATH
Normalized Discounted Cumulative Gain
Note that $rel_i$ is always 1 or 0 in our case – either you are a converter (1) or not (0)
Cumulative Gain at position p
$$CG_p = \sum_{i=1}^{p} rel_i$$
Discounted Cumulative Gain at position p
$$DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$$
Normalized Discounted Cumulative Gain at position p
$$NDCG_p = \frac{DCG_p}{IDCG_p}$$
This is quite simply DCG normalized by the best score that list could receive for DCG.
So for our audiences, IDCG is DCG with the converters all at the top of the list.
Retains advantage of DCG so that converters higher in the list are worth more.
Adds advantage that we can compare amongst lists due to normalization.
NDCG Calculation
EXAMPLE
Score Rank | Converter | Addition to DCG | Ideal Addition to DCG
1          | 0         | 0               | 1
2          | 1         | 0.63            | 0.63
3          | 1         | 0.5             | 0
4          | 0         | 0               | 0
5          | 0         | 0               | 0
Sum        |           | 1.13            | 1.63
NDCG = DCG/IDCG = 1.13/1.63 = 0.69
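The example can be reproduced with a few lines of Python. This sketch uses the gain $2^{rel} - 1$ and the discount $\log_2(rank + 1)$, with converters at ranks 2 and 3; the function names are illustrative.

```python
import math

def dcg(relevances):
    """DCG with gain 2^rel - 1 and discount log2(rank + 1), ranks starting at 1."""
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal ordering (all converters first)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

converters_by_rank = [0, 1, 1, 0, 0]  # converters at ranks 2 and 3

actual_dcg = dcg(converters_by_rank)                       # about 1.13
ideal_dcg = dcg(sorted(converters_by_rank, reverse=True))  # about 1.63
score = ndcg(converters_by_rank)                           # about 0.69
```

Moving either converter up one rank raises the score, while swapping the two non-converters at the bottom leaves it unchanged.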
IMPLEMENTATION
AND
DEPLOYMENT
SCALABILITY
Implementation and Deployment
Things to consider when implementing the model deployment pipeline:
• Scalable: we need to be able to fit hundreds of models daily and, for each model, predict on 1 billion users.
• User-friendly: allow data scientists to develop models in Python.
• Generalizable: the process should generalize to other model fitting and deployment use cases.
SCALABILITY
Our Solution
Hunch, an in-house library that offers the functionality of Python and the speed of Scala.
• Model Fitting (smaller scale): ~200k users to fit for each model. Fitted models are serialized in Python.
• Model Scoring (larger scale): ~1 billion users to score for each model. Models are de-serialized in Scala.
Hunch supports serialization of scikit-learn models and XGBoost models into Hunch representations, and de-serialization of Hunch representations into Scala functions.
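The general pattern (fit in one runtime, write a language-neutral representation, rebuild a scoring function elsewhere) can be illustrated with a toy example. This is not Hunch's actual format or API, which the slides do not show: the JSON layout and the function names below are hypothetical, and the sketch stays in Python on both sides for brevity.

```python
import json
import math

def serialize_logistic(weights, intercept):
    """Write a fitted logistic regression as a language-neutral JSON string."""
    return json.dumps({"type": "logistic_regression",
                       "weights": list(weights),
                       "intercept": intercept})

def deserialize(representation):
    """Turn the JSON string back into a function: vector -> class likelihoods."""
    spec = json.loads(representation)
    if spec["type"] != "logistic_regression":
        raise ValueError("unsupported model type: " + spec["type"])
    weights, intercept = spec["weights"], spec["intercept"]

    def score(vector):
        z = intercept + sum(w * x for w, x in zip(weights, vector))
        p = 1.0 / (1.0 + math.exp(-z))
        return [1.0 - p, p]  # [P(no conversion), P(conversion)]

    return score

# Hypothetical fitted coefficients for a two-feature model.
representation = serialize_logistic([0.8, -0.5], intercept=-1.0)
score = deserialize(representation)
likelihoods = score([0.0, 0.0])
```

The representation only carries what is needed to score, so the scoring side does not depend on the training library at all.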
SCALABILITY
Performance
The de-serializers in Scala translate the Hunch representations into a Scala function that takes a vector and emits class likelihoods. This provides huge improvements in speed over Python models.
Key Statistics
• We run 600 models daily to predict on our users; each user is a row, a feature vector of 7k+ entries.
• We make an estimated 200 billion row evaluations.
• The models are run in batch and take ~5.4 hours serially and ~18,700,000 virtual core seconds (5300 hours) in Spark.
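A back-of-the-envelope check of these figures, using only the numbers quoted above (the variable names are illustrative, and the derived values are rough estimates rather than measurements):

```python
row_evaluations = 200e9      # estimated row evaluations per day
core_seconds = 18_700_000    # virtual core seconds in Spark
models_per_day = 600

core_hours = core_seconds / 3600                         # about 5,194
rows_per_core_second = row_evaluations / core_seconds    # about 10,700
rows_per_model = row_evaluations / models_per_day        # about 333 million
```

The core-hour total is of the same order as the hours figure quoted above, and it implies on the order of ten thousand row evaluations per virtual core second, each over a feature vector of 7k+ entries.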
THANK
YOU
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 

2019 Triangle Machine Learning Day - Stacking Audience Models -- Using an Ensemble Approach for Predictive Modeling - Susan Xia, September 20, 2019

  • 1. Stacking Audience Models: Using an Ensemble Approach for Predictive Modeling. Name: Susan Xia. Position: Data Scientist. Email: xias@valassis.com. Date: September 20th, 2019. VALASSISDIGITAL.COM © 2019 Valassis Digital | PUBLIC
  • 2. OVERVIEW: Stacking Audience Models. How can we combine our best models to predict the rare event of an internet user making a purchase online? Contents: The Business Problem (page 3); Ensemble Learning Methods (page 13); Stacking in Ensemble Learning (page 19); Stacker Optimization (page 23); Implementation and Deployment (page 32).
  • 3. THE BUSINESS PROBLEM
  • 4. INTRODUCTION: Digital Advertising: How Does It Work?
  • 5. Diagram labels: Browses a website.
  • 6. Diagram labels: Browses a website; Ad Exchange; Valassis Digital; 120 billion requests daily.
  • 7. Diagram labels: Browses a website; Ad Exchange; Valassis Digital; Win an auction and serve an ad; Drops a cookie; 120 billion requests daily; 99.9% of responses in < 10 ms.
  • 8. Diagram labels: Browses a website; Ad Exchange; Valassis Digital; Win an auction and serve an ad; Drops a cookie; Perform tracked activities; 120 billion requests daily among 1 billion users; 99.9% of responses in < 10 ms.
  • 9. Diagram labels: Browses a website; Ad Exchange; Valassis Digital; Win an auction and serve an ad; Drops a cookie; Perform tracked activities; Makes a purchase; Conversion signal comes back via a pixel; 120 billion requests daily among 1 billion users; 99.9% of responses in < 10 ms.
  • 10. BUSINESS CONTEXT: The Problem. • In digital advertising, we run campaigns for our clients (usually brands and ad agencies). • The client has certain KPIs to drive (e.g., increasing the purchase rate of its products). • We receive credit if we serve an ad to a user and the user later purchases our client’s product. So we want to serve ads to people who are likely to make a purchase (or, more generally, convert) in the future. To achieve this, we need to predict the likelihood of internet users converting, given the data we have about them.
  • 11. BUSINESS PROBLEM: Predict Target Outcome. Diagram labels: Models; Prediction: Will the user make a purchase?
  • 12. BUSINESS PROBLEM: Predict Target Outcome. Additional requirements: • Allow multiple contributors over time. • Generalize to other use cases.
  • 13. ENSEMBLE LEARNING METHODS
  • 14. THE TOOL: Ensemble Learning. Strength of Many: combine the predictive power of multiple learners to obtain better predictions than with one learner alone. • Combines multiple models to give a single prediction. • Increases the diversity of the models or algorithms used. • Has been shown to improve the predictive power of the model.
  • 15. ENSEMBLE EXAMPLE: Ensemble Learning, Bagging. Bagging example: Random Forest. • Reduces the variance of the base learners. • Bootstraps the training data (samples with replacement) and trains learners in parallel. • Each learner is often trained on a random subset of the training data. • Learners vote on the outcome with weights. Image from: Isied, Anwar & Tamimi, Hashem. (2015). Using Random Forest (RF) as a transfer learning classifier for detecting Error-Related Potential (ErrP) within the context of P300-Speller.
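
The bagging recipe on this slide (bootstrap the data, fit one learner per sample, let the learners vote) can be sketched in a few lines. This is a minimal illustration, not the deck's implementation: the toy learner `fit_majority_stump`, the data in `rows`, and the use of an unweighted majority vote (the slide mentions weighted voting) are all assumptions made for brevity.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]


def fit_majority_stump(sample):
    """Hypothetical toy learner: split at the mean feature value and
    predict the majority label on each side of the split."""
    threshold = sum(x for x, _ in sample) / len(sample)

    def majority(labels, default):
        return Counter(labels).most_common(1)[0][0] if labels else default

    overall = majority([label for _, label in sample], 0)
    low = majority([label for x, label in sample if x <= threshold], overall)
    high = majority([label for x, label in sample if x > threshold], overall)
    return lambda x: low if x <= threshold else high


def bagged_predict(learners, x):
    """Unweighted majority vote across all learners (a simplification)."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
# (feature, label) pairs; made-up data for illustration only.
rows = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
learners = [fit_majority_stump(bootstrap_sample(rows, rng)) for _ in range(25)]
print(bagged_predict(learners, 0.05), bagged_predict(learners, 0.95))
```

Individual learners can be wrong when their bootstrap sample is lopsided; the vote averages those errors out, which is the variance reduction the slide refers to.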
  • 16. ENSEMBLE EXAMPLE: Ensemble Learning, Boosting. • Reduces the bias of the base learners. • Builds learners sequentially. • Samples misclassified by the previous learner get weighted more in subsequent learners. Image from https://blog.bigml.com/2017/03/14/introduction-to-boosted-trees/
  • 17. BOOSTING: Boosting Examples. Adaptive Boosting: • Builds learners sequentially. • Samples misclassified by the previous learner get weighted more in subsequent learners. Gradient Boosting: • Each model is fitted to predict the residual error of the previous model. Images from Marsh, Brendan. (2016). Multivariate Analysis of the Vector Boson Fusion Higgs Boson, and https://www.quora.com/How-would-you-explain-gradient-boosting-machine-learning-technique-in-no-more-than-300-words-to-non-science-major-college-students
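
To make "each model is fitted to predict the residual error of the previous model" concrete, here is a minimal sketch of gradient boosting for squared error on one feature. The single-split regressor `fit_stump`, the `learning_rate` value, and the training data are assumptions chosen for illustration; production libraries such as XGBoost use far more elaborate trees and loss handling.

```python
from typing import Callable, List, Sequence


def fit_stump(x: Sequence[float], residuals: Sequence[float]) -> Callable[[float], float]:
    """Fit the single-threshold regressor that minimizes squared error."""
    best = None
    for threshold in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right) if right else 0.0
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_rounds: int, learning_rate: float = 0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    stumps: List[Callable[[float], float]] = []
    predictions = [base] * len(x)
    for _ in range(n_rounds):
        # Residual error of the ensemble built so far.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump(xi)
                       for xi, pi in zip(x, predictions)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


# Made-up training data for illustration only.
x_train = [1, 2, 3, 4, 5, 6, 7, 8]
y_train = [1, 1, 2, 2, 5, 5, 9, 9]
model = gradient_boost(x_train, y_train, n_rounds=20)
print([round(model(xi), 2) for xi in x_train])
```

Each round only has to explain what the previous rounds got wrong, so the training error shrinks as rounds are added.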
  • 18. BOOSTING: Boosting Examples, XGBoost. • A full-featured, efficient implementation of gradient boosted trees. • Supports fast learning through distributed and randomized computing. • Uses an approximation algorithm to evaluate and find tree splits. • Supports regularization and tree pruning. • Can intelligently and efficiently handle missing values. In practice, it has been shown to be: • good at predicting rare events. • good at distinguishing signal from noise.
  • 19. STACKING IN ENSEMBLE LEARNING
  • 20. STACKING: Stacking. Stacking is another ensemble method, in which: • the models are arranged in layers. • predictions from the models in one layer are used as inputs to the next layer. • new models train on these inputs. • the last layer produces the final result. Image from http://supunsetunga.blogspot.com/
  • 21. STACKING: Benefits of Stacking. • Stacking increases the diversity of the algorithms and models used. • Stacking can decrease bias: rather than winner takes all, combine datasets to decrease bias. • Enables “parallel development”: each individual base model can be developed and tuned by a different individual. • We can capture different “categories” of features with different base models. • If we were to use a single big model, combining all the features could lead to very high dimensionality.
  • 22. EXAMPLE: Stacker Details. Base classifiers, each emitting a probability of conversion: XGBoost Classifier (predicts conversion based on browsing history); Random Forest Classifier (predicts conversion based on response to previous ads); Logistic Regression (predicts conversion based on user’s activity level). Stacker: XGBoost Classifier (predicts conversion based on all base classifiers), emitting the final probability of conversion.
  • 23. STACKER OPTIMIZATION
  • 24. MODEL IMPROVEMENT: The Machine Learning Pipeline. Train: train the model. Validate: select hyperparameters. Test: evaluate model performance. How does the machine learning pipeline work in the case of a stacker?
  • 25. VALIDATION: Stacker Tuning. We use K-fold cross validation. Tune base models: • Each base model fits on the train folds and predicts on the test fold. • Predictions on each test fold become new folds for the stacker. Tune stacker model: • Recreate train and test folds for the stacker using the new folds. [Diagram: the data split into folds, with a different fold held out as Test in each round and the remaining folds used as Train.] • Each model is tuned using randomized search in the hyperparameter space and K-fold cross validation. • The folds of the base models and the stacker are different. • After tuning, the models are refitted on the entire dataset.
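
The fold mechanics described on this slide can be sketched as follows: every row receives a prediction from a base model that never saw that row, and those out-of-fold predictions become the stacker's input features. This is a minimal sketch under stated assumptions; the helper names (`k_fold_indices`, `out_of_fold_predictions`, `fit_rate_by_sign`), the toy base model, and the data are hypothetical and stand in for the real classifiers and tuning loop.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a "fit" function takes training rows and labels
# and returns a function that scores a single row.
Fit = Callable[[Sequence[Sequence[float]], Sequence[int]],
               Callable[[Sequence[float]], float]]


def k_fold_indices(n_rows: int, k: int) -> List[List[int]]:
    """Split row indices 0..n_rows-1 into k contiguous, near-equal folds."""
    folds, start = [], 0
    for fold in range(k):
        size = n_rows // k + (1 if fold < n_rows % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(fit: Fit, X, y, k: int) -> List[float]:
    """Predict each row with a model fitted on the other k-1 folds."""
    oof = [0.0] * len(X)
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict(X[i])
    return oof


def fit_rate_by_sign(column: int) -> Fit:
    """Toy base model: the conversion rate among training rows whose
    feature `column` has the same sign as the row being scored."""
    def fit(X_train, y_train):
        pos = [label for row, label in zip(X_train, y_train) if row[column] > 0]
        neg = [label for row, label in zip(X_train, y_train) if row[column] <= 0]
        rate_pos = sum(pos) / len(pos) if pos else 0.0
        rate_neg = sum(neg) / len(neg) if neg else 0.0
        return lambda row: rate_pos if row[column] > 0 else rate_neg
    return fit


# Made-up data: two features per user, label 1 = converter.
X = [[1, -1], [1, 1], [-1, 1], [-1, -1], [1, 1], [-1, -1], [1, -1], [-1, 1]]
y = [1, 1, 0, 0, 1, 0, 1, 0]

# One stacker feature per base model, built only from out-of-fold predictions.
stacker_features = list(zip(
    out_of_fold_predictions(fit_rate_by_sign(0), X, y, k=4),
    out_of_fold_predictions(fit_rate_by_sign(1), X, y, k=4),
))
print(stacker_features)
```

Training the stacker on in-sample base predictions instead would let it learn from leaked labels, which is why the held-out predictions are used. For the stacker's own tuning, the slide notes that a different fold split is drawn.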
  • 26. PITFALLS: Stacker Pitfalls. Stacking can lead to overfitting. We take precautions against it: • Make sure there is enough data to support stacking. • Use some form of regularization (cap complexity-related hyperparameters, use early stopping, etc.). • Use “mutually orthogonal” base models: when we combine the predictions of different models using stacking, it is desirable that the predictions made by the base models have low correlation. This would suggest that the models are skillful but in different ways, allowing the stacker to figure out how to get the best from each model for an improved score.
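
One simple way to check the "low correlation" condition is to compute the Pearson correlation between the predictions of each pair of base models on the same users. The sketch below assumes Pearson correlation is an acceptable measure (the slides do not name one), and the two prediction lists are made-up numbers.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long prediction vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A constant vector has no defined correlation; report 0.0 here.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical predicted conversion probabilities for the same five users.
browsing_model = [0.9, 0.8, 0.2, 0.1, 0.6]
ad_response_model = [0.2, 0.9, 0.1, 0.7, 0.5]
print(round(pearson(browsing_model, ad_response_model), 2))
```

A value near 1 would mean the second model adds little that the first does not already provide; values closer to 0 suggest the models capture different signal.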
  • 27. METRICS: Evaluation Metrics. There are many metrics for evaluating the performance and effectiveness of a model. The choice of evaluation metrics should be based on the problem the model is trying to solve. Before we decide on the appropriate metrics to use, let us revisit the problem…
  • 28. THE GOALS: Audience Overview. For each ad campaign, there is an associated ranked list of users that we want to serve ads to. We call this ranked list “an audience” of the campaign. The probability of converting predicted by the stacking model provides a natural ranking of users. For each audience, our goals are: Goal 1: Put as many future converters as possible in the audience. • Serve ads to people who are likely to make a purchase. Goal 2: Rank future converters higher than non-converters so that: • People who are likely to make a purchase will be served first. • People who are likely to make a purchase will be priced higher. Based on the two goals, we chose our evaluation metrics to be: 1. Recall, 2. NDCG.
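
Goal 1 maps naturally onto recall. The slides do not spell out the exact definition used, so the sketch below assumes recall means the fraction of all future converters that land in the top `k` users of the ranking, where `k` is the audience size; the function name and the example ranking are illustrative.

```python
from typing import Sequence


def recall_at_k(ranked_is_converter: Sequence[int], k: int) -> float:
    """Share of all converters found in the top-k positions of the ranking.

    ranked_is_converter[0] is the highest-scored user; 1 marks a converter.
    """
    total_converters = sum(ranked_is_converter)
    if total_converters == 0:
        return 0.0
    return sum(ranked_is_converter[:k]) / total_converters


ranking = [0, 1, 1, 0, 0]  # users sorted by predicted probability, descending
print(recall_at_k(ranking, k=2))  # 0.5
print(recall_at_k(ranking, k=3))  # 1.0
```

Recall does not care where inside the audience the converters sit, which is why a rank-aware metric is paired with it for Goal 2.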
  • 29. NDCG: What is NDCG? Normalized Discounted Cumulative Gain evaluates the ability of a ranked list to achieve the desired result: relevant items (future converters) at the top, irrelevant ones (non-converters) at the bottom. • Cumulative Gain: we get points for putting each relevant item in the list. • Discounted: we get fewer points for putting relevant items lower in the list. • Normalized: divide by the discounted cumulative gain of a perfect ranking so that we can compare amongst lists of different lengths. Due to normalization, NDCG ranges from 0 (no converters in the list) to 1 (all the converters at the top of the list).
  • 30. THE MATH: What is NDCG? Normalized Discounted Cumulative Gain. Note that $rel_i$ is always 1 or 0 in our case: either you are a converter (1) or not (0). Cumulative Gain at position $p$: $CG_p = \sum_{i=1}^{p} rel_i$. Discounted Cumulative Gain at position $p$: $DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$. Normalized Discounted Cumulative Gain at position $p$: $NDCG_p = \frac{DCG_p}{IDCG_p}$. This is quite simply DCG normalized by the best score that list could receive for DCG. So for our audiences, IDCG is DCG with the converters all at the top of the list. NDCG retains the advantage of DCG that converters higher in the list are worth more, and adds the advantage that we can compare amongst lists due to normalization.
  • 31. EXAMPLE: NDCG Calculation
      Score rank | Converter | Addition to DCG | Ideal addition to DCG
      1          | 0         | 0               | 1
      2          | 1         | 0.63            | 0.63
      3          | 1         | 0.5             | 0
      4          | 0         | 0               | 0
      5          | 0         | 0               | 0
      Sum        |           | 1.13            | 1.63
      NDCG = DCG/IDCG = 1.13/1.63 = 0.69
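
The worked example on this slide can be reproduced directly from the DCG definition with binary relevance. The function names below are illustrative, not taken from the deck; the input list is the converter column of the example, ordered by score rank.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[int]) -> float:
    """DCG of a ranked list; relevances[0] belongs to the top-ranked user."""
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[int]) -> float:
    """DCG divided by the DCG of the ideal ordering (converters first)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


ranked_converters = [0, 1, 1, 0, 0]
print(round(dcg(ranked_converters), 2))   # 1.13
print(round(ndcg(ranked_converters), 2))  # 0.69
```

The converter at rank 2 contributes 1/log2(3) ≈ 0.63 and the one at rank 3 contributes 1/log2(4) = 0.5, while the ideal ordering would earn 1 + 0.63 from ranks 1 and 2.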
  • 32. IMPLEMENTATION AND DEPLOYMENT
  • 33. SCALABILITY: Implementation and Deployment. Things to consider when implementing the model deployment pipeline: • Scalable: we need to be able to fit hundreds of models daily and, for each model, predict on 1 billion users. • User-friendly: allow data scientists to develop models in Python. • Generalizable: the process should be generalizable to other model fitting and deployment use cases.
  • 34. SCALABILITY: Our Solution. Hunch, an in-house library offering the functionality of Python and the speed of Scala. Model fitting: smaller scale, ~200k users for each model to fit. Model scoring: larger scale, ~1 billion users to score for each model. Fitted models are serialized in Python and de-serialized in Scala via Hunch. It supports serialization of scikit-learn models and XGBoost models into Hunch representations, and de-serialization of Hunch representations into Scala functions.
  • 35. SCALABILITY: Performance. The de-serializers in Scala translate the Hunch representations into a Scala function that takes a vector and emits class likelihoods. This provides huge improvements in speed over Python models. Key Statistics: • We run 600 models daily to predict on our users; each user is a row, a feature vector of 7k+ entries. • We make an estimated 200 billion row evaluations. • The models are run in batch and take ~5.4 hours serially and ~18,700,000 (5300 hours) virtual core seconds in Spark.
  • 36. THANK YOU