SlideShare a Scribd company logo
1 of 41
Download to read offline
Webpage Personalization and User Profiling

Liang Zhang
Computational Advertising Workshop at SAMSI
Aug 8, 2012
Recruiting Solutions
Personalized Webpage Is Everywhere
Personalized Webpage Is Everywhere
Common Properties of Web Personalization Problem
§  One or multiple metrics to optimize
– 
– 
– 
– 
– 

Click Through Rate (CTR) (focus of this talk)
Revenue per impression
Time spent on the landing page
Ad conversion rate
…

§  Large scale data
–  Map-Reduce to solve the problem!

§  Sparsity
§  Cold-start
–  User features: Age, gender, position, industry, …
–  Item features: Category, key words, creator features, …
Scope of This Talk
§  CTR prediction for a user on an item
§  Assumptions:
–  There are sufficient data per item to estimate per-item model
–  Serving bias and positional bias are removed by randomly
serving scheme
–  Item popularities are quite dynamic and have to be estimated in
real-time fashion

§  Examples:
–  Yahoo! Front page Today module
–  Linkedin Today module
Online Logistic Regression (OLR)
§  User i with feature xi, article j
§  Binary response y (click/non-click)
§ 
§ 
§  Prior
§  Using Laplace approximation or variational Bayesian
methods to obtain posterior
§  New prior
§  Can approximate

and

as diagonal for high dim xi
User Features for OLR
§  Age, gender, industry, job position for login users
§  General behavior targeting (BT) features
–  Music? Finance? Politics?

§  User profiles from historical view/click behavior on
previous items in the data, e.g.
–  Item-profile: use previously clicked item ids as the user profile
–  Category-profile: use item category affinity score as profile. The
score can be simply user’s historical CTR on each category.
–  Are there better ways to generate user profiles?
–  Yes! By matrix factorization!
Generalized Matrix Factorization (GMF) Framework
§ 

Global
Features

User
effect

Item User
Item
effect factors factors
Bell et al. (2007)
Regression Priors
§ 

User covariates

Item covariates
§  g(·), h(·), G(·), H(·) can be any regression functions
§  Agarwal and Chen (KDD 2009); Zhang et al. (RecSys
2011)
§ 
Different Types of Prior Regression Models
§  Zero prior mean
–  Bilinear random effects (BIRE)

§  Linear regression
–  Simple regression (RLFM)
–  Lasso penalty (LASSO)

§  Tree Models
– 
– 
– 
– 

Recursive partitioning (RP)
Random forests (RF)
Gradient boosting machines (GB)
Bayesian additive regression trees (BART)
Model Fitting Using MCEM
§ 
§ 
§ 
§ 

Monte Carlo EM (Booth and Hobert 1999)
Let
Let
E Step:
–  Obtain N samples of conditional posterior

§  M Step:
Handling Binary Responses
§  Gaussian responses:
have closed form
§  Binary responses + Logistic: no longer closed form
§  Variational approximation (VAR)
§  Adaptive rejection sampling (ARS)
Simulation Study
§  10 simulated data sets, 100K samples for both training
and test
§  1000 users and 1000 items in training
§  Extra 500 new users and 500 new items in test + old
users/items
§  For each user/item, 200 covariates, only 10 useful
§  Construct non-linear regression model from 20 Gaussian
functions for simulating α, β, u and v following Friedman
(2001)
MovieLens 1M Data Set
§  1M ratings
§  6040 users
§  3706 movies
§  Sort by time, first 75% training, last 25% test
§  A lot of new users in the test data set
§  User features: Age, gender, occupation, zip code
§  Item features: Movie genre
Performance Comparison
However…
§  We are working with very large scale data sets!
§  Parallel matrix factorization methods using Map-Reduce
has to be developed!
§  Khanna et al. 2012 Technical report
Model Fitting Using MCEM (Single Machine)
§ 
§ 
§ 
§ 

Monte Carlo EM (Booth and Hobert 1999)
Let
Let
E Step:
–  Obtain N samples of conditional posterior

§  M Step:
Parallel Matrix Factorization
§  Partition data into m partitions
§  For each partition
run MCEM algorithm and
get
.
§ 
§  Ensemble runs: for k = 1, … , n
–  Repartition data into m partitions with a new seed
–  Run E-step only job for each partition given

§  Average over user/item factors for all partitions and k’s to
obtain the final estimate
Key Points
§  Partitioning is tricky!
–  By events? By items? By users?

§  Empirically, “divide and conquer” + average over
obtain work well!

to

§  Ensemble runs: After obtained , we run n E-step-only
jobs and take average, for each job using a different useritem mix.
Identifiability Issues
§  Same log-likelihood can be achieved by
–  g ( ) = g ( ) + r, h ( ) = h ( ) – r
§  Center α, β, u to zero-mean every E-step

–  u = -u, v = -v
§  Constrain v to be positive

–  Switching u.1, v.1 with u.2, v.2
§  ui ~ N(G(xi) , I), vj ~ N(H(xj), λI)
§  Constraint: Diagonal entries λ1 >= λ2 >= …
Matrix Factorization For User Profile
§  Offline user profile building period, obtain the user factor
for user i
§  Online modeling using OLR
–  If a user has a profile (warm-start), use
–  If not (cold-start), use

as the user feature

as the user feature
Offline Evaluation Metric Related to Clicks
§  For model M and J live items (articles) at any time

§  If M = random (constant) model
E[S(M)] = #clicks

§  Unbiased estimate of expected total clicks (Langford et
al. 2008)
Experiments
§  Yahoo! Front Page Today Module data
§  Data for building user profile: 8M users with at least 10
clicks (heavy users) in June 2011, 1B events
§  Data for training and testing OLR model: Random served
data with 2.4M clicks in July 2011
§  Heavy users contributed around 30% of clicks
§  User feature for OLR:
– 
– 
– 
– 

Intercept-only (MOST POPULAR)
124 Behavior targeting features (BT-ONLY)
BT + top 1000 clicked article ids (ITEM-PROFILE)
BT + user profile with CTR on 43 binary content categories
(CATEGORY-PROFILE)
–  BT + profiles from matrix factorization models
Click Lift Performance For Different User Profiles
Click Lift vs #Clicks in Training Data
User Profile Model with Graphical Lasso (UPG)
§ 

User-item affinity
§ 
§  Unknown Σ represents item-item similarity
§  Yet another way to model CTR
§  Agarwal, Zhang and Mazumder (2011), Annals of Applied
Statistics
Covariance Matrix Regularization
§  Σ need to be regularized, especially for high-dimensional
problems (e.g. thousands of items)
§  Prior log-likelihood without constant (Ni=#users)

Regularize the
precision matrix Ω
§  k=1, Graphical lasso problem (Banerjee et al. 2007,
Friedman et al. 2007)
Model Fitting For UPG
§  E Step:
–  For each user i, obtain posterior

§  M Step

§  Let
be the sample covariance matrix for
graphical Lasso
Fitted Precision Matrix
Real World Data from Yahoo! PA
§  51 items
§  Training data
–  5M binary observations (click/non-click)
–  140K users

§  Test data
–  Random bucket
–  528K binary observations

§  User features: Age, gender, behavior targeting
Fitted Precision Matrix
What To Do When Not Enough Data Per Item?
§  Example:
–  CTR prediction for ad creatives/campaigns

§  User i with feature xi
§  Item j with feature xj
§ 

Offline Model Online Model
Component
Component

Agarwal et al. (KDD 2010)
Large Scale Logistic Regression
§  Naïve:
–  Partition the data and run logistic regression for each partition
–  Take the mean of the learned coefficients
–  Problem: Not guaranteed to converge to the model from single
machine!

§  Alternating Direction Method of Multipliers (ADMM)
–  Boyd et al. 2011
–  Set up a constraint that each partition’s coefficient = global
consensus
–  Solve the optimization problem using Lagrange Multipliers

§  All-Reduce from Vowpal Wabbit (VW), Langford et al.
–  Reducers talk to each other so that precise gradient can be
computed by aggregating all computations from each partition
(reducer).
Ongoing Work at LinkedIn and Future Challenges
§  Large scale statistical models for ad creative CTR
prediction and ad creative ranking
§  Explore-exploit for better ad serving strategy
§  Incorporating social network signals into user profile (for
cold start)
Conclusion
§  Generalized Matrix Factorization (GMF) framework to
handle cold-start, feature selection and non-linearity
simultaneously
§  User factors from Parallelized GMF can serve as user
profile for OLR, which gives state-of-the-art performance
§  A new way to model item-item similarity for CTR
prediction
Thank You!
Our Open Source Package for
matrix factorization models:
https://github.com/yahoo/Latent-Factor-Models
Questions or feedback: liang.zhang.stat@gmail.com
Bibliography
Agarwal, D. and Chen, B. (2009). Regression-based latent factor models. In Proceedings
of the 15th ACM SIGKDD international conference on Knowledge discovery and data
mining, 19–28. ACM.
Agarwal, D., Chen, B., and Elango, P. (2010). Fast online learning through offline initialization for time-sensitive recommendation. In Proceedings of the 16th ACM SIGKDD
international conference on Knowledge discovery and data mining, 703–712. ACM.
Agarwal, D., Zhang, L., and Mazumder, R. (2011). Modeling item–item similarities for
personalized recommendations on Yahoo! front page. The Annals of Applied Statistics 5,
3, 1839–1875.
Bell, R., Koren, Y., and Volinsky, C. (2007). Modeling relationships at multiple scales to
improve accuracy of large recommender systems. In Proceedings of the 13th ACM
SIGKDD international conference on Knowledge discovery and data mining, 95–104.
ACM.
Zhang, L., Agarwal, D., and Chen, B. (2011). Generalizing matrix factorization through
flexible regression priors. In Proceedings of the fifth ACM conference on Recommender
systems, 13–20. ACM.
R Khanna, L Zhang, D Agarwal and Chen, B. (2012). Parallel Matrix Factorization for
Binary Response. In Arxiv.org.

More Related Content

What's hot

Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsKrishna Sankar
 
Workshop - Introduction to Machine Learning with R
Workshop - Introduction to Machine Learning with RWorkshop - Introduction to Machine Learning with R
Workshop - Introduction to Machine Learning with RShirin Elsinghorst
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in PythonImry Kissos
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
Fine –grained interest matching for neural news recommendation
Fine –grained interest matching   for neural news recommendationFine –grained interest matching   for neural news recommendation
Fine –grained interest matching for neural news recommendationtaeseon ryu
 
Mariia Havrylovych "Active learning and weak supervision in NLP projects"
Mariia Havrylovych "Active learning and weak supervision in NLP projects"Mariia Havrylovych "Active learning and weak supervision in NLP projects"
Mariia Havrylovych "Active learning and weak supervision in NLP projects"Fwdays
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Fwdays
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsGabriel Moreira
 
Recommender Systems Tutorial (Part 3) -- Online Components
Recommender Systems Tutorial (Part 3) -- Online ComponentsRecommender Systems Tutorial (Part 3) -- Online Components
Recommender Systems Tutorial (Part 3) -- Online ComponentsBee-Chung Chen
 
Exploratory data analysis using xgboost package in R
Exploratory data analysis using xgboost package in RExploratory data analysis using xgboost package in R
Exploratory data analysis using xgboost package in RSatoshi Kato
 
Recommender Systems Tutorial (Part 2) -- Offline Components
Recommender Systems Tutorial (Part 2) -- Offline ComponentsRecommender Systems Tutorial (Part 2) -- Offline Components
Recommender Systems Tutorial (Part 2) -- Offline ComponentsBee-Chung Chen
 
Recommender Systems Tutorial (Part 1) -- Introduction
Recommender Systems Tutorial (Part 1) -- IntroductionRecommender Systems Tutorial (Part 1) -- Introduction
Recommender Systems Tutorial (Part 1) -- IntroductionBee-Chung Chen
 
PyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine LearningPyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine LearningRebecca Bilbro
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningShubhmay Potdar
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
XGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competitionXGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competitionJaroslaw Szymczak
 
Using Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning PipelinesUsing Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning PipelinesScott Clark
 
Escaping the Black Box
Escaping the Black BoxEscaping the Black Box
Escaping the Black BoxRebecca Bilbro
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetCrossing Minds
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .pptbutest
 

What's hot (20)

Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science Competitions
 
Workshop - Introduction to Machine Learning with R
Workshop - Introduction to Machine Learning with RWorkshop - Introduction to Machine Learning with R
Workshop - Introduction to Machine Learning with R
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in Python
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Fine –grained interest matching for neural news recommendation
Fine –grained interest matching   for neural news recommendationFine –grained interest matching   for neural news recommendation
Fine –grained interest matching for neural news recommendation
 
Mariia Havrylovych "Active learning and weak supervision in NLP projects"
Mariia Havrylovych "Active learning and weak supervision in NLP projects"Mariia Havrylovych "Active learning and weak supervision in NLP projects"
Mariia Havrylovych "Active learning and weak supervision in NLP projects"
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive models
 
Recommender Systems Tutorial (Part 3) -- Online Components
Recommender Systems Tutorial (Part 3) -- Online ComponentsRecommender Systems Tutorial (Part 3) -- Online Components
Recommender Systems Tutorial (Part 3) -- Online Components
 
Exploratory data analysis using xgboost package in R
Exploratory data analysis using xgboost package in RExploratory data analysis using xgboost package in R
Exploratory data analysis using xgboost package in R
 
Recommender Systems Tutorial (Part 2) -- Offline Components
Recommender Systems Tutorial (Part 2) -- Offline ComponentsRecommender Systems Tutorial (Part 2) -- Offline Components
Recommender Systems Tutorial (Part 2) -- Offline Components
 
Recommender Systems Tutorial (Part 1) -- Introduction
Recommender Systems Tutorial (Part 1) -- IntroductionRecommender Systems Tutorial (Part 1) -- Introduction
Recommender Systems Tutorial (Part 1) -- Introduction
 
PyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine LearningPyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine Learning
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
XGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competitionXGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competition
 
Using Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning PipelinesUsing Optimal Learning to Tune Deep Learning Pipelines
Using Optimal Learning to Tune Deep Learning Pipelines
 
Escaping the Black Box
Escaping the Black BoxEscaping the Black Box
Escaping the Black Box
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right Dataset
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .ppt
 

Similar to Webpage Personalization and User Profiling

Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with sparkModern Data Stack France
 
The importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsThe importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsFrancesca Lazzeri, PhD
 
A General Overview of Machine Learning
A General Overview of Machine LearningA General Overview of Machine Learning
A General Overview of Machine LearningAshish Sharma
 
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will loveScaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will loveJune Andrews
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in productionStepan Pushkarev
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningSanghamitra Deb
 
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...Francesca Lazzeri, PhD
 
Generation of Random EMF Models for Benchmarks
Generation of Random EMF Models for BenchmarksGeneration of Random EMF Models for Benchmarks
Generation of Random EMF Models for BenchmarksMarkus Scheidgen
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
Deep feature synthesis
Deep feature synthesisDeep feature synthesis
Deep feature synthesisDurra Sahtout
 
Introduction to Machine Learning for Java Developers
Introduction to Machine Learning for Java DevelopersIntroduction to Machine Learning for Java Developers
Introduction to Machine Learning for Java DevelopersZoran Sevarac, PhD
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixXavier Amatriain
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013MLconf
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon Web Services
 
Cikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueCikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueXavier Amatriain
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning SystemsXavier Amatriain
 

Similar to Webpage Personalization and User Profiling (20)

Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
 
The importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsThe importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systems
 
kdd2015
kdd2015kdd2015
kdd2015
 
A General Overview of Machine Learning
A General Overview of Machine LearningA General Overview of Machine Learning
A General Overview of Machine Learning
 
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will loveScaling & Transforming Stitch Fix's Visibility into What Folks will love
Scaling & Transforming Stitch Fix's Visibility into What Folks will love
 
Data ops: Machine Learning in production
Data ops: Machine Learning in productionData ops: Machine Learning in production
Data ops: Machine Learning in production
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ...
 
Generation of Random EMF Models for Benchmarks
Generation of Random EMF Models for BenchmarksGeneration of Random EMF Models for Benchmarks
Generation of Random EMF Models for Benchmarks
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Deep feature synthesis
Deep feature synthesisDeep feature synthesis
Deep feature synthesis
 
Introduction to Machine Learning for Java Developers
Introduction to Machine Learning for Java DevelopersIntroduction to Machine Learning for Java Developers
Introduction to Machine Learning for Java Developers
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
Cikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueCikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business Value
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
 

Recently uploaded

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Webpage Personalization and User Profiling

  • 1. Webpage Personalization and User Profiling Liang Zhang Computational Advertising Workshop at SAMSI Aug 8, 2012 Recruiting Solutions
  • 4. Common Properties of Web Personalization Problem §  One or multiple metrics to optimize –  –  –  –  –  Click Through Rate (CTR) (focus of this talk) Revenue per impression Time spent on the landing page Ad conversion rate … §  Large scale data –  Map-Reduce to solve the problem! §  Sparsity §  Cold-start –  User features: Age, gender, position, industry, … –  Item features: Category, key words, creator features, …
  • 5. Scope of This Talk §  CTR prediction for a user on an item §  Assumptions: –  There are sufficient data per item to estimate per-item model –  Serving bias and positional bias are removed by randomly serving scheme –  Item popularities are quite dynamic and have to be estimated in real-time fashion §  Examples: –  Yahoo! Front page Today module –  Linkedin Today module
  • 6. Online Logistic Regression (OLR) §  User i with feature xi, article j §  Binary response y (click/non-click) §  §  §  Prior §  Using Laplace approximation or variational Bayesian methods to obtain posterior §  New prior §  Can approximate and as diagonal for high dim xi
  • 7. User Features for OLR §  Age, gender, industry, job position for login users §  General behavior targeting (BT) features –  Music? Finance? Politics? §  User profiles from historical view/click behavior on previous items in the data, e.g. –  Item-profile: use previously clicked item ids as the user profile –  Category-profile: use item category affinity score as profile. The score can be simply user’s historical CTR on each category. –  Are there better ways to generate user profiles? –  Yes! By matrix factorization!
  • 8. Generalized Matrix Factorization (GMF) Framework §  Global Features User effect Item User Item effect factors factors Bell et al. (2007)
  • 9. Regression Priors §  User covariates Item covariates §  g(·), h(·), G(·), H(·) can be any regression functions §  Agarwal and Chen (KDD 2009); Zhang et al. (RecSys 2011) § 
  • 10. Different Types of Prior Regression Models §  Zero prior mean –  Bilinear random effects (BIRE) §  Linear regression –  Simple regression (RLFM) –  Lasso penalty (LASSO) §  Tree Models –  –  –  –  Recursive partitioning (RP) Random forests (RF) Gradient boosting machines (GB) Bayesian additive regression trees (BART)
  • 11. Model Fitting Using MCEM §  §  §  §  Monte Carlo EM (Booth and Hobert 1999) Let Let E Step: –  Obtain N samples of conditional posterior §  M Step:
  • 12. Handling Binary Responses §  Gaussian responses: have closed form §  Binary responses + Logistic: no longer closed form §  Variational approximation (VAR) §  Adaptive rejection sampling (ARS)
  • 13. Simulation Study §  10 simulated data sets, 100K samples for both training and test §  1000 users and 1000 items in training §  Extra 500 new users and 500 new items in test + old users/items §  For each user/item, 200 covariates, only 10 useful §  Construct non-linear regression model from 20 Gaussian functions for simulating α, β, u and v following Friedman (2001)
  • 14.
  • 15. MovieLens 1M Data Set §  1M ratings §  6040 users §  3706 movies §  Sort by time, first 75% training, last 25% test §  A lot of new users in the test data set §  User features: Age, gender, occupation, zip code §  Item features: Movie genre
  • 17.
  • 18. However… §  We are working with very large scale data sets! §  Parallel matrix factorization methods using Map-Reduce has to be developed! §  Khanna et al. 2012 Technical report
  • 19. Model Fitting Using MCEM (Single Machine) §  §  §  §  Monte Carlo EM (Booth and Hobert 1999) Let Let E Step: –  Obtain N samples of conditional posterior §  M Step:
  • 20. Parallel Matrix Factorization §  Partition data into m partitions §  For each partition run MCEM algorithm and get . §  §  Ensemble runs: for k = 1, … , n –  Repartition data into m partitions with a new seed –  Run E-step only job for each partition given §  Average over user/item factors for all partitions and k’s to obtain the final estimate
  • 21. Key Points §  Partitioning is tricky! –  By events? By items? By users? §  Empirically, “divide and conquer” + average over obtain work well! to §  Ensemble runs: After obtained , we run n E-step-only jobs and take average, for each job using a different useritem mix.
  • 22. Identifiability Issues §  Same log-likelihood can be achieved by –  g ( ) = g ( ) + r, h ( ) = h ( ) – r §  Center α, β, u to zero-mean every E-step –  u = -u, v = -v §  Constrain v to be positive –  Switching u.1, v.1 with u.2, v.2 §  ui ~ N(G(xi) , I), vj ~ N(H(xj), λI) §  Constraint: Diagonal entries λ1 >= λ2 >= …
  • 23. Matrix Factorization For User Profile §  Offline user profile building period, obtain the user factor for user i §  Online modeling using OLR –  If a user has a profile (warm-start), use –  If not (cold-start), use as the user feature as the user feature
  • 24. Offline Evaluation Metric Related to Clicks §  For model M and J live items (articles) at any time §  If M = random (constant) model E[S(M)] = #clicks §  Unbiased estimate of expected total clicks (Langford et al. 2008)
  • 25. Experiments §  Yahoo! Front Page Today Module data §  Data for building user profile: 8M users with at least 10 clicks (heavy users) in June 2011, 1B events §  Data for training and testing OLR model: Random served data with 2.4M clicks in July 2011 §  Heavy users contributed around 30% of clicks §  User feature for OLR: –  –  –  –  Intercept-only (MOST POPULAR) 124 Behavior targeting features (BT-ONLY) BT + top 1000 clicked article ids (ITEM-PROFILE) BT + user profile with CTR on 43 binary content categories (CATEGORY-PROFILE) –  BT + profiles from matrix factorization models
  • 26. Click Lift Performance For Different User Profiles
  • 27. Click Lift vs #Clicks in Training Data
  • 28. User Profile Model with Graphical Lasso (UPG) §  User-item affinity §  §  Unknown Σ represents item-item similarity §  Yet another way to model CTR §  Agarwal, Zhang and Mazumder (2011), Annals of Applied Statistics
  • 29. Covariance Matrix Regularization §  Σ need to be regularized, especially for high-dimensional problems (e.g. thousands of items) §  Prior log-likelihood without constant (Ni=#users) Regularize the precision matrix Ω §  k=1, Graphical lasso problem (Banerjee et al. 2007, Friedman et al. 2007)
  • 30. Model Fitting For UPG §  E Step: –  For each user i, obtain posterior §  M Step §  Let be the sample covariance matrix for graphical Lasso
  • 31.
  • 33. Real World Data from Yahoo! PA §  51 items §  Training data –  5M binary observations (click/non-click) –  140K users §  Test data –  Random bucket –  528K binary observations §  User features: Age, gender, behavior targeting
  • 34.
  • 36. What To Do When Not Enough Data Per Item? §  Example: –  CTR prediction for ad creatives/campaigns §  User i with feature xi §  Item j with feature xj §  Offline Model Online Model Component Component Agarwal et al. (KDD 2010)
  • 37. Large Scale Logistic Regression §  Naïve: –  Partition the data and run logistic regression for each partition –  Take the mean of the learned coefficients –  Problem: Not guaranteed to converge to the model from single machine! §  Alternating Direction Method of Multipliers (ADMM) –  Boyd et al. 2011 –  Set up a constraint that each partition’s coefficient = global consensus –  Solve the optimization problem using Lagrange Multipliers §  All-Reduce from Vowpal Wabbit (VW), Langford et al. –  Reducers talk to each other so that precise gradient can be computed by aggregating all computations from each partition (reducer).
  • 38. Ongoing Work at LinkedIn and Future Challenges §  Large scale statistical models for ad creative CTR prediction and ad creative ranking §  Explore-exploit for better ad serving strategy §  Incorporating social network signals into user profile (for cold start)
  • 39. Conclusion §  Generalized Matrix Factorization (GMF) framework to handle cold-start, feature selection and non-linearity simultaneously §  User factors from Parallelized GMF can serve as user profile for OLR, which gives state-of-the-art performance §  A new way to model item-item similarity for CTR prediction
  • 40. Thank You! Our Open Source Package for matrix factorization models: https://github.com/yahoo/Latent-Factor-Models Questions or feedback: liang.zhang.stat@gmail.com
  • 41. Bibliography Agarwal, D. and Chen, B. (2009). Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 19–28. ACM. Agarwal, D., Chen, B., and Elango, P. (2010). Fast online learning through offline initialization for time-sensitive recommendation. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, 703–712. ACM. Agarwal, D., Zhang, L., and Mazumder, R. (2011). Modeling item–item similarities for personalized recommendations on Yahoo! front page. The Annals of Applied Statistics 5, 3, 1839–1875. Bell, R., Koren, Y., and Volinsky, C. (2007). Modeling relationships at multiple scales to improve accuracy of large recommender systems. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, 95–104. ACM. Zhang, L., Agarwal, D., and Chen, B. (2011). Generalizing matrix factorization through flexible regression priors. In Proceedings of the fifth ACM conference on Recommender systems, 13–20. ACM. R Khanna, L Zhang, D Agarwal and Chen, B. (2012). Parallel Matrix Factorization for Binary Response. In Arxiv.org.