SlideShare a Scribd company logo
1 of 35
1 
Learning to Personalize 
Justin Basilico 
Page Algorithms Engineering September 19, 2014 
@JustinBasilico 
ATL 2014
2 
 Interested in high-quality 
recommendations 
 Proxy question: 
 Accuracy in predicted rating 
 Measured by root mean 
squared error (RMSE) 
 Improve by 10% = $1 million! 
 Data size: 
 100M ratings (back then 
“almost massive”)
3 
Change of focus 
2006 2014
4 
Netflix Scale 
 > 50M members 
 > 40 countries 
 > 1000 device types 
 Hours: > 1B/month 
 Plays: > 30M/day 
 Log 100B events/day 
 34.2% of peak US 
downstream traffic
5 
Goal 
Help members find content to watch and enjoy 
to maximize member satisfaction and retention
6 
“Emmy Winning” 
Approach to Recommendation
7 
Everything is a Recommendation 
Rows 
Ranking 
Over 75% of what 
people watch 
comes from our 
recommendations 
Recommendations 
are driven by 
Machine Learning
8 
Top Picks 
Personalization awareness 
Diversity
9 
Personalized genres 
 Genres focused on user interest 
 Derived from tag combinations 
 Provide context and evidence 
 How are they generated? 
 Implicit: Based on recent plays, 
ratings & other interactions 
 Explicit: Taste preferences
10 
Similarity 
 Recommend videos similar 
to one you’ve liked 
 “Because you watched” 
rows 
 Pivots 
 Video information page 
 In response to user actions 
(search, list add, …)
11 
Support for Recommendations 
Behavioral Support Social Support
12 
Learning to Recommend
13 
Machine Learning Approach 
Problem 
Data 
Metrics 
Model Algorithm
14 
Data 
 Plays 
 Duration, bookmark, time, 
device, … 
 Ratings 
 Metadata 
 Tags, synopsis, cast, … 
 Impressions 
 Interactions 
 Search, list add, scroll, … 
 Social
15 
Models & Algorithms 
 Regression (Linear, logistic, elastic net) 
 SVD and other Matrix Factorizations 
 Factorization Machines 
 Restricted Boltzmann Machines 
 Deep Neural Networks 
 Markov Models and Graph Algorithms 
 Clustering 
 Latent Dirichlet Allocation 
 Gradient Boosted Decision 
Trees/Random Forests 
 Gaussian Processes 
 …
16 
Rating Prediction 
 Based on first year progress prize 
 Top 2 algorithms 
 Matrix Factorization (SVD++) 
 Restricted Boltzmann Machines 
(RBM) 
 Ensemble: Linear blend 
Videos 
R 
≈ 
Users 
U 
V 
(99% Sparse) d 
Videos 
Users 
× d
17 
Ranking by ratings 
4.7 4.6 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5 
Niche titles 
High average ratings… by those who would watch it
18 
RMSE
19 
Learning to Rank 
 Approaches: 
 Point-wise: Loss over items 
(Classification, ordinal regression, MF, …) 
 Pair-wise: Loss over preferences 
(RankSVM, RankNet, BPR, …) 
 List-wise: (Smoothed) loss over ranking 
(LambdaMART, DirectRank, GAPfm, …) 
 Ranking quality measures: 
 Recall@N, Precision@N, NDCG, MRR, 
ERR, MAP, FCP, … 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 
0 
0 20 40 60 80 100 
Importance 
Rank 
NDCG MRR FCP
20 
Example: Two features, linear model 
Popularity 
Predicted Rating 
1 
2 
3 
4 
5 
Linear Model: 
Final Ranking 
frank(u,v) = w1 p(v) + w2 r(u,v)
21 
Personalized Ranking
22 
“Learning to Row” 
Putting a page together
23 
Page-level algorithmic challenge 
10,000s of 
possible 
rows … 
10-40 
rows 
Variable number of 
possible videos per 
row (up to thousands) 
1 personalized page 
per device
24 
Balancing a Personalized Page 
Accurate vs. Diverse 
Discovery vs. Continuation 
Depth vs. Coverage 
Freshness vs. Stability 
Recommendations vs. Tasks
25 
2D Navigational Modeling 
More likely 
to see 
Less likely
26 
Building a page algorithmically 
 Approaches 
 Template: Non-personalized layout 
 Row-independent: Greedy rank rows by f(r | u, c) 
 Stage-wise: Pick next rows by f(r | u, c, p1:n) 
 Page-wise: Total page fitness f(p | u, c) 
 Obey constraints 
 Certain rows may be required (Continue Watching 
and My List) 
 Filter, de-duplicate 
 Format for device
27 
Row Features 
 Quality of items 
 Features of items 
 Quality of evidence 
 User-row interactions 
 Item/row metadata 
 Recency 
 Item-row affinity 
 Row length 
 Position on page 
 Title 
 Diversity 
 Similarity 
 Freshness 
 …
28 
Page-level Metrics 
 How do you measure the quality of 
the homepage? 
 Ease of discovery, Diversity, 
Novelty, … 
 Challenges: 
 Position effects 
 Row-video generalization 
 2D versions of ranking quality 
metrics 
 Example: Recall @ row-by-column 
0 10 20 30 
Recall 
Row
29 
Learning in the Cloud
30 
Three levels of Learning Distribution/Parallelization 
1. For each subset of the population (e.g. 
region) 
 Want independently trained and tuned models 
2. For each combination of hyperparameters 
 Simple: Grid search 
 Better: Bayesian optimization using Gaussian 
Processes 
3. For each subset of the training data 
 Distribute over machines (e.g. ADMM) 
 Multi-core parallelism (e.g. HogWild) 
 Or… use GPUs
31 
Example: Training Neural Networks 
 Level 1: Machines in different 
AWS regions 
 Level 2: Machines in same AWS 
region 
 Spearmint or MOE for parameter 
optimization 
 Condor, StarCluster, Mesos, etc. for 
coordination 
 Level 3: Highly optimized, parallel 
CUDA code on GPUs
32 
Conclusions
33 
Evolution of Recommendation Approach 
4.7 
Rating Ranking Page Generation
34 
Research Directions 
Context 
awareness 
Full-page 
optimization 
Presentation 
effects 
Scaling 
Personalized 
learning to 
rank 
Cold start
Thank You Justin Basilico 
jbasilico@netflix.com 
35 @JustinBasilico 
We’re hiring

More Related Content

What's hot

Contextualization at Netflix
Contextualization at NetflixContextualization at Netflix
Contextualization at NetflixLinas Baltrunas
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixJustin Basilico
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveJustin Basilico
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsJaya Kawale
 
Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsYves Raimond
 
Netflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 StarsNetflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 StarsXavier Amatriain
 
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020Zachary Schendel
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix ScaleJustin Basilico
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender modelsParmeshwar Khurd
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixGrace T. Huang
 
Personalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningPersonalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningAnoop Deoras
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
Interactive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyInteractive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyChris Johnson
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Sudeep Das, Ph.D.
 
Shallow and Deep Latent Models for Recommender System
Shallow and Deep Latent Models for Recommender SystemShallow and Deep Latent Models for Recommender System
Shallow and Deep Latent Models for Recommender SystemAnoop Deoras
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation SystemsTrieu Nguyen
 
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se... Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...Sudeep Das, Ph.D.
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 

What's hot (20)

Contextualization at Netflix
Contextualization at NetflixContextualization at Netflix
Contextualization at Netflix
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender Systems
 
Netflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 StarsNetflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 Stars
 
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020
RecSys 2020 A Human Perspective on Algorithmic Similarity Schendel 9-2020
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender models
 
Data council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at NetflixData council SF 2020 Building a Personalized Messaging System at Netflix
Data council SF 2020 Building a Personalized Messaging System at Netflix
 
Personalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep LearningPersonalizing "The Netflix Experience" with Deep Learning
Personalizing "The Netflix Experience" with Deep Learning
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
Interactive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyInteractive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and Spotify
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
 
Shallow and Deep Latent Models for Recommender System
Shallow and Deep Latent Models for Recommender SystemShallow and Deep Latent Models for Recommender System
Shallow and Deep Latent Models for Recommender System
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se... Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 

Similar to Learning to Personalize

MLconf NYC Justin Basilico
MLconf NYC Justin BasilicoMLconf NYC Justin Basilico
MLconf NYC Justin BasilicoMLconf
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxacmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxdongchangim30
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...MLconf
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareJustin Basilico
 
Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019Gabriel Moreira
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxJadna Almeida
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxJadna Almeida
 
Cikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueCikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueXavier Amatriain
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Telecom datascience master_public
Telecom datascience master_publicTelecom datascience master_public
Telecom datascience master_publicVincent Michel
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsQuantUniversity
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
 
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...Multimedia Communications Lab
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixXavier Amatriain
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013MLconf
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...BigMine
 
Big Data LDN 2017: Serving Predictive Models with Redis
Big Data LDN 2017: Serving Predictive Models with RedisBig Data LDN 2017: Serving Predictive Models with Redis
Big Data LDN 2017: Serving Predictive Models with RedisMatt Stubbs
 
OReilly AI Transfer Learning
OReilly AI Transfer LearningOReilly AI Transfer Learning
OReilly AI Transfer LearningDanielle Dean
 
Optimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBMOptimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBMYaman Kumar
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 

Similar to Learning to Personalize (20)

MLconf NYC Justin Basilico
MLconf NYC Justin BasilicoMLconf NYC Justin Basilico
MLconf NYC Justin Basilico
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxacmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptx
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptx
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptx
 
Cikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business ValueCikm 2013 - Beyond Data From User Information to Business Value
Cikm 2013 - Beyond Data From User Information to Business Value
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Telecom datascience master_public
Telecom datascience master_publicTelecom datascience master_public
Telecom datascience master_public
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...
Social Learning and Knowledge Sharing Technologies Lecture Slides about Socia...
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 
Big Data LDN 2017: Serving Predictive Models with Redis
Big Data LDN 2017: Serving Predictive Models with RedisBig Data LDN 2017: Serving Predictive Models with Redis
Big Data LDN 2017: Serving Predictive Models with Redis
 
OReilly AI Transfer Learning
OReilly AI Transfer LearningOReilly AI Transfer Learning
OReilly AI Transfer Learning
 
Optimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBMOptimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBM
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 

Recently uploaded

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Recently uploaded (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Learning to Personalize

  • 1. 1 Learning to Personalize Justin Basilico Page Algorithms Engineering September 19, 2014 @JustinBasilico ATL 2014
  • 2. 2  Interested in high-quality recommendations  Proxy question:  Accuracy in predicted rating  Measured by root mean squared error (RMSE)  Improve by 10% = $1 million!  Data size:  100M ratings (back then “almost massive”)
  • 3. 3 Change of focus 2006 2014
  • 4. 4 Netflix Scale  > 50M members  > 40 countries  > 1000 device types  Hours: > 1B/month  Plays: > 30M/day  Log 100B events/day  34.2% of peak US downstream traffic
  • 5. 5 Goal Help members find content to watch and enjoy to maximize member satisfaction and retention
  • 6. 6 “Emmy Winning” Approach to Recommendation
  • 7. 7 Everything is a Recommendation Rows Ranking Over 75% of what people watch comes from our recommendations Recommendations are driven by Machine Learning
  • 8. 8 Top Picks Personalization awareness Diversity
  • 9. 9 Personalized genres  Genres focused on user interest  Derived from tag combinations  Provide context and evidence  How are they generated?  Implicit: Based on recent plays, ratings & other interactions  Explicit: Taste preferences
  • 10. 10 Similarity  Recommend videos similar to one you’ve liked  “Because you watched” rows  Pivots  Video information page  In response to user actions (search, list add, …)
  • 11. 11 Support for Recommendations Behavioral Support Social Support
  • 12. 12 Learning to Recommend
  • 13. 13 Machine Learning Approach Problem Data Metrics Model Algorithm
  • 14. 14 Data  Plays  Duration, bookmark, time, device, …  Ratings  Metadata  Tags, synopsis, cast, …  Impressions  Interactions  Search, list add, scroll, …  Social
  • 15. 15 Models & Algorithms  Regression (Linear, logistic, elastic net)  SVD and other Matrix Factorizations  Factorization Machines  Restricted Boltzmann Machines  Deep Neural Networks  Markov Models and Graph Algorithms  Clustering  Latent Dirichlet Allocation  Gradient Boosted Decision Trees/Random Forests  Gaussian Processes  …
  • 16. 16 Rating Prediction  Based on first year progress prize  Top 2 algorithms  Matrix Factorization (SVD++)  Restricted Boltzmann Machines (RBM)  Ensemble: Linear blend Videos R ≈ Users U V (99% Sparse) d Videos Users × d
  • 17. 17 Ranking by ratings 4.7 4.6 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5 Niche titles High average ratings… by those who would watch it
  • 19. 19 Learning to Rank  Approaches:  Point-wise: Loss over items (Classification, ordinal regression, MF, …)  Pair-wise: Loss over preferences (RankSVM, RankNet, BPR, …)  List-wise: (Smoothed) loss over ranking (LambdaMART, DirectRank, GAPfm, …)  Ranking quality measures:  Recall@N, Precision@N, NDCG, MRR, ERR, MAP, FCP, … 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 20 40 60 80 100 Importance Rank NDCG MRR FCP
  • 20. 20 Example: Two features, linear model Popularity Predicted Rating 1 2 3 4 5 Linear Model: Final Ranking frank(u,v) = w1 p(v) + w2 r(u,v)
  • 22. 22 “Learning to Row” Putting a page together
  • 23. 23 Page-level algorithmic challenge 10,000s of possible rows … 10-40 rows Variable number of possible videos per row (up to thousands) 1 personalized page per device
  • 24. 24 Balancing a Personalized Page Accurate vs. Diverse Discovery vs. Continuation Depth vs. Coverage Freshness vs. Stability Recommendations vs. Tasks
  • 25. 25 2D Navigational Modeling More likely to see Less likely
  • 26. 26 Building a page algorithmically  Approaches  Template: Non-personalized layout  Row-independent: Greedy rank rows by f(r | u, c)  Stage-wise: Pick next rows by f(r | u, c, p1:n)  Page-wise: Total page fitness f(p | u, c)  Obey constraints  Certain rows may be required (Continue Watching and My List)  Filter, de-duplicate  Format for device
  • 27. 27 Row Features  Quality of items  Features of items  Quality of evidence  User-row interactions  Item/row metadata  Recency  Item-row affinity  Row length  Position on page  Title  Diversity  Similarity  Freshness  …
  • 28. 28 Page-level Metrics  How do you measure the quality of the homepage?  Ease of discovery, Diversity, Novelty, …  Challenges:  Position effects  Row-video generalization  2D versions of ranking quality metrics  Example: Recall @ row-by-column 0 10 20 30 Recall Row
  • 29. 29 Learning in the Cloud
  • 30. 30 Three levels of Learning Distribution/Parallelization 1. For each subset of the population (e.g. region)  Want independently trained and tuned models 2. For each combination of hyperparameters  Simple: Grid search  Better: Bayesian optimization using Gaussian Processes 3. For each subset of the training data  Distribute over machines (e.g. ADMM)  Multi-core parallelism (e.g. HogWild)  Or… use GPUs
  • 31. 31 Example: Training Neural Networks  Level 1: Machines in different AWS regions  Level 2: Machines in same AWS region  Spearmint or MOE for parameter optimization  Condor, StarCluster, Mesos, etc. for coordination  Level 3: Highly optimized, parallel CUDA code on GPUs
  • 33. 33 Evolution of Recommendation Approach 4.7 Rating Ranking Page Generation
  • 34. 34 Research Directions Context awareness Full-page optimization Presentation effects Scaling Personalized learning to rank Cold start
  • 35. Thank You Justin Basilico jbasilico@netflix.com 35 @JustinBasilico We’re hiring