Big & Personal: the data and the models behind Netflix recommendations
Outline
1. The Netflix Prize & the Recommendation Problem
2. Anatomy of Netflix Personalization
3. Data & Models
4. More data or better Models?
What we were interested in:
■ High quality recommendations
Proxy question:
■ Accuracy in predicted rating
■ Improve by 10% = $1 million!
● Top 2 algorithms still in production
Results
SVD
RBM
What about the final prize ensembles?
■ Our offline studies showed they were too computationally intensive to scale
■ Expected improvement not worth the engineering effort
■ Plus… focus had already shifted to other issues that had more impact than rating prediction.
Change of focus
2006 2013
Anatomy of Netflix Personalization
Everything is a Recommendation
Everything is personalized
Note: Recommendations are per household, not individual user
Ranking
Top 10
Personalization awareness
Diversity
[Figure labels: Dad, All, Son, Daughter, Dad & Mom, Mom, All, Daughter, Mom, All?]
Support for Recommendations
Social Support
Social Recommendations
Genre rows
■ Personalized genre rows focus on user interest
■ Also provide context and “evidence”
■ Important for member satisfaction: moving personalized rows to top on devices increased retention
■ How are they generated?
■ Implicit: based on user’s recent plays, ratings, & other interactions
■ Explicit taste preferences
■ Hybrid: combine the above
■ Also take into account:
■ Freshness: has this been shown before?
■ Diversity: avoid repeating tags and genres, limit number of TV genres, etc.
Genres - personalization
■ Displayed in many different contexts
■ In response to user actions/context (search, queue add…)
■ More like… rows
Similars
Data & Models
Big Data @Netflix
■ Almost 40M subscribers
■ Ratings: 4M/day
■ Searches: 3M/day
■ Plays: 30M/day
■ 2B hours streamed in Q4 2011
■ 1B hours in June 2012
■ > 4B hours in Q1 2013
Member Behavior
Geo-information
Time
Impressions
Device Info
Metadata
Social
Smart Models
■ Logistic/linear regression
■ Elastic nets
■ SVD and other MF models
■ Factorization Machines
■ Restricted Boltzmann Machines
■ Markov Chains
■ Different clustering approaches
■ LDA
■ Association Rules
■ Gradient Boosted Decision Trees/Random Forests
■ …
SVD
$X_{m \times n} = U_{m \times r}\, S_{r \times r}\, (V_{n \times r})^{T}$
■ X: m x n matrix (e.g., m users, n videos)
■ U: m x r matrix (m users, r factors)
■ S: r x r diagonal matrix (strength of each ‘factor’) (r: rank of the matrix)
■ V: n x r matrix (n videos, r factors), so its transpose is r x n
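The shapes in the decomposition can be checked on a toy matrix. The following is a minimal sketch, assuming NumPy is available; the 4 x 3 ratings matrix and the helper name `truncate` are made up for illustration and are not from the talk.

```python
import numpy as np

# Toy ratings matrix X: m = 4 users (rows), n = 3 videos (columns). Values are invented.
X = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Thin SVD: U is m x k, s holds the k singular values (the diagonal of S),
# Vt is k x n, with k = min(m, n).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def truncate(U, s, Vt, r):
    """Keep the r strongest factors and rebuild the rank-r approximation of X."""
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

X_full = truncate(U, s, Vt, len(s))   # all factors: reconstructs X
X_rank2 = truncate(U, s, Vt, 2)       # two factors: low-rank approximation
```

Dropping the weakest factor leaves an approximation whose Frobenius error equals the discarded singular value, which is why the diagonal of S is read as the strength of each factor.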
SVD for Rating Prediction
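■ User factor vectors $p_u$ and item factor vectors $q_i$
■ Baseline (bias) $b_{ui} = \mu + b_u + b_i$ (user & item deviation from average)
■ Predict rating as $\hat{r}_{ui} = b_{ui} + p_u^{T} q_i$
■ SVD++ (Koren et al.): asymmetric variation with implicit feedback
$\hat{r}_{ui} = b_{ui} + q_i^{T} \left( |R(u)|^{-1/2} \sum_{j \in R(u)} (r_{uj} - b_{uj})\, x_j + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)$
■ Where
■ $q_i$, $x_j$, $y_j$ are three item factor vectors
■ Users are not parametrized, but rather represented by:
■ R(u): items rated by user u
■ N(u): items for which the user has given implicit preference (e.g. rated vs. not rated)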
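The prediction rule can be sketched in a few lines. This is an illustration under stated assumptions, not the production model: the baseline is taken as global mean plus user and item deviations, the interaction as a dot product of factor vectors, and the asymmetric user vector follows Koren's published formulation as recalled here. All numbers, item names, and function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def predict_rating(mu, b_u, b_i, p_u, q_i):
    """Baseline (global mean + user and item deviations) plus factor interaction."""
    return mu + b_u + b_i + dot(p_u, q_i)

def asymmetric_user_vector(rated, implicit, x, y, baseline, n_factors):
    """Build a user vector from items instead of learning one per user.

    rated:    {item: rating} for items in R(u)
    implicit: set of items in N(u)
    x, y:     {item: factor vector}, two of the three item factor sets
    baseline: {item: baseline estimate b_uj for this user}
    """
    vec = [0.0] * n_factors
    if rated:
        scale = len(rated) ** -0.5
        for j, r_uj in rated.items():
            for f in range(n_factors):
                vec[f] += scale * (r_uj - baseline[j]) * x[j][f]
    if implicit:
        scale = len(implicit) ** -0.5
        for j in implicit:
            for f in range(n_factors):
                vec[f] += scale * y[j][f]
    return vec

# Hypothetical values for one user and one video.
r_hat = predict_rating(mu=3.6, b_u=-0.25, b_i=0.5, p_u=[0.5, -1.0], q_i=[1.0, 0.25])

# Hypothetical user with two ratings and four implicit signals.
rated = {"a": 5.0, "b": 3.0}
baseline = {"a": 4.0, "b": 4.0}
x = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
implicit = {"a", "b", "c", "d"}
y = {item: [0.5, 0.5] for item in implicit}
p_u = asymmetric_user_vector(rated, implicit, x, y, baseline, n_factors=2)
```

In the asymmetric variant the resulting `p_u` replaces the learned user vector in `predict_rating`, so a new user can be scored as soon as some ratings or implicit signals exist.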
Simon Funk’s SVD
■ One of the most interesting findings during the Netflix Prize came out of a blog post
■ Incremental, iterative, and approximate way to compute the SVD using gradient descent
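The gradient-descent idea can be sketched as stochastic updates over the observed ratings only. This is a simplified illustration, assuming all factors are trained jointly with L2 regularization; the learning rate, regularization, toy ratings, and the names `funk_svd` and `rmse` are assumptions, not values from the blog post or the talk.

```python
import random

def funk_svd(ratings, n_users, n_items, n_factors=2, lr=0.01, reg=0.02,
             epochs=200, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Error on one observed rating; missing entries are never touched.
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    total = 0.0
    for u, i, r in ratings:
        total += (r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))) ** 2
    return (total / len(ratings)) ** 0.5

# Toy data: (user, item, rating) triples, invented for illustration.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 1, 5),
           (2, 2, 5), (2, 0, 1), (3, 2, 4), (3, 1, 1)]

P0, Q0 = funk_svd(ratings, n_users=4, n_items=3, epochs=0)   # untrained factors
P, Q = funk_svd(ratings, n_users=4, n_items=3, epochs=200)   # trained factors
rmse_before = rmse(ratings, P0, Q0)
rmse_after = rmse(ratings, P, Q)
```

Because each update touches only one user row and one item row, the method scales with the number of observed ratings rather than with the full user-by-video matrix.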
Restricted Boltzmann Machines
■ Restrict the connectivity in ANN to make learning easier.
■ Only one layer of hidden units.
■ Although multiple layers are possible
■ No connections between hidden units.
■ Hidden units are independent given the visible states.
■ RBMs can be stacked to form Deep Belief Networks (DBN), the 4th generation of ANNs
[Figure: RBM diagram with a hidden layer and a visible layer; units indexed i and j]
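Because there are no hidden-to-hidden connections, the activation probability of each hidden unit can be computed separately once the visible states are fixed. The sketch below assumes binary units and a sigmoid activation; the weights, biases, and function names are illustrative and not taken from the Netflix Prize model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j.

    v: visible states (list of 0/1), W[i][j]: weight between visible i and hidden j,
    c[j]: hidden bias. Each probability depends on v only, not on other hidden units.
    """
    return [
        sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(c))
    ]

def sample_hidden(v, W, c, rng):
    """Draw each hidden unit independently from its conditional probability."""
    return [1 if rng.random() < p else 0 for p in hidden_probs(v, W, c)]

# Hypothetical RBM with 3 visible and 2 hidden units.
v = [1, 0, 1]
W = [[1.0, -1.0],
     [0.5, 0.5],
     [-1.0, 2.0]]
c = [0.0, 0.0]

probs = hidden_probs(v, W, c)
h = sample_hidden(v, W, c, random.Random(0))
```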
RBM for the Netflix Prize
Ranking: key algorithm, sorts titles in most contexts
Ranking
■ Ranking = Scoring + Sorting + Filtering bags of movies for presentation to a user
■ Goal: Find the best possible ordering of a set of videos for a user within a specific context in real-time
■ Objective: maximize consumption
■ Aspirations: Played & “enjoyed” titles have best score
■ Akin to CTR forecast for ads/search results
■ Factors
■ Accuracy
■ Novelty
■ Diversity
■ Freshness
■ Scalability
■ …
Example: Two features, linear model
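A two-feature linear ranker can be sketched as scoring, filtering, and sorting. The transcript does not name the two features, so `popularity` and `predicted_rating` are assumptions here, as are the weights, the catalog, and the function names.

```python
def score(video, w_pop, w_pred, bias=0.0):
    """Linear model over two features (feature names are assumed)."""
    return bias + w_pop * video["popularity"] + w_pred * video["predicted_rating"]

def rank(videos, w_pop, w_pred, already_seen=frozenset()):
    """Ranking = scoring + sorting + filtering."""
    candidates = [v for v in videos if v["title"] not in already_seen]  # filtering
    return sorted(candidates, key=lambda v: score(v, w_pop, w_pred), reverse=True)

# Hypothetical catalog for one user.
videos = [
    {"title": "A", "popularity": 0.9, "predicted_rating": 3.0},
    {"title": "B", "popularity": 0.2, "predicted_rating": 4.8},
    {"title": "C", "popularity": 0.5, "predicted_rating": 4.0},
    {"title": "D", "popularity": 0.95, "predicted_rating": 2.0},
]

# Weighting predicted rating more heavily favors niche but well-matched titles.
personal = [v["title"] for v in rank(videos, w_pop=1.0, w_pred=0.5, already_seen={"D"})]

# Weighting popularity more heavily favors broadly watched titles.
popular = [v["title"] for v in rank(videos, w_pop=4.0, w_pred=0.5, already_seen={"D"})]
```

The two weight settings produce different orderings of the same candidates, which is the trade-off a learned ranking model resolves from data instead of by hand.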
Ranking
[Figure labels: Novelty, Diversity, Freshness, Accuracy, Scalability]
Learning to rank
■ Machine learning problem: goal is to construct ranking model from training data
■ Training data can have partial order or binary judgments (relevant/not relevant).
■ Resulting order of the items typically induced from a numerical score
■ Learning to rank is a key element for personalization
■ You can treat the problem as a standard supervised classification problem
Learning to Rank Approaches
1. Pointwise
■ Ranking function minimizes a loss function defined on individual relevance judgments
■ Ranking score based on regression or classification
■ Ordinal regression, Logistic regression, SVM, GBDT, …
2. Pairwise
■ Loss function is defined on pair-wise preferences
■ Goal: minimize number of inversions in ranking
■ Ranking problem is then transformed into a binary classification problem
■ RankSVM, RankBoost, RankNet, FRank…
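The pairwise view can be made concrete by enumerating preference pairs, counting inversions, and applying a logistic loss to score differences. This is a generic sketch in the spirit of RankNet-style losses, not an implementation of any of the named algorithms; the function names and toy data are assumptions.

```python
import math
from itertools import combinations

def preference_pairs(relevance):
    """All (preferred, other) index pairs implied by the relevance labels."""
    pairs = []
    for i, j in combinations(range(len(relevance)), 2):
        if relevance[i] > relevance[j]:
            pairs.append((i, j))
        elif relevance[j] > relevance[i]:
            pairs.append((j, i))
    return pairs

def count_inversions(scores, relevance):
    """Pairs where the model fails to score the preferred item strictly higher."""
    return sum(1 for hi, lo in preference_pairs(relevance) if scores[hi] <= scores[lo])

def pairwise_logistic_loss(scores, relevance):
    """Binary classification on pairs: log(1 + exp(-(s_hi - s_lo))), averaged."""
    pairs = preference_pairs(relevance)
    if not pairs:
        return 0.0
    total = sum(math.log1p(math.exp(-(scores[hi] - scores[lo]))) for hi, lo in pairs)
    return total / len(pairs)

# Toy example: items 0 and 2 are relevant, item 1 is not.
relevance = [1, 0, 1]
bad_scores = [2.0, 1.0, 0.5]    # item 2 is scored below the irrelevant item 1
good_scores = [3.0, 1.0, 2.0]   # both relevant items are scored above item 1

bad_inversions = count_inversions(bad_scores, relevance)
good_inversions = count_inversions(good_scores, relevance)
bad_loss = pairwise_logistic_loss(bad_scores, relevance)
good_loss = pairwise_logistic_loss(good_scores, relevance)
```

The inversion count is the quantity the pairwise goal refers to, while the logistic loss is a smooth surrogate that gradient-based learners can minimize.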
Learning to rank - metrics
■ Quality of ranking measured using metrics such as
■ Normalized Discounted Cumulative Gain (NDCG)
■ Mean Reciprocal Rank (MRR)
■ Fraction of Concordant Pairs (FCP)
■ Others…
■ But, it is hard to optimize machine-learned models directly on these measures (they are not differentiable)
■ Recent research on models that directly optimize ranking measures
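Two of these metrics are short enough to sketch directly. The version below assumes the linear-gain form of DCG (relevance divided by log2 of rank + 1); other definitions use an exponential gain. Inputs are relevance labels listed in the order the model ranked the items, and the function names are illustrative.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def reciprocal_rank(relevances):
    """1 / position of the first relevant item, or 0 if there is none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(ranked_lists):
    """Average reciprocal rank over several ranked lists (e.g. one per user)."""
    return sum(reciprocal_rank(r) for r in ranked_lists) / len(ranked_lists)

perfect = ndcg([3, 2, 1])      # already in ideal order
reversed_ = ndcg([1, 2, 3])    # worst ordering of the same labels
mrr = mean_reciprocal_rank([[1, 0], [0, 1]])
```

Both metrics depend on the scores only through the sort order, so a small change in a score either leaves them unchanged or makes them jump, which is why they cannot be optimized directly with gradients.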
Learning to Rank Approaches
3. Listwise
a. Indirect Loss Function
■ RankCosine: similarity between ranking list and ground truth as loss function
■ ListNet: KL-divergence as loss function by defining a probability distribution
■ Problem: optimization of listwise loss function may not optimize IR metrics
b. Directly optimizing IR measures (difficult since they are not differentiable)
■ Directly optimize IR measures through Genetic Programming or Simulated Annealing
■ Gradient descent on smoothed version of objective function (e.g. CLiMF at RecSys 2012 or TFMAP at SIGIR 2012)
■ SVM-MAP relaxes the MAP metric by adding it to the SVM constraints
■ AdaRank uses boosting to optimize NDCG
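The indirect listwise idea can be sketched with a ListNet-style loss: turn both the labels and the model scores into a probability distribution over which item comes first, then compare the two with KL-divergence. This is a simplified illustration assuming a softmax "top-one" distribution; the function names and toy values are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax: a distribution over which item is ranked first."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_kl_loss(scores, relevance):
    """KL(p_true || p_model) between the label-induced and score-induced distributions."""
    p_true = softmax(relevance)
    p_model = softmax(scores)
    return sum(t * math.log(t / q) for t, q in zip(p_true, p_model))

relevance = [3.0, 1.0, 0.0]
good_scores = [2.5, 1.2, 0.1]   # same ordering as the labels
bad_scores = [0.1, 1.2, 2.5]    # reversed ordering

good_loss = listnet_kl_loss(good_scores, relevance)
bad_loss = listnet_kl_loss(bad_scores, relevance)
zero_loss = listnet_kl_loss(relevance, relevance)
shifted_loss = listnet_kl_loss([r + 10.0 for r in relevance], relevance)
```

The loss is smooth in the scores, so it can be minimized by gradient descent, but a lower loss does not guarantee a better NDCG or MRR, which is the mismatch noted in the list above.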
Other research questions we are interested in
● Row selection
○ How to select and rank lists of “related” items imposing inter-group diversity, avoiding duplicates...
● Diversity
○ Can we increase diversity while preserving relevance in a way that we optimize user response?
● Similarity
○ How to compute optimal and personalized similarity between items by using different data that can range from play histories to item metadata
● Context-aware recommendations
● Mood and session intent inference
● ...
More data or better models?
More data or better models?
Really?
Anand Rajaraman: Stanford & Senior VP at Walmart Global eCommerce (former Kosmix)
Sometimes, it’s not about more data
More data or better models?
[Banko and Brill, 2001]
Norvig: “Google does not have better Algorithms, only more Data”
Many features/low-bias models
More data or better models?
Sometimes, it’s not about more data
X
More data or better models?
Data without a sound approach = noise
Conclusions
The Personalization Problem
■ The Netflix Prize simplified the recommendation problem to predicting ratings
■ But…
■ User ratings are only one of the many data inputs we have
■ Rating predictions are only part of our solution
■ Other algorithms such as ranking or similarity are very important
■ We can reformulate the recommendation problem
■ Function to optimize: probability a user chooses something and enjoys it enough to come back to the service
More data +
Better models +
More accurate metrics +
Better approaches & architectures
Lots of room for improvement!
Thanks!
Xavier Amatriain (@xamat)
xavier@netflix.com
We’re hiring!

Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

Big & Personal: the data and the models behind Netflix recommendations by Xavier Amatriain

  • 1. Big & Personal: the data and the models behind Netflix recommendations
  • 2. Outline 1. The Netflix Prize & the Recommendation Problem 2. Anatomy of Netflix Personalization 3. Data & Models 4. More data or better Models?
  • 4. What we were interested in: ■ High quality recommendations Proxy question: ■ Accuracy in predicted rating ■ Improve by 10% = $1 million! Results ● Top 2 algorithms still in production: SVD, RBM
  • 5. What about the final prize ensembles? ■ Our offline studies showed they were too computationally intensive to scale ■ Expected improvement not worth the engineering effort ■ Plus… focus had already shifted to other issues that had more impact than rating prediction.
  • 8. Everything is personalized Note: Recommendations are per household, not individual user Ranking
  • 9. Top 10 ■ Personalization awareness ■ Diversity [Row labels: Dad, All, Son, Daughter, Dad & Mom, Mom, All, Daughter, Mom, All?]
  • 12. Genre rows ■ Personalized genre rows focus on user interest ■ Also provide context and “evidence” ■ Important for member satisfaction – moving personalized rows to top on devices increased retention ■ How are they generated? ■ Implicit: based on user’s recent plays, ratings, & other interactions ■ Explicit taste preferences ■ Hybrid: combine the above ■ Also take into account: ■ Freshness - has this been shown before? ■ Diversity - avoid repeating tags and genres, limit number of TV genres, etc.
  • 14. Similars ■ Displayed in many different contexts ■ In response to user actions/context (search, queue add…) ■ More like… rows
  • 16. Big Data @Netflix ■ Almost 40M subscribers ■ Ratings: 4M/day ■ Searches: 3M/day ■ Plays: 30M/day ■ 2B hours streamed in Q4 2011 ■ 1B hours in June 2012 ■ > 4B hours in Q1 2013 ■ Member Behavior ■ Geo-information ■ Time ■ Impressions ■ Device Info ■ Metadata ■ Social
  • 17. Smart Models ■ Logistic/linear regression ■ Elastic nets ■ SVD and other MF models ■ Factorization Machines ■ Restricted Boltzmann Machines ■ Markov Chains ■ Different clustering approaches ■ LDA ■ Association Rules ■ Gradient Boosted Decision Trees/Random Forests ■ …
  • 18. SVD X[m x n] = U[m x r] S[r x r] (V[n x r])^T ■ X: m x n matrix (e.g., m users, n videos) ■ U: m x r matrix (m users, r factors) ■ S: r x r diagonal matrix (strength of each ‘factor’) (r: rank of the matrix) ■ V: n x r matrix (n videos, r factors)
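The factorization on this slide can be tried on a toy matrix. The sketch below is only an illustration: the ratings values and the choice r = 2 are made up, and it uses NumPy's dense `numpy.linalg.svd`, which assumes a fully observed matrix (real rating data is sparse, which is why the slides that follow turn to approximate methods).

```python
import numpy as np

# Toy ratings matrix (illustrative values): m = 4 users (rows), n = 3 videos (columns).
X = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Thin SVD: U is m x k, s holds the k singular values, Vt is k x n, with k = min(m, n).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the r strongest factors to get a low-rank approximation of X.
r = 2
X_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

print("singular values:", np.round(s, 3))
print("rank-%d reconstruction error: %.3f" % (r, np.linalg.norm(X - X_r)))
```

The rows of `U[:, :r]` play the role of user factors and the columns of `Vt[:r, :]` the role of video factors; the discarded singular values account for the reconstruction error.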
  • 19. SVD for Rating Prediction ■ User factor vectors and item factor vectors ■ Baseline (bias): user & item deviation from average ■ Predict rating as the baseline plus the inner product of the user and item factor vectors ■ SVD++ (Koren et al.): asymmetric variation with implicit feedback ■ Each item has three item factor vectors ■ Users are not parametrized, but rather represented by: ■ R(u): items rated by user u ■ N(u): items for which the user has given implicit preference (e.g. rated vs. not rated)
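The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the block below writes them in the notation commonly used in Koren's published formulation; the symbols \mu, b_u, b_v, p_u, q_v, x_j and y_j are taken from that formulation and are an assumption here, not recovered from the slide itself.

```latex
\begin{align}
  % Baseline: global average plus user and item deviations
  b_{uv} &= \mu + b_u + b_v \\
  % Basic factor model: baseline plus inner product of user and item factors
  \hat{r}_{uv} &= b_{uv} + p_u^{T} q_v \\
  % Asymmetric variation: the user is represented through the items in R(u) and N(u)
  \hat{r}_{uv} &= b_{uv} + q_v^{T} \Big( |R(u)|^{-1/2} \sum_{j \in R(u)} (r_{uj} - b_{uj})\, x_j
                  + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \Big)
\end{align}
```

In the last line q_v, x_j and y_j are the three item factor vectors, and no per-user vector p_u appears, which matches the slide's remark that users are not parametrized.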
  • 20. Simon Funk’s SVD ■ One of the most interesting findings during the Netflix Prize came out of a blog post ■ Incremental, iterative, and approximate way to compute the SVD using gradient descent
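A minimal sketch of the gradient-descent idea follows. It is not Funk's original code: it updates all factors jointly rather than training one factor at a time, and the function names, learning rate, regularization, and toy ratings are all assumptions chosen for illustration.

```python
import numpy as np

def funk_sgd(ratings, n_users, n_items, n_factors=2,
             lr=0.02, reg=0.02, n_epochs=500, seed=0):
    """Fit factors P, Q so that P[u] @ Q[i] approximates each observed rating."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, n_factors))
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error on one observed rating
            p_old = P[u].copy()            # keep the old value for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings])))

# Only observed ratings are used; missing entries are simply skipped.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 1.0), (2, 2, 5.0), (3, 0, 2.0), (3, 2, 4.0)]

P0, Q0 = funk_sgd(ratings, n_users=4, n_items=3, n_epochs=0)  # untrained factors
P, Q = funk_sgd(ratings, n_users=4, n_items=3)
print("RMSE before training: %.3f" % rmse(ratings, P0, Q0))
print("RMSE after training:  %.3f" % rmse(ratings, P, Q))
```

Because each update touches only one observed rating, the approach never needs the full dense matrix, which is what makes it incremental and approximate.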
  • 21. Restricted Boltzmann Machines ■ Restrict the connectivity in ANN to make learning easier. ■ Only one layer of hidden units. ■ Although multiple layers are possible ■ No connections between hidden units. ■ Hidden units are independent given the visible states. ■ RBMs can be stacked to form Deep Belief Networks (DBN) – 4th generation of ANNs [Figure: a layer of hidden units connected to a layer of visible units, indexed i and j]
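The restricted connectivity described above can be sketched with a plain binary RBM trained by one step of contrastive divergence (CD-1). This is an illustrative assumption, not the model used for the Netflix Prize; the class name, learning rate, and toy data are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BinaryRBM:
    """Binary RBM: one visible layer, one hidden layer, no within-layer links."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(scale=0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # No hidden-hidden connections, so p(h | v) factorizes over hidden units.
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a batch of binary visible vectors."""
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        pv1 = self.visible_probs(h0)                           # reconstruction
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

data = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
rbm = BinaryRBM(n_visible=4, n_hidden=2)
for _ in range(100):
    rbm.cd1_step(data)
probs = rbm.hidden_probs(data)
print(np.round(probs, 2))
```

The single matrix product in `hidden_probs` is possible only because hidden units are conditionally independent given the visible states.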
  • 22. RBM for the Netflix Prize
  • 23. Ranking Key algorithm, sorts titles in most contexts
  • 24. Ranking ■ Ranking = Scoring + Sorting + Filtering bags of movies for presentation to a user ■ Goal: Find the best possible ordering of a set of videos for a user within a specific context in real-time ■ Objective: maximize consumption ■ Aspirations: Played & “enjoyed” titles have best score ■ Akin to CTR forecast for ads/search results ■ Factors ■ Accuracy ■ Novelty ■ Diversity ■ Freshness ■ Scalability ■ …
  • 25. Example: Two features, linear model
  • 26. Example: Two features, linear model
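A two-feature linear ranker following the Scoring + Sorting + Filtering decomposition could look like the sketch below. The transcript does not name the two features, so `popularity` and `predicted_rating`, the weights, and the "already watched" filter are all hypothetical choices for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    popularity: float          # hypothetical feature, scaled to [0, 1]
    predicted_rating: float    # hypothetical feature, 1-5 stars
    already_watched: bool = False

def score(c, w_pop=1.0, w_rating=0.5, bias=0.0):
    """Linear model: weighted sum of the two features (weights are assumptions)."""
    return w_pop * c.popularity + w_rating * c.predicted_rating + bias

def rank(candidates, top_k=3):
    eligible = [c for c in candidates if not c.already_watched]  # filtering
    ordered = sorted(eligible, key=score, reverse=True)          # scoring + sorting
    return [c.title for c in ordered[:top_k]]

candidates = [
    Candidate("A", popularity=0.9, predicted_rating=3.0),
    Candidate("B", popularity=0.2, predicted_rating=4.8),
    Candidate("C", popularity=0.5, predicted_rating=4.0),
    Candidate("D", popularity=0.95, predicted_rating=4.9, already_watched=True),
]
print(rank(candidates))
```

In a learned ranker the weights would be fit from data rather than set by hand; changing them trades off popular titles against titles with a high predicted rating.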
  • 30. Learning to rank ■ Machine learning problem: goal is to construct ranking model from training data ■ Training data can have partial order or binary judgments (relevant/not relevant). ■ Resulting order of the items typically induced from a numerical score ■ Learning to rank is a key element for personalization ■ You can treat the problem as a standard supervised classification problem
  • 31. Learning to Rank Approaches 1. Pointwise ■ Ranking function minimizes loss function defined on individual relevance judgment ■ Ranking score based on regression or classification ■ Ordinal regression, Logistic regression, SVM, GBDT, … 2. Pairwise ■ Loss function is defined on pairwise preferences ■ Goal: minimize number of inversions in ranking ■ Ranking problem is then transformed into a binary classification problem ■ RankSVM, RankBoost, RankNet, FRank…
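The pairwise idea can be sketched as a logistic loss on score differences, in the spirit of RankNet, together with an inversion counter. The function names and toy data are assumptions, and the quadratic loop is written for clarity rather than speed.

```python
import numpy as np

def pairwise_logistic_loss(scores, relevance):
    """Mean logistic loss over all pairs (i, j) where item i is more relevant than j."""
    scores = np.asarray(scores, dtype=float)
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if relevance[i] > relevance[j]:
                # Small when i outscores j by a wide margin, large when the order is wrong.
                losses.append(np.log1p(np.exp(-(scores[i] - scores[j]))))
    return float(np.mean(losses)) if losses else 0.0

def count_inversions(scores, relevance):
    """Number of preference pairs that the scores order incorrectly."""
    return sum(
        1
        for i in range(len(scores))
        for j in range(len(scores))
        if relevance[i] > relevance[j] and scores[i] <= scores[j]
    )

relevance = [2, 1, 0]
good_scores = [3.0, 2.0, 1.0]   # agrees with the relevance order
bad_scores = [1.0, 2.0, 3.0]    # reverses it
print(count_inversions(good_scores, relevance), pairwise_logistic_loss(good_scores, relevance))
print(count_inversions(bad_scores, relevance), pairwise_logistic_loss(bad_scores, relevance))
```

Each preference pair acts as one binary classification example (is i above j?), which is how the ranking problem becomes a classification problem.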
  • 32. Learning to rank - metrics ■ Quality of ranking measured using metrics such as ■ Normalized Discounted Cumulative Gain (NDCG) ■ Mean Reciprocal Rank (MRR) ■ Fraction of Concordant Pairs (FCP) ■ Others… ■ But, it is hard to optimize machine-learned models directly on these measures (they are not differentiable) ■ Recent research on models that directly optimize ranking measures
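Two of the listed metrics can be sketched in a few lines. The gain function 2^rel - 1 with a log2 position discount is one common NDCG variant and is an assumption here, as are the function names and toy inputs.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of graded relevances given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def mean_reciprocal_rank(ranked_lists):
    """Each list holds 0/1 relevance flags in ranked order."""
    total = 0.0
    for flags in ranked_lists:
        for pos, flag in enumerate(flags, start=1):
            if flag:
                total += 1.0 / pos   # only the first relevant item counts
                break
    return total / len(ranked_lists)

print(ndcg_at_k([3, 2, 1, 0], k=4))
print(ndcg_at_k([0, 1, 2, 3], k=4))
print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))
```

Both metrics depend on the scores only through the sorted order, so a small change in a score either leaves them unchanged or makes them jump, which is why they cannot be optimized directly by gradient descent.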
  • 33. Learning to Rank Approaches 3. Listwise a. Indirect Loss Function ■ RankCosine: similarity between ranking list and ground truth as loss function ■ ListNet: KL-divergence as loss function by defining a probability distribution ■ Problem: optimization of listwise loss function may not optimize IR metrics b. Directly optimizing IR measures (difficult since they are not differentiable) ■ Directly optimize IR measures through Genetic Programming or Simulated Annealing ■ Gradient descent on smoothed version of objective function (e.g. CLiMF at Recsys 2012 or TFMAP at SIGIR 2012) ■ SVM-MAP relaxes the MAP metric by adding it to the SVM constraints ■ AdaRank uses boosting to optimize NDCG
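As an illustration of an indirect listwise loss, the sketch below follows the ListNet idea of comparing "top-one" probability distributions. It uses cross-entropy, which differs from the KL-divergence by a term that does not depend on the model scores; the function names and toy data are assumptions.

```python
import numpy as np

def softmax(x):
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())   # subtract the max for numerical stability
    return e / e.sum()

def listnet_top1_loss(scores, relevance):
    """Cross-entropy between the top-one distributions of labels and model scores."""
    p_true = softmax(relevance)   # probability of each item being ranked first (labels)
    p_pred = softmax(scores)      # same distribution induced by the model scores
    return float(-np.sum(p_true * np.log(p_pred)))

relevance = [3.0, 2.0, 1.0]
print(listnet_top1_loss([3.0, 2.0, 1.0], relevance))  # scores agree with the labels
print(listnet_top1_loss([1.0, 2.0, 3.0], relevance))  # scores reverse the labels
```

The loss is smooth in the scores, so it can be minimized by gradient descent, but as the slide notes a lower listwise loss does not guarantee a better NDCG or MAP.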
  • 34. Other research questions we are interested in ● Row selection ○ How to select and rank lists of “related” items imposing inter-group diversity, avoiding duplicates... ● Diversity ○ Can we increase diversity while preserving relevance in a way that we optimize user response? ● Similarity ○ How to compute optimal and personalized similarity between items by using different data that can range from play histories to item metadata ● Context-aware recommendations ● Mood and session intent inference ● ...
  • 36. More data or better models? Really? Anand Rajaraman: Stanford & Senior VP at Walmart Global eCommerce (formerly Kosmix)
  • 37. Sometimes, it’s not about more data More data or better models?
  • 38. [Banko and Brill, 2001] Norvig: “Google does not have better Algorithms, only more Data” Many features/ low-bias models More data or better models?
  • 39. More data or better models? Sometimes, it’s not about more data
  • 40. X More data or better models?
  • 41. Data without a sound approach = noise
  • 43. The Personalization Problem ■ The Netflix Prize simplified the recommendation problem to predicting ratings ■ But… ■ User ratings are only one of the many data inputs we have ■ Rating predictions are only part of our solution ■ Other algorithms such as ranking or similarity are very important ■ We can reformulate the recommendation problem ■ Function to optimize: probability a user chooses something and enjoys it enough to come back to the service
  • 44. More data + Better models + More accurate metrics + Better approaches & architectures Lots of room for improvement!