1
Learning a Personalized Homepage
Justin Basilico
Page Algorithms Engineering
April 11, 2014
NYC 2014
@JustinBasilico
2
3
Change of focus
2006 → 2014
4
Netflix Scale
 > 44M members
 > 40 countries
 > 1000 device types
 > 5B hours in Q3 2013
 Plays: > 30M/day
 Log 100B events/day
 31.62% of peak US
downstream traffic
5
Approach to Recommendation
“Emmy Winning”
6
Goal
Help members find content to watch and enjoy
to maximize member satisfaction and retention
7
Everything is a Recommendation
Rows
Ranking
Over 75% of what people watch comes from our recommendations
Recommendations are driven by Machine Learning
8
Top Picks
Personalization awareness
Diversity
9
Personalized genres
 Genres focused on user interest
 Derived from tag combinations
 Provide context and evidence
 How are they generated?
 Implicit: Based on recent plays, ratings & other interactions
 Explicit: Taste preferences
 Hybrid: Combine the above (see the sketch after this list)
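As a rough illustration of the hybrid case, here is a minimal Python sketch that blends an implicit signal (share of recent plays carrying the genre) with an explicit taste preference. The genre_affinity function, the alpha weight, and the user fields are all hypothetical assumptions, not the production logic.

def genre_affinity(user, genre, alpha=0.7):
    """Blend an implicit signal (recent plays in the genre) with an explicit
    one (stated taste preferences). alpha and both signals are illustrative."""
    implicit = sum(1.0 for play in user["recent_plays"] if genre in play["tags"])
    implicit /= max(len(user["recent_plays"]), 1)   # fraction of recent plays in the genre
    explicit = 1.0 if genre in user["taste_preferences"] else 0.0
    return alpha * implicit + (1 - alpha) * explicit

# Toy usage:
user = {"recent_plays": [{"tags": {"Thrillers"}}, {"tags": {"Comedies"}}],
        "taste_preferences": {"Thrillers"}}
print(genre_affinity(user, "Thrillers"))  # ≈ 0.65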
10
Similarity
 Find something similar to something you’ve liked
 “Because you watched” rows
 Also:
 Video display page
 In response to user actions (search, list add, …)
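A minimal sketch of one common way to compute “similar to something you’ve liked”: nearest neighbors by cosine similarity in a latent-factor space. The most_similar function and the random factors are illustrative assumptions; the deck does not specify how similarity is actually computed.

import numpy as np

def most_similar(video_id, factors, k=5):
    """Nearest neighbors of one video by cosine similarity of latent factors
    (the factors could come from a matrix factorization like the one later
    in the deck); purely illustrative."""
    v = factors[video_id]
    sims = factors @ v / (np.linalg.norm(factors, axis=1) * np.linalg.norm(v) + 1e-12)
    sims[video_id] = -np.inf            # exclude the query video itself
    return np.argsort(-sims)[:k]        # indices of the k most similar videos

# Toy usage: 100 videos with 20 random latent dimensions.
factors = np.random.default_rng(0).normal(size=(100, 20))
print(most_similar(42, factors, k=3))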
11
Support for Recommendations
Social Support
12
Learning to Recommend
13
Machine Learning Approach
Problem
Data
Model
Algorithm
Metrics
14
Data
 Plays
 Duration, bookmark, time, device, …
 Ratings
 Metadata
 Tags, synopsis, cast, …
 Impressions
 Interactions
 Search, list add, scroll, …
 Social
15
Models & Algorithms
 Regression (Linear, logistic, elastic net)
 SVD and other Matrix Factorizations
 Factorization Machines
 Restricted Boltzmann Machines
 Deep Neural Networks
 Markov Models and Graph Algorithms
 Clustering
 Latent Dirichlet Allocation
 Gradient Boosted Decision Trees / Random Forests
 Gaussian Processes
 …
16
Offline/Online testing process
Offline testing (days) → [success] → Online A/B testing (weeks to months) → [success] → Rollout feature to all users
[fail] loops back to offline testing
17
Rating Prediction
 First Netflix Prize progress prize
 Top 2 algorithms
 Matrix Factorization (SVD++)
 Restricted Boltzmann Machines (RBM)
 Ensemble: Linear blend
R ≈ U Vᵀ, where R is the Users × Videos rating matrix (99% sparse), U is Users × d, and V is Videos × d
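For concreteness, a minimal sketch of the R ≈ U Vᵀ factorization trained with SGD on the observed ratings. This is plain regularized matrix factorization, not the SVD++ variant named above, and every name and hyperparameter here is an illustrative assumption.

import numpy as np

def train_mf(ratings, n_users, n_videos, d=20, lr=0.01, reg=0.05, epochs=20):
    """Approximate the sparse rating matrix R by U @ V.T using SGD on the
    observed (user, video, rating) triples."""
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_users, d))
    V = rng.normal(scale=0.1, size=(n_videos, d))
    for _ in range(epochs):
        for u, v, r in ratings:
            pu, qv = U[u].copy(), V[v].copy()
            err = r - pu @ qv                    # prediction error on this rating
            U[u] += lr * (err * qv - reg * pu)   # gradient step with L2 regularization
            V[v] += lr * (err * pu - reg * qv)
    return U, V

# Toy usage: 3 users, 4 videos, a handful of observed ratings (0-indexed ids).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 2, 2.0), (2, 3, 5.0)]
U, V = train_mf(ratings, n_users=3, n_videos=4, d=2)
print(np.round(U @ V.T, 2))   # predicted rating for every (user, video) pair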
18
Ranking by ratings
Top titles by average rating: 4.7, 4.6, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5
Niche titles: high average ratings… by those who would watch it
19
RMSE
20
Learning to Rank
 Approaches:
 Point-wise: Loss over items (Classification, ordinal regression, MF, …)
 Pair-wise: Loss over preferences (RankSVM, RankNet, BPR, …)
 List-wise: (Smoothed) loss over ranking (LambdaMART, DirectRank, GAPfm, …)
 Ranking quality measures:
 NDCG, MRR, ERR, MAP, FCP, Precision@N, Recall@N, …
[Plot: importance of each rank position (0–1) over ranks 0–100 under NDCG, MRR, and FCP]
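To make the position-importance idea concrete, here is a small sketch of NDCG@N, whose logarithmic discount is what concentrates importance at the top ranks in the plot above. The function names and the toy relevance values are assumptions for illustration.

import numpy as np

def dcg_at_n(relevances, n):
    """Discounted cumulative gain: top positions dominate via the log discount."""
    rel = np.asarray(relevances, dtype=float)[:n]
    discounts = np.log2(np.arange(2, rel.size + 2))   # positions 1..n -> log2(2..n+1)
    return float(np.sum((2 ** rel - 1) / discounts))

def ndcg_at_n(relevances, n=10):
    """NDCG@N: DCG of the ranking divided by the DCG of the ideal ordering."""
    idcg = dcg_at_n(sorted(relevances, reverse=True), n)
    return dcg_at_n(relevances, n) / idcg if idcg > 0 else 0.0

# Relevance of the items in the order the ranker placed them:
print(ndcg_at_n([3, 2, 3, 0, 1, 2], n=6))  # ≈ 0.95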
21
Example: Two features, linear model
[Plot: videos placed by Popularity (x-axis) vs. Predicted Rating (y-axis, 1–5), with the final ranking induced by the model]
Linear Model: f_rank(u,v) = w1 · p(v) + w2 · r(u,v)
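A tiny sketch of the two-feature linear ranker above: score each video by w1·p(v) + w2·r(u,v) and sort. The weights and the toy videos are made up for illustration, not values from the deck.

def rank_videos(videos, w1=0.2, w2=0.8):
    """Score each video with f_rank(u, v) = w1 * p(v) + w2 * r(u, v) and sort
    descending. p is a normalized popularity, r a predicted rating."""
    scored = [(w1 * v["popularity"] + w2 * v["predicted_rating"], v["id"])
              for v in videos]
    return [vid for _, vid in sorted(scored, reverse=True)]

# Toy usage:
videos = [
    {"id": "A", "popularity": 0.9, "predicted_rating": 3.1},
    {"id": "B", "popularity": 0.2, "predicted_rating": 4.7},
    {"id": "C", "popularity": 0.6, "predicted_rating": 4.0},
]
print(rank_videos(videos))  # ['B', 'C', 'A']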
22
Ranking
23
Putting it together
“Learning to Row”
24
Page-level algorithmic challenge
10,000s of possible rows → 10–40 rows on the page
Variable number of possible videos per row (up to thousands)
1 personalized page per device
25
Balancing a Personalized Page
Accurate vs. Diverse
Discovery vs. Continuation
Depth vs. Coverage
Freshness vs. Stability
Recommendations vs. Tasks
26
2D Navigational Modeling
[Diagram: grid of page positions, shaded from “more likely to see” to “less likely”]
27
Row Lifecycle
Select Candidates → Select Evidence → Rank → Filter → Format → Choose
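A toy walk-through of the lifecycle stages as a Python pipeline. Every rule, field name, and value in it is an illustrative stand-in rather than the actual row-generation logic; the final “Choose” step (whether the row makes the page) is deferred to page construction.

def build_row(row_name, catalog, user, row_length=10):
    """Toy walk through the row lifecycle above; all fields are illustrative."""
    # Select candidates: titles matching the row's theme.
    candidates = [v for v in catalog if row_name in v["tags"]]
    # Select evidence: why this row is being shown to this member.
    evidence = f"Because you watched {user['last_play']}"
    # Rank: order candidates by a per-user score (here, a predicted rating).
    ranked = sorted(candidates, key=lambda v: v["predicted_rating"], reverse=True)
    # Filter: drop titles the member has already seen.
    kept = [v for v in ranked if v["id"] not in user["seen"]]
    # Format: trim to the device's row length.
    return {"name": row_name, "evidence": evidence, "videos": kept[:row_length]}

# Toy usage:
catalog = [
    {"id": 1, "tags": {"Thrillers"}, "predicted_rating": 4.2},
    {"id": 2, "tags": {"Thrillers"}, "predicted_rating": 3.8},
    {"id": 3, "tags": {"Comedies"}, "predicted_rating": 4.6},
]
user = {"last_play": "Se7en", "seen": {2}}
print(build_row("Thrillers", catalog, user))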
28
Building a page algorithmically
 Approaches
 Template: Non-personalized layout
 Row-independent: Greedily rank rows by f(r | u, c)
 Stage-wise: Pick the next row by f(r | u, c, p_1:n) (see the sketch after this list)
 Page-wise: Total page fitness f(p | u, c)
 Obey constraints per device
 Certain rows may be required
 Examples: Continue watching and My List
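A rough sketch, as referenced in the list above, contrasting the row-independent and stage-wise approaches, where the stage-wise scorer penalizes similarity to rows already placed. The score and similarity callables, the diversity weight, and the toy rows are all hypothetical.

def greedy_page(rows, score, n_rows=20):
    """Row-independent: rank rows by f(r | u, c) and take the top n."""
    return sorted(rows, key=score, reverse=True)[:n_rows]

def stagewise_page(rows, score, similarity, n_rows=20, div_weight=0.5):
    """Stage-wise: pick the next row by f(r | u, c, p_1:n), here a base score
    minus a penalty for similarity to rows already on the page."""
    page, remaining = [], list(rows)
    while remaining and len(page) < n_rows:
        best = max(
            remaining,
            key=lambda r: score(r) - div_weight * max(
                (similarity(r, p) for p in page), default=0.0),
        )
        page.append(best)
        remaining.remove(best)
    return page

# Toy usage: rows are (name, base_score, genre) tuples.
rows = [("Thrillers", 0.9, "thriller"), ("Dark Thrillers", 0.85, "thriller"),
        ("Comedies", 0.7, "comedy")]
score = lambda r: r[1]
similarity = lambda a, b: 1.0 if a[2] == b[2] else 0.0
print([r[0] for r in stagewise_page(rows, score, similarity, n_rows=3)])
# ['Thrillers', 'Comedies', 'Dark Thrillers'] -- the diversity penalty reorders the page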
29
Row Features
 Quality of items
 Features of items
 Quality of evidence
 User-row interactions
 Item/row metadata
 Recency
 Item-row affinity
 Row length
 Position on page
 Context
 Title
 Diversity
 Freshness
 …
30
Page-level Metrics
 How do you measure the quality of the homepage?
 Ease of discovery
 Diversity
 Novelty
 …
 Challenges:
 Position effects
 Row-video generalization
 2D versions of ranking quality metrics
 Example: Recall @ row-by-column
[Plot: recall as a function of row position (rows 0–30)]
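One plausible reading of “Recall @ row-by-column”, sketched in Python: the fraction of a member’s plays that appear within the first max_row rows and first max_col columns of the page. This is an illustrative definition for the plot above, not necessarily the exact metric used.

def recall_at_row_col(page, played, max_row, max_col):
    """page is a list of rows, each a list of video ids; played is the set of
    video ids the member actually watched."""
    if not played:
        return 0.0
    visible = {vid for row in page[:max_row] for vid in row[:max_col]}
    return len(played & visible) / len(played)

# Toy usage: a 3x3 page; one of the two played titles is in the top 2 rows.
page = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"]]
print(recall_at_row_col(page, played={"B", "G"}, max_row=2, max_col=3))  # 0.5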
31
Conclusions
32
Evolution of Recommendation Approach
Rating → Ranking → Page Generation
33
Research Directions
Context awareness
Full-page optimization
Presentation effects
Social recommendation
Personalized learning to rank
Cold start
34
Thank You
Justin Basilico
jbasilico@netflix.com
@JustinBasilico
We’re hiring


Editor's Notes

  • #5 Sources: 2013 2H Sandvine report: https://www.sandvine.com/downloads/general/global-internet-phenomena/2013/2h-2013-global-internet-phenomena-snapshot-na-fixed.pdf ; 100B events/day from http://www.slideshare.net/adrianco/netflix-nosql-search ; http://www.businessweek.com/articles/2013-05-09/netflix-reed-hastings-survive-missteps-to-join-silicon-valleys-elite#p5
  • #32 TODO: Transition here?
  • #35 Job posting: http://jobs.netflix.com/jobs.php?id=NFX01267