MLconf NYC Justin Basilico
  • Sources:
      – 2013 2H Sandvine report: https://www.sandvine.com/downloads/general/global-internet-phenomena/2013/2h-2013-global-internet-phenomena-snapshot-na-fixed.pdf
      – 100B events/day: http://www.slideshare.net/adrianco/netflix-nosql-search
      – http://www.businessweek.com/articles/2013-05-09/netflix-reed-hastings-survive-missteps-to-join-silicon-valleys-elite#p5
  • Job posting: http://jobs.netflix.com/jobs.php?id=NFX01267

Transcript

  • 1. Learning a Personalized Homepage. Justin Basilico, Page Algorithms Engineering. MLconf NYC, April 11, 2014. @JustinBasilico
  • 2. (image-only slide)
  • 3. Change of focus: 2006 → 2014
  • 4. Netflix Scale
      – > 44M members
      – > 40 countries
      – > 1000 device types
      – > 5B hours in Q3 2013
      – Plays: > 30M/day
      – Log 100B events/day
      – 31.62% of peak US downstream traffic
  • 5. Approach to Recommendation: “Emmy Winning”
  • 6. Goal: Help members find content to watch and enjoy, to maximize member satisfaction and retention
  • 7. Everything is a Recommendation
      – Rows; Ranking
      – Over 75% of what people watch comes from our recommendations
      – Recommendations are driven by machine learning
  • 8. Top Picks: personalization awareness; diversity
  • 9. Personalized genres
      – Genres focused on user interest
      – Derived from tag combinations
      – Provide context and evidence
      – How are they generated?
          – Implicit: based on recent plays, ratings & other interactions
          – Explicit: taste preferences
          – Hybrid: combine the above
  • 10. Similarity
      – Find something similar to something you’ve liked
      – “Because you watched” rows
      – Also: video display page; in response to user actions (search, list add, …)
  • 11. Support for Recommendations: social support
  • 12. Learning to Recommend
  • 13. Machine Learning Approach: Problem → Data → Model → Algorithm → Metrics
  • 14. Data
      – Plays: duration, bookmark, time, device, …
      – Ratings
      – Metadata: tags, synopsis, cast, …
      – Impressions
      – Interactions: search, list add, scroll, …
      – Social
  • 15. Models & Algorithms
      – Regression (linear, logistic, elastic net)
      – SVD and other matrix factorizations
      – Factorization Machines
      – Restricted Boltzmann Machines
      – Deep neural networks
      – Markov models and graph algorithms
      – Clustering
      – Latent Dirichlet Allocation
      – Gradient boosted decision trees / random forests
      – Gaussian processes
      – …
  • 16. Offline/Online testing process
      – Offline testing (days) → [success] → Online A/B testing (weeks to months) → [success] → Roll out feature to all users
      – [fail] at either stage stops the rollout
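The online A/B stage ultimately compares a metric such as retention between a control cell and a treatment cell. As a rough sketch of that comparison (the function name, cell sizes, and retention numbers below are all illustrative, not Netflix data), a two-proportion z-test is one standard way to check whether an observed difference is larger than noise:

```python
from math import sqrt

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """z-statistic for the difference between two proportions
    (e.g. retention in a control cell vs. a treatment cell)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Made-up numbers: 44.0% vs. 45.6% retention in two 10k-member cells.
z = two_proportion_ztest(4400, 10_000, 4560, 10_000)  # |z| > 1.96 ~ significant at 5%
```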
  • 17. Rating Prediction
      – First progress prize
      – Top 2 algorithms: Matrix Factorization (SVD++); Restricted Boltzmann Machines (RBM)
      – Ensemble: linear blend
      – R (Users × Videos, 99% sparse) ≈ U (Users × d) × V (d × Videos)
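The factorization R ≈ U × V on this slide can be illustrated with a minimal SGD-based matrix factorization. This is a toy sketch only: the `factorize` helper, learning rate, and tiny ratings matrix are invented for illustration, and it omits the bias terms and implicit-feedback signal that SVD++ adds.

```python
import numpy as np

def factorize(R, d=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit R ~ U @ V.T on the observed (nonzero) entries with plain SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, d))
    V = 0.1 * rng.standard_normal((n_items, d))
    observed = [(u, i) for u in range(n_users)
                for i in range(n_items) if R[u, i] > 0]
    for _ in range(steps):
        for u, i in observed:
            u_old = U[u].copy()
            err = R[u, i] - u_old @ V[i]
            U[u] += lr * (err * V[i] - reg * u_old)  # gradient step on user factors
            V[i] += lr * (err * u_old - reg * V[i])  # gradient step on item factors
    return U, V

# Toy 4x4 ratings matrix, 0 = missing (the real one is ~99% sparse).
R = np.array([[5.0, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]])
U, V = factorize(R)
```

After fitting, `U @ V.T` fills in the missing entries with predicted ratings.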
  • 18. Ranking by ratings: 4.7, 4.6, 4.5, 4.5, 4.5, … Niche titles get high average ratings… by those who would watch them
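One common remedy for this self-selection bias, shown here purely as an illustration (the slide does not say Netflix uses it), is to shrink each title's average rating toward a global prior, so a niche title with few ratings cannot outrank a broadly loved one on average rating alone:

```python
def damped_mean(avg_rating, n_ratings, prior_mean=3.5, prior_weight=25):
    """Average rating shrunk toward a global prior; the prior acts like
    `prior_weight` pseudo-ratings, so titles with few ratings move most."""
    return (avg_rating * n_ratings + prior_weight * prior_mean) / (n_ratings + prior_weight)

niche = damped_mean(4.7, 200)        # few, self-selected raters: pulled down noticeably
popular = damped_mean(4.3, 50_000)   # many raters: barely moves
```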
  • 19. 19 RMSE
  • 20. Learning to Rank
      – Approaches:
          – Point-wise: loss over items (classification, ordinal regression, MF, …)
          – Pair-wise: loss over preferences (RankSVM, RankNet, BPR, …)
          – List-wise: (smoothed) loss over ranking (LambdaMART, DirectRank, GAPfm, …)
      – Ranking quality measures: NDCG, MRR, ERR, MAP, FCP, Precision@N, Recall@N, …
      – (plot: importance vs. rank for NDCG, MRR, and FCP)
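Of the ranking quality measures listed, NDCG is a representative example. A minimal implementation follows; the `ndcg` helper and the graded-relevance gain 2**rel − 1 are one common convention, not necessarily the variant used in the talk:

```python
from math import log2

def ndcg(relevances, k=None):
    """Normalized discounted cumulative gain of a ranked list of graded
    relevances: gain 2**rel - 1, discount log2(position + 1)."""
    dcg = sum((2**r - 1) / log2(i + 2) for i, r in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum((2**r - 1) / log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

perfect = ndcg([3, 2, 1, 0])   # ideal ordering → 1.0
reversed_ = ndcg([0, 1, 2, 3]) # worst ordering scores well below 1
```

The log discount is what makes top positions matter most, matching the importance-vs.-rank curves on the slide.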
  • 21. Example: two features, linear model
      – Linear model: f_rank(u, v) = w1·p(v) + w2·r(u, v), where p(v) is popularity and r(u, v) is the predicted rating
      – (plot: predicted rating (1–5) vs. popularity, with the final ranking induced by the scores)
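A toy version of this two-feature linear ranker; the weights and the pre-normalized feature values below are made up for illustration:

```python
def rank_videos(videos, w_pop=0.4, w_rating=0.6):
    """Score each video as w1*popularity + w2*predicted rating and sort
    descending; both features are assumed pre-normalized to [0, 1]."""
    scored = [(w_pop * v["popularity"] + w_rating * v["pred_rating"], v["id"])
              for v in videos]
    return [vid for _, vid in sorted(scored, reverse=True)]

videos = [
    {"id": "a", "popularity": 0.9, "pred_rating": 0.30},
    {"id": "b", "popularity": 0.2, "pred_rating": 0.95},
    {"id": "c", "popularity": 0.6, "pred_rating": 0.60},
]
order = rank_videos(videos)  # "b" wins on predicted rating despite low popularity
```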
  • 22. 22 Ranking
  • 23. Putting it together: “Learning to Row”
  • 24. Page-level algorithmic challenge
      – 10,000s of possible rows
      – 10–40 rows per page
      – Variable number of possible videos per row (up to thousands)
      – 1 personalized page per device
  • 25. Balancing a Personalized Page
      – Accurate vs. Diverse
      – Discovery vs. Continuation
      – Depth vs. Coverage
      – Freshness vs. Stability
      – Recommendations vs. Tasks
  • 26. 2D Navigational Modeling: which positions on the page grid members are more likely to see, and which less likely
  • 27. Row Lifecycle: Select candidates → Select evidence → Rank → Filter → Format → Choose
  • 28. Building a page algorithmically
      – Approaches:
          – Template: non-personalized layout
          – Row-independent: greedily rank rows by f(r | u, c)
          – Stage-wise: pick the next row by f(r | u, c, p_1:n)
          – Page-wise: total page fitness f(p | u, c)
      – Obey constraints per device
      – Certain rows may be required; examples: Continue Watching and My List
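The stage-wise approach f(r | u, c, p_1:n) can be sketched as a greedy loop that scores each remaining row given the partial page. Everything concrete below (the `build_page` helper, the genre-diversity cap, the way required rows are handled) is an illustrative assumption, not Netflix's actual algorithm:

```python
def build_page(candidate_rows, score, n_rows=10,
               required=("continue_watching", "my_list"), max_per_genre=2):
    """Greedy stage-wise page construction (illustrative sketch).

    Required rows are placed first; remaining slots are filled by repeatedly
    taking the highest-scoring eligible row, where score(row, page) may depend
    on the partial page and a genre cap enforces some diversity."""
    page = [r for r in candidate_rows if r["id"] in required]
    rest = [r for r in candidate_rows if r["id"] not in required]
    while rest and len(page) < n_rows:
        genre_counts = {}
        for r in page:
            genre_counts[r["genre"]] = genre_counts.get(r["genre"], 0) + 1
        eligible = [r for r in rest if genre_counts.get(r["genre"], 0) < max_per_genre]
        if not eligible:
            break
        best = max(eligible, key=lambda r: score(r, page))
        page.append(best)
        rest.remove(best)
    return [r["id"] for r in page]

rows = [
    {"id": "continue_watching", "genre": "system", "score": 0.0},
    {"id": "my_list", "genre": "system", "score": 0.0},
    {"id": "top_comedies", "genre": "comedy", "score": 0.9},
    {"id": "witty_comedies", "genre": "comedy", "score": 0.8},
    {"id": "standup", "genre": "comedy", "score": 0.7},
    {"id": "acclaimed_dramas", "genre": "drama", "score": 0.5},
]
# This score ignores the partial page; a richer one could penalize similarity to it.
page = build_page(rows, score=lambda r, partial: r["score"], n_rows=5)
```

Note how the third comedy row is skipped once the genre cap is hit, so the lower-scoring drama row makes the page instead.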
  • 29. Row Features
      – Quality of items
      – Features of items
      – Quality of evidence
      – User-row interactions
      – Item/row metadata
      – Recency
      – Item-row affinity
      – Row length
      – Position on page
      – Context
      – Title
      – Diversity
      – Freshness
      – …
  • 30. Page-level Metrics
      – How do you measure the quality of the homepage? Ease of discovery, diversity, novelty, …
      – Challenges: position effects; row-video generalization
      – 2D versions of ranking quality metrics; example: Recall @ row-by-column
      – (plot: recall by row)
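Recall @ row-by-column can be read as: of the videos a member actually played, what fraction appear in the top-left rows × columns region of the page. A minimal sketch (the function name and page representation are assumptions):

```python
def recall_at_row_col(page, plays, max_row, max_col):
    """2D analogue of recall@n: fraction of a member's plays that appear
    in the top-left max_row x max_col region of the page.
    `page` is a list of rows, each a list of video ids."""
    visible = {v for row in page[:max_row] for v in row[:max_col]}
    plays = set(plays)
    return len(plays & visible) / len(plays) if plays else 0.0

page_grid = [["a", "b", "c"],
             ["d", "e", "f"],
             ["g", "h", "i"]]
r = recall_at_row_col(page_grid, plays=["a", "e", "i"], max_row=2, max_col=2)  # → 2/3
```

Sweeping `max_row` traces out the recall-by-row curve the slide plots.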
  • 31. Conclusions
  • 32. Evolution of Recommendation Approach: Rating (4.7) → Ranking → Page Generation
  • 33. Research Directions
      – Context awareness
      – Full-page optimization
      – Presentation effects
      – Social recommendation
      – Personalized learning to rank
      – Cold start
  • 34. Thank You. Justin Basilico, jbasilico@netflix.com, @JustinBasilico. We’re hiring.