
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global Deep Learned Recommender System Model


At Netflix, our main goal is to maximize our members' enjoyment of the selected show while minimizing the amount of time it takes for them to find it. We try to achieve this goal by personalizing almost all aspects of our product -- from what shows to recommend, to how to present these shows and construct the home page, to what images to select per show, among many other things. Everything is a recommendation for us, and as an applied machine learning group, we spend our time building personalization models that will eventually increase the joy and satisfaction of our members. In this talk we will primarily focus our attention on a) making a global deep learned recommender model that is aware of regional tastes and popularity, and b) adapting this model to changing taste preferences as well as dynamic catalog availability.

We will first go through some standard recommender system models that use Matrix Factorization and Topic Models, and then compare and contrast them with more powerful, higher-capacity deep learning based models, such as sequence models that use recurrent neural networks. We will show what it entails to build a global model that is aware of regional taste preferences and catalog availability, and how models built on the simple Maximum Likelihood principle fail to do that. We will then describe one solution that we have employed to enable the global deep learned models to focus their attention on capturing regional taste preferences and a changing catalog.

In the latter half of the talk, we will discuss how we do incremental learning of deep learned recommender system models. Why do we need to do that? Everything changes with time: users' tastes change, and what's available on Netflix and what's popular also change. Therefore, updating or improving recommendation systems over time is necessary to bring more joy to users. In addition to how we apply incremental learning, we will discuss some of the challenges we face involving large-scale data preparation, infrastructure setup for incremental model training, and pipeline scheduling. Incremental training enables us to serve fresher models trained on fresher and larger amounts of data. This helps our recommender system adapt quickly to catalog and taste changes, and improves overall performance.


  1. Building an Incrementally Trained Global Deep Learning Recommender System Model. Anoop Deoras, Ko-Jen (Mark) Hsiao, adeoras@netflix.com, @adeoras. MLConf, San Francisco, 11/08/2019
  2. ~150M Members, 190 Countries
  3. Personalization ● Recommendation systems are a means to an end. ● Our primary goal: ○ Maximize Netflix members' enjoyment of the selected show ■ Enjoyment integrated over time ○ Minimize the time it takes to find it ■ Interaction cost integrated over time
  4. Everything is a recommendation!
  5. Ordering of the titles in each row is personalized
  6. Selection and placement of the row types is personalized
  7. Personalized Images: Profile 1 vs. Profile 2
  8. Personalized Messages
  9. IMPRACTICAL TO SHOW EVERYTHING
  10. We personalize our recommendations! This talk answers: HOW?
  11. Basic Intuition behind Collaborative Filtering ● Imagine you walked into a room full of movie enthusiasts, from all over the world, from all walks of life, and your goal was to come out with a great movie recommendation. ● Would you obtain a popular vote? Would that satisfy you?
  12. Basic Intuition behind Soft Clustering Models ● Now consider forming groups of people with similar taste, based on the videos that they previously enjoyed.
  13. Basic Intuition behind Soft Clustering Models ● Describe yourself using what you have watched. ● Try to associate yourself with these groups and obtain a weighted "personalized popularity vote", as sketched below.
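A minimal sketch of that weighted vote in NumPy (our illustration; the array names are ours), using topic proportions like the ones shown on the next slide: the personalized score for each video is the user's topic-proportion-weighted mixture of the per-topic popularity votes.

```python
import numpy as np

# Toy dimensions: 4 taste groups (topics), 6 videos.
# theta[k]  : how strongly this user associates with taste group k.
# phi[k, v] : popularity of video v within taste group k (each row sums to 1).
theta = np.array([0.01, 0.63, 0.22, 0.14])     # user's topic proportions
phi = np.random.dirichlet(np.ones(6), size=4)  # per-topic video distributions

# Weighted "personalized popularity vote": mix each group's vote by how
# much the user belongs to that group.
scores = theta @ phi          # shape (6,): one score per video
print(scores, scores.argmax())
```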
  14. Distribution over the topics and over the videos [figure: an example user's topic proportions, e.g. 0.01, 0.63, 0.22, 0.15]
  15. Topic Models (Latent Dirichlet Allocation) [plate diagram: U users, P plays per user, K total topics; α → θ (per-user taste) → t → v ← φ (per-topic video distribution) ← β] ● Convex combinations of topic proportions and of movie proportions within each topic
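Read as a generative story, the plate diagram corresponds to standard LDA with users as documents and plays as words; a sketch of that story, assuming the usual semantics for these symbols:

```latex
\begin{align*}
\theta_u &\sim \mathrm{Dirichlet}(\alpha) && \text{taste (topic proportions) of user } u \\
\phi_k &\sim \mathrm{Dirichlet}(\beta) && \text{video distribution of topic } k,\; k = 1,\dots,K \\
t_{u,p} &\sim \mathrm{Categorical}(\theta_u) && \text{topic for play } p \text{ of user } u \\
v_{u,p} &\sim \mathrm{Categorical}(\phi_{t_{u,p}}) && \text{video played}
\end{align*}
```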
  16. OUR ALGORITHMS ARE GLOBAL AND THEY HELP LOCAL STORIES BE HEARD GLOBALLY
  17. GLOBAL ALGORITHMS foster GLOBAL COMMUNITIES. Thanks to Sudeep Das for contributing this beautiful slide.
  18. Country Context in LDA models ● Users in country A play both Friends and HIMYM; users in country B cannot play both, because the two countries have different catalogs. ● The model is forced to split HIMYM plays between topic j, with high mass on both Friends and HIMYM, and topic k, with high mass on HIMYM alone. ● Outcome: parameters are being consumed to explain catalog differences. Thanks to Ehtsham Elahi for contributing this slide.
  19. Catalogue Censoring in Topic Models [plate diagram: the LDA model above augmented with a censoring pattern c and mask m over the played video v] Global Recommendation System for Overlapping Media Catalogue, Todd et al., US Patent App.
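One natural reading of the censoring pattern (our assumption, consistent with the logit masking shown later for the deep model): the categorical over videos is restricted to the titles available in the user's country and renormalized.

```latex
P(v \mid t, c) = \frac{c_v \, \phi_{t,v}}{\sum_{v'} c_{v'} \, \phi_{t,v'}},
\qquad c_v = \mathbb{1}[\, v \text{ is in the local catalogue} \,]
```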
  20. ALGORITHMS NEED TO CAPTURE THE TREND
  21. Time context in Topic Models [plate diagram: the LDA model augmented with an observed play time m and per-topic time distribution µ] Topics over Time: A Non-Markov Continuous-Time Model of Topic Trends, Wang et al., KDD 2006
  22. Fully contextualizing Topic Models [plate diagram: the LDA model with both the time context (m, µ) and the catalogue censoring pattern c]
  23. SIMPLE!
  24. IMPRACTICAL TO SCALE
  25. Gift of Deep Learning: Automatic Differentiation ● Manual differentiation: time consuming, poor scaling ● Symbolic differentiation: time efficient, excellent for scaling
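A tiny illustration of automatic differentiation, here in PyTorch (our choice of framework for the sketches in this deck; the talk does not prescribe one): the gradient of an arbitrary differentiable expression with respect to its parameters comes for free from the computation graph.

```python
import torch

# Two parameters we want gradients for, and a fixed input.
w = torch.tensor([1.0, -2.0], requires_grad=True)
x = torch.tensor([0.5, 3.0])

loss = torch.sigmoid(w @ x).pow(2)  # any differentiable expression
loss.backward()                     # reverse-mode automatic differentiation

print(w.grad)  # d(loss)/dw, with no manual or hand-symbolic derivation
```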
  26. Variational Autoencoders [diagram: user play vector u → encoder f_Ѱ → (𝞵, 𝞼) → latent taste z_u → decoder f_θ, a DNN with a soft-max over the entire vocabulary → u] Variational Autoencoders for Collaborative Filtering, Liang et al., WWW 2018
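A minimal sketch of such a model, loosely following the multinomial VAE of Liang et al. (2018); the layer sizes and names here are our own illustrative choices, not the production architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAECF(nn.Module):
    """Variational autoencoder over a user's play vector (bag of videos)."""
    def __init__(self, n_videos: int, latent_dim: int = 200):
        super().__init__()
        self.encoder = nn.Linear(n_videos, 600)
        self.mu = nn.Linear(600, latent_dim)      # 𝞵 head
        self.logvar = nn.Linear(600, latent_dim)  # log 𝞼² head
        self.decoder = nn.Linear(latent_dim, n_videos)

    def forward(self, u):
        h = torch.tanh(self.encoder(F.normalize(u)))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.decoder(z), mu, logvar  # logits: soft-max over vocabulary

def vae_loss(logits, u, mu, logvar, beta=0.2):
    # Multinomial log-likelihood of the played videos, plus KL to the prior.
    nll = -(F.log_softmax(logits, dim=-1) * u).sum(-1).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return nll + beta * kl
```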
  27. How to do country contextual modeling? [diagram: user representation → feed-forward ReLU layers → output logits; country catalogue alongside] ● Create a censored mask with the out-of-catalogue videos ● Mask the output layer (logits) ● Use the masked layer for the cross-entropy loss (see the sketch below)
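A sketch of that masking step (reusing the hypothetical VAECF above): unavailable titles get a logit of -inf, so the soft-max assigns them zero probability and the loss spends no effort modeling catalogue differences.

```python
import torch
import torch.nn.functional as F

def censored_loss(logits, target_video, catalog_mask):
    """catalog_mask: 1.0 where the video is in this user's country catalogue."""
    masked = logits.masked_fill(catalog_mask == 0, float("-inf"))
    return F.cross_entropy(masked, target_video)
```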
  28. Save Model Energy in Learning Catalogue Differences [diagram: user representation and country both fed into the feed-forward ReLU layers; output still masked by the country catalogue]
  29. Adding Time is Easy too [diagram: user representation, country, and time at serving all fed into the feed-forward layers]
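One way to wire in those context variables (a sketch under our own naming; the slides show only the block diagram): embed the country, featurize the serving time, and concatenate both with the user representation before the feed-forward ReLU layers.

```python
import torch
import torch.nn as nn

class ContextualHead(nn.Module):
    """Feed-forward scorer conditioned on country and time at serving."""
    def __init__(self, user_dim: int, n_countries: int, n_videos: int):
        super().__init__()
        self.country_emb = nn.Embedding(n_countries, 16)
        self.ff = nn.Sequential(
            nn.Linear(user_dim + 16 + 2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_videos),
        )

    def forward(self, user_repr, country_id, hour_of_day):
        # Cyclical time encoding so 23:00 and 00:00 end up close together.
        angle = 2 * torch.pi * hour_of_day.float() / 24.0
        time_feats = torch.stack([angle.sin(), angle.cos()], dim=-1)
        x = torch.cat([user_repr, self.country_emb(country_id), time_feats], dim=-1)
        return self.ff(x)  # logits, still to be censored by the country catalogue
```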
  30. Time to train is large! The catalogue changes quite fast. The DL model cannot estimate P(new title | U, C, T), as the new title is not in the model's vocabulary.
  31. Incrementally Train the Models [timeline: a fully trained model, then additional nodes and parameters added over time]
  32. RECIPE 1. CENSOR 2. ADD CONTEXT VARIABLES TO THE MODEL 3. DO, EVERY FEW DAYS: a. TRAIN A WARM-START MODEL WITH (1 & 2) 4. DO, EVERY FEW HOURS: a. TAKE THE MODEL FROM (3) b. ADD NEW EMBEDDINGS c. ADD NEW PARAMETERS d. FINE-TUNE (see the sketch below)
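A sketch of steps 4b and 4c under our assumptions: grow the model's output layer with freshly initialized rows for titles that entered the catalogue since the warm-start run, keep all warm-started weights, then fine-tune on the most recent data.

```python
import torch
import torch.nn as nn

def grow_output_layer(old: nn.Linear, n_new_videos: int) -> nn.Linear:
    """Warm-start a wider output layer: copy old rows, init rows for new titles."""
    new = nn.Linear(old.in_features, old.out_features + n_new_videos)
    with torch.no_grad():
        new.weight[: old.out_features] = old.weight  # keep learned embeddings
        new.bias[: old.out_features] = old.bias
        nn.init.normal_(new.weight[old.out_features :], std=0.01)  # new titles
        new.bias[old.out_features :] = 0.0
    return new

# Every few hours: expand the vocabulary, then fine-tune on fresh plays.
# model.decoder = grow_output_layer(model.decoder, n_new_videos=12)
```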
  33. THANK YOU! Questions? Anoop Deoras, Ko-Jen (Mark) Hsiao, adeoras@netflix.com, @adeoras. Sincere thanks to a lot of my Netflix colleagues, Aish Fenton, Dawen Liang and Ehtsham Elahi, for contributing to the ideas discussed here.
