At Netflix, we try to provide the best personalized video recommendations to our members. To do this, we need to adapt our recommendations to each contextual situation, which depends on information such as time or device. In this talk, I will describe how state-of-the-art contextual recommendations are used at Netflix. A first example of contextual adaptation is the model that powers the Continue Watching row. It uses a feature-based approach with a carefully constructed training set to learn how to adapt to the context of the member. Next, I will dive into more modern approaches such as tensor factorization and LSTMs and share some results from deployments of these methods. I will highlight lessons learned and some common pitfalls of using these powerful methods in industrial-scale systems. Finally, I will touch upon system reliability, choice of optimization metrics, hidden costs, and the risks and benefits of using highly adaptive systems.
16. Title Ranking Model
● P(titleX=continue_watch | current_time, current_device, some_play_happens)
[Diagram: a member's play timeline from the past to today — plays at t1 (iOS), t2 (web), t3 (web), t4 (iOS), with today's plays still unknown]
17. Title Ranking Model
● P(titleX=continue_watch | current_time, current_device, some_play_happens)
● Construction of the data set and feature extraction are key
● The model matters, but it is a secondary concern
[Diagram: the same play timeline (t1 iOS, t2 web, t3 web, t4 iOS), with each past play labeled as either Continue or Discovery]
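The conditional probability on this slide can be estimated in many ways. As a minimal, purely illustrative sketch (the talk describes a learned, feature-based ranking model, not counting; the field names here are hypothetical):

```python
from collections import Counter

def fit_continue_probability(events):
    """Estimate P(continue_watch | time_of_day, device) from labeled
    play events by simple counting over context buckets."""
    totals = Counter()
    continues = Counter()
    for e in events:
        key = (e["time_of_day"], e["device"])
        totals[key] += 1
        if e["label"] == "continuation":
            continues[key] += 1
    return {k: continues[k] / totals[k] for k in totals}

# Toy labeled play events
events = [
    {"time_of_day": "evening", "device": "iOS", "label": "continuation"},
    {"time_of_day": "evening", "device": "iOS", "label": "continuation"},
    {"time_of_day": "evening", "device": "iOS", "label": "discovery"},
    {"time_of_day": "morning", "device": "web", "label": "discovery"},
]
probs = fit_continue_probability(events)
# probs[("evening", "iOS")] -> 2/3; probs[("morning", "web")] -> 0.0
```

In practice a learned model generalizes across sparse context combinations where raw counts would be unreliable.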
19. Feature Extraction
[Diagram: each play event annotated with its extracted context — morning/web, morning/web, evening/iOS, evening/iOS, evening/iOS]
● Today at time t3, on web: a continuation title
● Today at time t4, on iOS: a discovery title
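A sketch of this kind of feature extraction, assuming a simple morning/evening bucketing and a continuation-vs-discovery label (the exact feature names and buckets are illustrative; the talk only specifies that time and device are used):

```python
from datetime import datetime

def extract_context_features(play_time, device, previously_started):
    """Turn one play event into (features, label) for the ranking model."""
    time_of_day = "morning" if play_time.hour < 12 else "evening"
    features = {"time_of_day": time_of_day, "device": device}
    # Label: a play of a title the member had already started is a
    # "continuation"; a play of a new title is a "discovery".
    label = "continuation" if previously_started else "discovery"
    return features, label

# Example: an evening iOS play of a partially watched title
feats, label = extract_context_features(
    datetime(2018, 5, 1, 20, 30), "iOS", previously_started=True)
# feats -> {"time_of_day": "evening", "device": "iOS"}, label -> "continuation"
```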
23. Representation Learning
● Representation (deep) learning promises to do feature engineering for you
● Time is a complex contextual dimension that needs special attention
● Time exhibits many periodicities
○ Daily
○ Weekly
○ Seasonally
○ … and even longer: Olympics, elections, etc.
● Generalize to future behaviors through temporal extrapolation
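One common way to expose these periodicities to a model is a cyclical sin/cos encoding, so that, e.g., hour 23 and hour 0 end up close together. This is a standard trick, sketched here; the talk does not prescribe a specific encoding:

```python
import math

def cyclical_time_features(hour_of_day, day_of_week):
    """Encode periodic time dimensions as points on the unit circle."""
    return {
        "hour_sin": math.sin(2 * math.pi * hour_of_day / 24),
        "hour_cos": math.cos(2 * math.pi * hour_of_day / 24),
        "dow_sin": math.sin(2 * math.pi * day_of_week / 7),
        "dow_cos": math.cos(2 * math.pi * day_of_week / 7),
    }

# 11 PM (Saturday) and midnight (Sunday) are neighbors in this
# representation, whereas a raw hour feature would put them 23 apart.
late = cyclical_time_features(23, 6)
early = cyclical_time_features(0, 0)
```

Longer periodicities (yearly, multi-year events) can be handled the same way with their own period, though events like the Olympics may need explicit calendar features instead.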
24. Sequence Prediction
● Treat recommendations as a sequence classification problem
○ Input: sequence of user actions
○ Output: next action
● E.g. GRU4Rec [Hidasi et al., 2016]
○ Input: sequence of items in a session
○ Output: next item in the session
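The framing above amounts to expanding each session into (prefix, next item) training pairs; a minimal sketch of that expansion (the model that consumes these pairs, e.g. a GRU, is omitted):

```python
def session_to_examples(session):
    """Expand one ordered session of item ids into (prefix, next_item)
    training pairs: the network reads the prefix and is trained to
    classify the next item."""
    return [(session[:i], session[i]) for i in range(1, len(session))]

pairs = session_to_examples(["A", "B", "C"])
# [(['A'], 'B'), (['A', 'B'], 'C')]
```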
25. Contextual Sequence Prediction
● Input: sequence of contextual user actions, plus the current context
● Output: probability of the next action
● E.g. “Given all the actions a user has taken so far, what’s the most likely video they’re going to play right now?”
● E.g. [Smirnova & Vasile, 2017], [Beutel et al., 2018]
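A sketch of how such a training example might be assembled — each past action paired with the context it occurred in, plus the current context as a separate input (field names here are illustrative, not from the talk):

```python
def contextual_example(actions, contexts, current_context):
    """Build one training example for contextual sequence prediction:
    the history of (action, context) pairs, the current context, and
    the next action as the prediction target."""
    history = list(zip(actions[:-1], contexts[:-1]))
    return {
        "history": history,
        "current_context": current_context,
        "next_action": actions[-1],  # the target the model must predict
    }

example = contextual_example(
    ["A", "B", "C"],
    [{"device": "web"}, {"device": "web"}, {"device": "iOS"}],
    {"device": "iOS", "time": "evening"},
)
# example["next_action"] -> "C"
```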
32. The Price of Contextual Models
● Increased computational cost
○ Models cannot be precomputed
● Modeling
○ Harder to build intuition
○ Higher time and memory complexity
○ Testing methodology is complicated
● Models get stale easily
● Deep models can overfit the offline metric
34. Experimental Design
● Be careful when splitting the dataset
○ Don’t overfit the past
○ Predict the future
● May need to train/test at multiple distinct time points to see generalization across time (e.g. [Lathia et al., 2009])
● Not all offline metrics make sense for contextual recommendations
[Diagram: timeline split into a training period followed by a later test period]
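A temporal split of this kind can be sketched in a few lines — train on everything before a cutoff, test on everything after, rather than splitting at random (the event schema here is hypothetical):

```python
def temporal_split(events, split_time):
    """Split interaction events by timestamp: the model is trained on
    the past and evaluated on the future, not on interpolating
    randomly held-out past interactions."""
    train = [e for e in events if e["t"] < split_time]
    test = [e for e in events if e["t"] >= split_time]
    return train, test

events = [{"t": 1, "item": "A"}, {"t": 5, "item": "B"}, {"t": 9, "item": "C"}]
train, test = temporal_split(events, split_time=5)
# train keeps t=1; test keeps t=5 and t=9
```

Repeating this at several cutoff times, as the Lathia et al. reference suggests, shows whether accuracy holds up across different periods rather than at one lucky split point.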
35. Takeaways from Deep Learning
● Think beyond solving existing problems with new tools; instead, ask what new problems the new tools can solve
● Deep learning can work well for recommendations...
○ When you go beyond the classic problem definition
○ And use more complex data, such as contextual factors
● Lots of open areas remain for improving recommendations using deep learning
36. Final Note
● Contextual signals can be as strong as personal preferences
○ Model them as such
○ Evaluate them as such
○ Make them central to your system and infrastructure
38. Credits
Justin Basilico
Yves Raimond
Sudeep Das
Hossein Taghavi
and the whole Algorithm Engineering team
Read more in-depth discussion of the topic:
● Other relevant presentations
● Blog post on continue watching model