Large-Scale Recommendation Systems Workshop
RecSys 2013, Hong Kong

Recommendation at Netflix Scale
Justin Basilico
Talk from the Large-Scale Recommendation Systems workshop at RecSys 2013 in Hong Kong.


Recommendation at Netflix Scale

  1. Large-Scale Recommendation Systems Workshop, RecSys 2013, Hong Kong. Recommendation at Netflix Scale. Justin Basilico, Netflix Algorithm Engineering. October 13, 2013.
  2. Outline
     ● Reintroduction to Netflix
     ● Approach to Recommendation
     ● Netflix Scale
     ● Architecture
  3. Reintroduction to Netflix
  4. [Image-only slide]
  5. Change of focus: 2006, 2013
  6. Approach to Recommendation
  7. Goal: Help members find content to watch and enjoy to maximize member satisfaction and retention
  8. Everything is a Recommendation
     ● Rows
     ● Ranking
     ● Over 75% of what people watch comes from our recommendations
  9. Top 10: Our best guess
     ● Personalization awareness
     ● Diversity
     [Labels on the example row: All, Dad, Dad & Mom, Daughter, All, All?, Daughter, Son, Mom, Mom]
  10. But…
  11. Genre Personalization
     ● Personalized genre rows focus on user interest
     ● Also provide context and “evidence”
     ● How are they generated?
         ○ Implicit: based on user’s recent plays, ratings, & other interactions
         ○ Explicit taste preferences
         ○ Hybrid: combine the above
     ● Also take into account:
         ○ Freshness: has this been shown before?
         ○ Diversity: avoid repeating tags and genres, limit number of TV genres, etc.
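The freshness and diversity considerations on this slide can be illustrated with a small greedy re-ranking sketch. Everything in it is an assumption made for illustration: the `GenreRow` fields, the `select_rows` function, the penalty weights, and the row names are hypothetical and are not the method described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenreRow:
    name: str
    relevance: float      # hypothetical personalization score, higher is better
    tags: frozenset       # genre/tag labels attached to the row
    times_shown: int      # how often this member has already been shown the row


def select_rows(candidates, k, freshness_penalty=0.1, diversity_penalty=0.5):
    """Greedily pick k rows, penalizing stale rows and repeated tags."""
    chosen = []
    used_tags = set()
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        def adjusted(row):
            overlap = len(row.tags & used_tags)
            return (row.relevance
                    - freshness_penalty * row.times_shown
                    - diversity_penalty * overlap)

        best = max(remaining, key=adjusted)
        chosen.append(best)
        used_tags |= best.tags
        remaining.remove(best)
    return chosen


# Made-up candidate rows for one member.
rows = [
    GenreRow("Dark Crime Dramas", 0.90, frozenset({"crime", "drama"}), 0),
    GenreRow("Gritty Crime Thrillers", 0.85, frozenset({"crime", "thriller"}), 0),
    GenreRow("Quirky Comedies", 0.70, frozenset({"comedy"}), 0),
    GenreRow("Feel-good Comedies", 0.80, frozenset({"comedy"}), 5),
]
picked = [row.name for row in select_rows(rows, k=2)]
print(picked)
```

With these made-up numbers, the second crime row loses to a comedy row because it repeats a tag, and the frequently shown comedy row is pushed down by the freshness penalty.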
  12. Similars
     ● Displayed in many contexts
         ○ Video display page
         ○ In response to user actions (search, queue add, …)
         ○ “Because you watched” rows
  13. Support for Recommendations: Social Support
  14. EVERYTHING is a Recommendation
  15. … EVERYTHING
  16. Netflix Scale
  17. Netflix Data
     ● > 37M members
     ● > 40 countries
     ● > 1000 device types
     ● Ratings: > 4M/day
     ● Searches: > 3M/day
     ● Plays: > 30M/day
         ○ 1B hours in June 2012
         ○ > 4B hours in Q1 2013
     ● Log 100B events/day
     ● 32.25% of peak US downstream traffic
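To give a feel for these volumes, the daily figures can be converted to average per-second rates. This is only back-of-the-envelope arithmetic on the numbers quoted on the slide: most of them are stated as lower bounds (">"), and real traffic peaks well above a daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily volumes as quoted on the slide (most are lower bounds).
daily_volumes = {
    "ratings": 4_000_000,
    "searches": 3_000_000,
    "plays": 30_000_000,
    "logged events": 100_000_000_000,
}


def average_per_second(per_day):
    """Average rate if the daily volume were spread evenly over the day."""
    return per_day / SECONDS_PER_DAY


rates = {name: average_per_second(count) for name, count in daily_volumes.items()}
for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} per second on average")
```

On these figures, logging alone averages more than a million events per second, which helps explain why the talk treats event handling as its own architectural layer.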
  18. Plays
     ● What people watch
     ● The most important source of data for our algorithms
     ● A few plays are usually more valuable than most of our other data
     ● We have a lot of information associated with each play:
         ○ Duration
         ○ Start/stop/pause/rewind
         ○ Device, location, time, …
         ○ Page context
         ○ …
  19. Ratings
     ● Explicit information about a member’s taste should be great
     ● But we find ratings are…
         ○ Noisy
         ○ Sparse
         ○ Biased
     ● Quality of our ratings has decreased over time
  20. Metadata
     ● Our tag space is made of thousands of different concepts
     ● Manually annotated by a set of experts
     ● Although an automatic approach may be possible, we believe it would be of lesser quality
         ○ However, we are researching automatic annotation of scenes, transitions…
     ● Metadata is useful
         ○ Especially for cold start
  21. Social
     ● Can your “friends’” interests help us predict yours better?
     ● The answer is similar to the Metadata case:
         ○ If we know enough about you, social information becomes less useful
         ○ But it is very interesting for cold-starting
     ● Social support for recommendations has been shown to matter
  22. Affordances
     ● Highly curated catalog
     ● Catalog changes daily
     ● Videos have long shelf-lives
     ● Videos take time to consume
  23. Smart Models
     ● Logistic/linear regression
     ● Elastic nets
     ● SVD and other Matrix Factorizations
     ● Restricted Boltzmann Machines
     ● Deep Networks
     ● Factorization Machines
     ● Markov Chains
     ● Different clustering approaches
     ● Latent Dirichlet Allocation
     ● Gradient Boosted Decision Trees/Random Forests
     ● …
  24. Offline/Online testing process
     [Diagram: Offline testing (days) → [success] → Online A/B testing (weeks to months) → [success] → Rollout feature to all users; a [fail] branch is also shown]
  25. System Architecture
  26. Design Considerations
     ● Recommendations
         • Personal
         • Accurate
         • Novel
         • Diverse
         • Fresh
     ● Systems
         • Scalable
         • Responsive
         • Resilient
         • Efficient
         • Flexible
  27. Technology Stack: http://techblog.netflix.com
  28. Cloud Computing at Netflix
     ● Layered services
     ● Clusters: Horizontal scaling
     ● Auto-scale with demand
     ● Plan for failure
         ○ Replication
         ○ Fail fast
         ○ State is bad
     ● Simian Army: Induce failures to ensure resiliency
  29. System Overview
     ● Blueprint for multiple personalization algorithm services
         ○ Ranking
         ○ Row selection
         ○ Ratings
         ○ …
     ● Recommendation involving multi-layered Machine Learning
     [Architecture diagram labels: OFFLINE, Netflix.Hermes, Query results, Offline Data, Model training, Offline Computation, Models, NEARLINE, Netflix.Manhattan, User Event Queue, Event Distribution, Nearline Computation, ONLINE, Algorithm Service, Online Data Service, Online Computation, UI Client, Member (Play, Rate, Browse...), Recommendations, Machine Learning Algorithm]
  30. Event & Data Distribution
     ● Collect actions
         ○ Plays, browsing, searches, ratings, etc.
     ● Events
         ○ Small units
         ○ Time sensitive
     ● Data
         ○ Dense information
         ○ Processed for further use
         ○ Saved
     [Architecture diagram from slide 29, highlighting User Event Queue, Event Distribution, Netflix.Manhattan, and the UI Client]
  31. Computation Layers
     ● Offline
         ○ Process data
     ● Nearline
         ○ Process events
     ● Online
         ○ Process requests
     [Architecture diagram from slide 29, highlighting the three computation layers]
  32. Online Computation
     ● Synchronous computation in response to a member request
     ● Pros:
         ○ Access to most fresh data
         ○ Knowledge of full request context
         ○ Compute only what is necessary
     ● Cons:
         ○ Strict Service Level Agreements
             • Must respond quickly … in all cases
             • Requires high availability
         ○ Limited view of data
     ● Good for:
         ○ Simple algorithms
         ○ Model application
         ○ Business logic
         ○ Context-dependence
         ○ Interactivity
     [Architecture diagram from slide 29, highlighting Online Computation; www.netflix.com]
  33. Offline Computation
     ● Asynchronous computation done on a regular schedule
     ● Pros:
         ○ Can handle large data
         ○ Can do bulk processing
         ○ Relaxed time constraints
     ● Cons:
         ○ Cannot react quickly
         ○ Results can become stale
     ● Good for:
         ○ Batch learning
         ○ Model training
         ○ Complex algorithms
         ○ Precomputing
     [Architecture diagram from slide 29, highlighting Offline Computation]
  34. Nearline Computation
     ● Asynchronous computation in response to a member event
     ● Pros:
         ○ Can keep data fresh
         ○ Can run moderate complexity algorithms
         ○ Can average computational cost across users
         ○ Change from actions
     ● Cons:
         ○ Has some delay
         ○ Done in event context
     ● Good for:
         ○ Incremental learning
         ○ User-oriented algorithms
         ○ Moderate complexity algorithms
         ○ Keeping precomputed results fresh
     [Architecture diagram from slide 29, highlighting Nearline Computation]
  35. Where to place components?
     ● Example: Matrix Factorization
     ● Offline:
         ○ Collect sample of play data
         ○ Run batch learning algorithm to produce factorization
         ○ Publish item factors
     ● Nearline:
         ○ Solve user factors
         ○ Compute user-item products
         ○ Combine
     ● Online:
         ○ Presentation-context filtering
         ○ Serve recommendations
     [Architecture diagram from slide 29, annotated with: $X$ and $X \approx UV^{t}$ at the offline layer, with $V$ as the published model; $A u_i = b$ and $s_{ij} = u_i v_j$ at the nearline layer; $s_{ij} > t$ at the online layer]
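The split on this slide can be sketched in plain Python. This is a toy illustration under stated assumptions, not the implementation from the talk: the item factors are made-up values standing in for what an offline batch job would publish, $A u_i = b$ is interpreted here as the regularized normal equations ($A = V^{t}V + \lambda I$, $b = V^{t}x_i$), the "Combine" step is omitted, and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def solve_user_factors(item_factors, plays, reg=0.1):
    """Nearline: solve A u = b with A = V^t V + reg * I and b = V^t x."""
    k = len(next(iter(item_factors.values())))
    a = [[reg if r == c else 0.0 for c in range(k)] for r in range(k)]
    b = [0.0] * k
    for item, v in item_factors.items():
        x = plays.get(item, 0.0)
        for r in range(k):
            b[r] += v[r] * x
            for c in range(k):
                a[r][c] += v[r] * v[c]
    return solve_linear_system(a, b)


def score_items(user_factors, item_factors):
    """Nearline: s_ij = u_i . v_j for every item j."""
    return {item: sum(u * w for u, w in zip(user_factors, v))
            for item, v in item_factors.items()}


def serve(scores, already_played, threshold):
    """Online: keep unplayed items with s_ij > t, best score first."""
    keep = [(s, item) for item, s in scores.items()
            if s > threshold and item not in already_played]
    return [item for s, item in sorted(keep, reverse=True)]


# Item factors V as if published by the offline batch job (made-up values).
V = {
    "drama_a": [1.0, 0.0],
    "drama_b": [0.9, 0.1],
    "comedy_a": [0.0, 1.0],
    "comedy_b": [0.1, 0.9],
}
plays = {"drama_a": 1.0}  # this member has played one drama
u = solve_user_factors(V, plays)
scores = score_items(u, V)
recommendations = serve(scores, already_played=set(plays), threshold=0.2)
print(recommendations)
```

The design point the sketch tries to show is that only the cheap, per-member work (a small linear solve and some dot products) happens in response to events, while the online step is reduced to filtering already-computed scores.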
  36. Netflix Manhattan (Stan Lanning)
     ● Event-based precomputation framework
     ● Supports both nearline and offline computation modes
     ● Customer-centric events and data
     [Diagram labels: Play Service, Rating Service, …, Event Queue, Event Handler (three), Request Queue, Event, Rules, Manager (three), Algorithm (three), Cached User Data]
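The general idea of an event-based precomputation loop can be sketched as follows. This is not the Manhattan API, which the slide does not show; the `EventDispatcher` class, the handler functions, and the event payload fields are all hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque


class EventDispatcher:
    """Toy event-based precomputation loop (hypothetical, not Manhattan)."""

    def __init__(self):
        self.queue = deque()               # stands in for the event queue
        self.handlers = defaultdict(list)  # event type -> handler functions
        self.cached_user_data = defaultdict(
            lambda: {"plays": [], "ratings": {}})

    def register(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, user_id, payload):
        self.queue.append((event_type, user_id, payload))

    def drain(self):
        """Process queued events in arrival order; return the number handled."""
        handled = 0
        while self.queue:
            event_type, user_id, payload = self.queue.popleft()
            for handler in self.handlers[event_type]:
                handler(self.cached_user_data[user_id], payload)
            handled += 1
        return handled


def on_play(user_data, payload):
    """Record a play in the member's cached data."""
    user_data["plays"].append(payload["video_id"])


def on_rating(user_data, payload):
    """Record or overwrite a rating in the member's cached data."""
    user_data["ratings"][payload["video_id"]] = payload["stars"]


dispatcher = EventDispatcher()
dispatcher.register("play", on_play)
dispatcher.register("rating", on_rating)
dispatcher.publish("play", "member-1", {"video_id": "v42"})
dispatcher.publish("rating", "member-1", {"video_id": "v42", "stars": 4})
handled = dispatcher.drain()
print(handled, dict(dispatcher.cached_user_data["member-1"]))
```

A handler in a real system would typically trigger a recomputation for the affected member rather than only updating a cache; the sketch keeps the state update so the flow from event to per-member data is visible.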
  37. Signals & Models
     ● Similar pattern across layers
     ● Models
         ○ Parameter files
         ○ Trained offline
     ● Data
         ○ Previously processed and stored information
     ● Signals
         ○ Fresh data from live services
         ○ User-related or context-related
     [Architecture diagram from slide 29, highlighting Models, Offline Data, and Signals (Online Service) feeding the Machine Learning Algorithm at each layer]
  38. Recommendation Results
     ● Precomputed results
         ○ Fetch from data store
         ○ Post-process in context
     ● Generated on the fly
         ○ Collect signals, apply model
     ● Combination
         ○ Dynamically choose
         ○ Fallbacks
     [Architecture diagram from slide 29, highlighting Algorithm Service, Online Computation, and the UI Client]
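One way to read the "Combination" bullet is as a chain that prefers precomputed results, computes on the fly when none exist, and falls back to a safe default when that fails. The sketch below assumes exactly that ordering; the `recommend` function, its parameters, and the sample data are hypothetical and are not taken from the talk.

```python
def recommend(member_id, precomputed_store, compute_online, popular_fallback,
              exclude=frozenset(), limit=3):
    """Try precomputed results, then on-the-fly computation, then a fallback."""
    results = precomputed_store.get(member_id)
    source = "precomputed"
    if not results:
        try:
            results = compute_online(member_id)
            source = "online"
        except Exception:
            results = None  # treat any online failure as "no result"
    if not results:
        results = popular_fallback
        source = "fallback"
    # Post-process in context: drop videos that should not be shown here.
    filtered = [video for video in results if video not in exclude]
    return filtered[:limit], source


store = {"member-1": ["v1", "v2", "v3", "v4"]}  # made-up precomputed results
popular = ["v9", "v8", "v7"]                    # made-up fallback list


def flaky_online_model(member_id):
    """Stand-in for an online computation that misses its time budget."""
    raise TimeoutError("online computation exceeded its time budget")


hit = recommend("member-1", store, flaky_online_model, popular, exclude={"v2"})
miss = recommend("member-2", store, flaky_online_model, popular)
print(hit)
print(miss)
```

Returning the source alongside the results is a convenience added in this sketch so that callers or logs can tell which path produced a response.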
  39. Conclusions
  40. Research Directions
     ● Personalized learning to rank
     ● Context awareness
     ● Presentation effects
     ● Social recommendation
     ● Full-page optimization
     ● Cold start
  41. Take Aways
     ● Behind-the-scenes peek at a real-world, industrial-scale recommender system
     ● Recommendation is not just ratings
     ● Scaling is not only about batch, offline algorithms
     ● Use application domain advantages
  42. We’re hiring. Thank You. Justin Basilico, @JustinBasilico
