Large-Scale Recommendation Systems Workshop
RecSys 2013, Hong Kong

Recommendation at Netflix Scale
Justin Basilico
Netflix Algorithm Engineering

October 13, 2013
1
Outline

Reintroduction to Netflix
Approach to Recommendation
Netflix Scale
Architecture

2
Reintroduction to Netflix
3
4
Change of focus

2006

2013

5
Approach to Recommendation
6
Goal

Help members find content to watch and enjoy
to maximize member satisfaction and retention

7
Everything is a Recommendation

Rows

Ranking

Over 75% of what people watch comes from our recommendations

8
Top 10: Our best guess
Personalization awareness

[Figure: Top 10 row with each title labeled by the household members it targets: All, Dad, Dad & Mom, Daughter, Son, Mom]
Diversity
9
But…

10
Genre Personalization
● Personalized genre rows focus on user interest
● Also provide context and “evidence”
● How are they generated?
  ○ Implicit: based on user’s recent plays, ratings, & other interactions
  ○ Explicit taste preferences
  ○ Hybrid: combine the above
● Also take into account:
  ○ Freshness: has this been shown before?
  ○ Diversity: avoid repeating tags and genres, limit number of TV genres, etc.
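
The freshness and diversity considerations above can be illustrated with a small greedy row selector. This is only a sketch under stated assumptions: the `GenreRow` fields, the penalty weights, and the scoring formula are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenreRow:
    name: str
    tags: frozenset        # genre/tag labels attached to the row (assumed)
    relevance: float       # hypothetical score from implicit/explicit signals
    times_shown: int = 0   # how often the member has already seen this row


def select_rows(candidates, k, freshness_penalty=0.1, overlap_penalty=0.5):
    """Greedily pick k rows, penalizing stale rows and repeated tags."""
    chosen = []
    used_tags = set()
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        def score(row):
            overlap = len(row.tags & used_tags)   # tags already on the page
            return (row.relevance
                    - freshness_penalty * row.times_shown
                    - overlap_penalty * overlap)
        best = max(remaining, key=score)
        chosen.append(best)
        used_tags |= best.tags
        remaining.remove(best)
    return chosen
```

With these assumed weights, a slightly less relevant row from a new genre can outrank a second row that repeats a tag already on the page.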

11
Similars
● Displayed in many contexts
  ○ Video display page
  ○ In response to user actions (search, queue add, …)
  ○ “Because you watched” rows

12
Support for Recommendations

Social Support

13
EVERYTHING is a Recommendation

14
… EVERYTHING

15
Netflix Scale
16
Netflix Data
● > 37M members
● > 40 countries
● > 1000 device types
● Ratings: > 4M/day
● Searches: > 3M/day
● Plays: > 30M/day
  ○ 1B hours in June 2012
  ○ > 4B hours in Q1 2013
● Log 100B events/day
● 32.25% of peak US downstream traffic
17
Plays
● What people watch
● The most important source of data for our algorithms
● A few plays are usually more valuable than most of our other data
● We have a lot of information associated with a play:
  ○ Duration
  ○ Start/stop/pause/rewind
  ○ Device, location, time, …
  ○ Page context
  ○ …

18
Ratings
● Explicit information about a member’s taste should be great
● But we find ratings are…
  ○ Noisy
  ○ Sparse
  ○ Biased
● Quality of our ratings has decreased over time
19
Metadata
● Our tag space is made of thousands of different concepts
● Manually annotated by a set of experts
● Although an automatic approach may be possible, we believe it would be of lesser quality
  ○ However, we are researching automatic annotation of scenes, transitions…
● Metadata is useful
  ○ Especially for cold start
20
Social
● Can your “friends’” interests help us predict yours better?
● The answer is similar to the Metadata case:
  ○ If we know enough about you, social information becomes less useful
  ○ But it is very interesting for cold start
● Social support for recommendations has been shown to matter

21
Affordances
● Highly curated catalog
● Catalog changes daily
● Videos have long shelf-lives
● Videos take time to consume

22
Smart Models
● Logistic/linear regression
● Elastic nets
● SVD and other Matrix Factorizations
● Restricted Boltzmann Machines
● Deep Networks
● Factorization Machines
● Markov Chains
● Different clustering approaches
● Latent Dirichlet Allocation
● Gradient Boosted Decision Trees/Random Forests
● …
23
Offline/Online testing process
[Diagram: Offline testing → [success] → Online A/B testing → [success] → Rollout feature to all users, plus a [fail] outcome. Time-scale labels on the diagram: “days” and “Weeks to months”.]

24
System Architecture
25
Design Considerations

Recommendations
• Personal
• Accurate
• Novel
• Diverse
• Fresh

Systems
• Scalable
• Responsive
• Resilient
• Efficient
• Flexible

26
Technology Stack

http://techblog.netflix.com

27
Cloud Computing at Netflix
● Layered services
● Clusters: Horizontal scaling
● Auto-scale with demand
● Plan for failure
  ○ Replication
  ○ Fail fast
  ○ State is bad
● Simian Army: Induce failures to ensure resiliency
28
System Overview
● Blueprint for multiple personalization algorithm services
  ○ Ranking
  ○ Row selection
  ○ Ratings
  ○ …
● Recommendation involving multi-layered Machine Learning

[Architecture diagram with three layers (OFFLINE, NEARLINE, ONLINE). Components shown: Netflix.Hermes, Query results, Offline Data, Model training, Models, Offline Computation, Nearline Computation, Netflix.Manhattan, User Event Queue, Event Distribution, Algorithm Service, Online Data Service, Online Computation, UI Client, Member. Each computation box contains a Machine Learning Algorithm. The Member sends “Play, Rate, Browse...” and receives Recommendations.]
29
Event & Data Distribution
● Collect actions
  ○ Plays, browsing, searches, ratings, etc.
● Events
  ○ Small units
  ○ Time sensitive
● Data
  ○ Dense information
  ○ Processed for further use
  ○ Saved

30
Computation Layers
● Offline
  ○ Process data
● Nearline
  ○ Process events
● Online
  ○ Process requests

31
Online Computation
● Synchronous computation in response to a member request
● Pros:
  ○ Access to most fresh data
  ○ Knowledge of full request context
  ○ Compute only what is necessary
● Cons:
  ○ Strict Service Level Agreements: must respond quickly … in all cases; requires high availability
  ○ Limited view of data
● Good for:
  ○ Simple algorithms
  ○ Model application
  ○ Business logic
  ○ Context-dependence
  ○ Interactivity

32
Offline Computation
● Asynchronous computation done on a regular schedule
● Pros:
  ○ Can handle large data
  ○ Can do bulk processing
  ○ Relaxed time constraints
● Cons:
  ○ Cannot react quickly
  ○ Results can become stale
● Good for:
  ○ Batch learning
  ○ Model training
  ○ Complex algorithms
  ○ Precomputing

33
Nearline Computation
● Asynchronous computation in response to a member event
● Pros:
  ○ Can keep data fresh
  ○ Can run moderate complexity algorithms
  ○ Can average computational cost across users
  ○ Change from actions
● Cons:
  ○ Has some delay
  ○ Done in event context
● Good for:
  ○ Incremental learning
  ○ User-oriented algorithms
  ○ Moderate complexity algorithms
  ○ Keeping precomputed results fresh
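
As a rough illustration of nearline computation (asynchronous work triggered by member events that keeps precomputed results fresh), here is a minimal in-process sketch. The class name `NearlineUpdater`, the event tuple layout, and the genre-count "algorithm" are all assumptions made for the example. They do not describe Netflix.Manhattan or any real Netflix API.

```python
from collections import defaultdict, deque


class NearlineUpdater:
    """Toy event-driven updater: consumes member events from a queue and
    refreshes a precomputed per-member result. Hypothetical design."""

    def __init__(self):
        self.queue = deque()                       # stands in for a user event queue
        self.genre_counts = defaultdict(lambda: defaultdict(int))
        self.precomputed = {}                      # member_id -> ranked genres

    def publish(self, member_id, event_type, genre):
        """Called by the online layer; returns immediately (asynchronous)."""
        self.queue.append((member_id, event_type, genre))

    def drain(self):
        """Process pending events, then recompute results only for the
        members whose state changed (incremental, user-oriented work)."""
        touched = set()
        while self.queue:
            member_id, event_type, genre = self.queue.popleft()
            if event_type == "play":
                self.genre_counts[member_id][genre] += 1
                touched.add(member_id)
        for member_id in touched:
            counts = self.genre_counts[member_id]
            # Most-played genre first; ties broken alphabetically.
            self.precomputed[member_id] = sorted(
                counts, key=lambda g: (-counts[g], g))
        return touched
```

The delay between `publish` and `drain` corresponds to the "has some delay" drawback listed on the slide, and the recomputation only sees what the event stream carries.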
34
Where to place components?
● Example: Matrix Factorization
● Offline:
  ○ Collect sample of play data
  ○ Run batch learning algorithm to produce factorization
  ○ Publish item factors
● Nearline:
  ○ Solve user factors
  ○ Compute user-item products
  ○ Combine
● Online:
  ○ Presentation-context filtering
  ○ Serve recommendations

[Architecture diagram annotated with the example: $X$ near Offline Data; $X \approx UV^{T}$ at Model training; $V$ at Models; $A u_i = b$ and $s_{ij} = u_i v_j$ at Nearline Computation; $s_{ij}$ passed on toward the online layer; $s_{ij} > t$ near the Algorithm Service.]
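
The nearline and online steps of the matrix factorization example can be sketched in plain Python. Several things here are assumptions: the item factors are hand-picked to stand in for what an offline job would publish, the user factors are solved with a ridge-regularized least-squares system (one common way to obtain a system of the form A u = b; the talk does not say how A and b are built), and the function names and threshold are illustrative only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def solve_user_factors(item_factors, plays, reg=0.1):
    """Nearline: build A = sum_j v_j v_j^T + reg*I and b = sum_j x_j v_j
    over the items the member played, then solve A u = b."""
    k = len(next(iter(item_factors.values())))
    A = [[reg if r == c else 0.0 for c in range(k)] for r in range(k)]
    b = [0.0] * k
    for item, x in plays.items():
        v = item_factors[item]
        for r in range(k):
            b[r] += x * v[r]
            for c in range(k):
                A[r][c] += v[r] * v[c]
    return solve(A, b)


def score_items(user_factors, item_factors):
    """Nearline: s_ij as the inner product of user and item factors."""
    return {item: sum(u * v for u, v in zip(user_factors, vec))
            for item, vec in item_factors.items()}


def serve(scores, already_played, threshold):
    """Online: keep s_ij > t, drop items already played, rank by score."""
    keep = [(s, item) for item, s in scores.items()
            if s > threshold and item not in already_played]
    return [item for s, item in sorted(keep, key=lambda p: (-p[0], p[1]))]
```

In this split the expensive factorization stays offline, the per-member solve runs when new plays arrive, and the online path only applies a threshold and a context filter to scores that were already computed.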

35
Netflix Manhattan
Stan Lanning
● Event-based precomputation framework
● Supports both nearline and offline computation modes
● Customer-centric events and data

[Diagram components: Play Service, Rating Service, …; Event Queue; Event Handlers; Event Rules; Request Queue; Managers; Algorithms; Cached User Data.]
36
Signals & Models
● Similar pattern across layers
● Models
  ○ Parameter files
  ○ Trained offline
● Data
  ○ Previously processed and stored information
● Signals
  ○ Fresh data from live services
  ○ User-related or context-related
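
One way to picture the three kinds of algorithm input (models, data, signals) is a ranking function that takes each as a separate argument. This is a sketch under assumptions: the `Model` and `Signals` classes, the feature names, and the linear scoring rule are hypothetical and only illustrate the separation of inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Parameters trained offline and loaded as a file (assumed shape)."""
    weights: dict


@dataclass
class Signals:
    """Fresh, user- or context-related input from live services."""
    recent_genres: list = field(default_factory=list)


def rank(model, data, signals):
    """Combine stored data, fresh signals, and model weights into a ranking.

    `data` maps item id -> previously processed attributes, for example
    {"popularity": 0.8, "genre": "comedy"} (hypothetical schema).
    """
    w = model.weights

    def score(item_id):
        row = data[item_id]
        s = w.get("popularity", 0.0) * row["popularity"]
        if row["genre"] in signals.recent_genres:
            s += w.get("recent_genre", 0.0)   # boost from a fresh signal
        return s

    return sorted(data, key=lambda i: (-score(i), i))
```

Because the function does not care where it runs, the same call shape can be reused in the offline, nearline, and online layers with different sources for each argument.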

37
Recommendation Results
● Precomputed results
  ○ Fetch from data store
  ○ Post-process in context
● Generated on the fly
  ○ Collect signals, apply model
● Combination
  ○ Dynamically choose
  ○ Fallbacks
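
The combination of on-the-fly results, precomputed results, and fallbacks can be sketched as a small serving function with a time budget. The function name `recommend`, the budget value, and the shared thread pool are assumptions for illustration, not the talk's implementation. Only the standard-library `concurrent.futures` API is used.

```python
import concurrent.futures as cf

# Shared worker pool for the online computation (size is an arbitrary choice).
_pool = cf.ThreadPoolExecutor(max_workers=4)


def recommend(member_id, online_fn, precomputed, default, budget_s=0.2):
    """Return (recommendations, source).

    1. Try the online computation, but only wait `budget_s` seconds.
    2. If it is too slow or raises, fall back to precomputed results.
    3. If none are stored for this member, fall back to a default list.
    """
    future = _pool.submit(online_fn, member_id)
    try:
        return future.result(timeout=budget_s), "online"
    except Exception:
        # Timeout or failure in the online path. A real service would
        # distinguish and log these cases; this sketch just falls through.
        future.cancel()
    if member_id in precomputed:
        return precomputed[member_id], "precomputed"
    return default, "default"
```

The time budget reflects the strict response-time requirement of the online layer: a late answer is treated the same as a failed one, and the member still gets a page.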

38
Conclusions
39
Research Directions

Personalized learning to rank
Context awareness
Presentation effects
Social recommendation
Full-page optimization
Cold start

40
Take Aways
● Behind-the-scenes peek at a real-world, industrial-scale recommender system
● Recommendation is not just ratings
● Scaling is not only about batch, offline algorithms
● Use application domain advantages

41
We’re hiring

Thank You

Justin Basilico
42
@JustinBasilico

Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 

Recommendation at Netflix Scale

  • 1. Large-Scale Recommendation Systems Workshop, RecSys 2013, Hong Kong. Recommendation at Netflix Scale. Justin Basilico, Netflix Algorithm Engineering. October 13, 2013.
  • 7. Goal: Help members find content to watch and enjoy to maximize member satisfaction and retention.
  • 8. Everything is a Recommendation: rows and ranking. Over 75% of what people watch comes from our recommendations.
  • 9. Top 10: Our best guess. Personalization awareness and diversity. [Figure: Top 10 row with titles labeled by household member: All, Dad, Dad&Mom, Daughter, Son, Mom.]
  • 11. Genre Personalization: Personalized genre rows focus on user interest and also provide context and “evidence”. How are they generated? Implicit: based on user’s recent plays, ratings, and other interactions. Explicit: taste preferences. Hybrid: combine the above. Also take into account freshness (has this been shown before?) and diversity (avoid repeating tags and genres, limit number of TV genres, etc.).
  • 12. Similars: Displayed in many contexts: the video display page; in response to user actions (search, queue add, …); “Because you watched” rows.
  • 14. EVERYTHING is a Recommendation.
  • 17. Netflix Data: > 37M members; > 40 countries; > 1000 device types; ratings: > 4M/day; searches: > 3M/day; plays: > 30M/day; 1B hours in June 2012; > 4B hours in Q1 2013; log 100B events/day; 32.25% of peak US downstream traffic.
  • 18. Plays: What people watch. The most important source of data for our algorithms. A few plays are usually more valuable than most of our other data. We have a lot of information associated with a play: duration; start/stop/pause/rewind; device, location, time, …; page context; …
  • 19. Ratings: Explicit information about a member’s taste should be great, but we find ratings are noisy, sparse, and biased. Quality of our ratings has decreased over time.
  • 20. Metadata: Our tag space is made of thousands of different concepts, manually annotated by a set of experts. Although an automatic approach may be possible, we believe it would be of lesser quality; however, we are researching automatic annotation of scenes, transitions… Metadata is useful, especially for cold start.
  • 21. Social: Can your “friends’” interests help us predict yours better? The answer is similar to the Metadata case: if we know enough about you, social information becomes less useful, but it is very interesting for cold-starting. Social support for recommendations has been shown to matter.
  • 22. Affordances: highly curated catalog; catalog changes daily; videos have long shelf-lives; videos take time to consume.
  • 23. Smart Models: logistic/linear regression; elastic nets; SVD and other matrix factorizations; Restricted Boltzmann Machines; deep networks; factorization machines; Markov chains; different clustering approaches; Latent Dirichlet Allocation; gradient boosted decision trees/random forests; …
  • 24. Offline/Online testing process: offline testing (days), on [success] online A/B testing (weeks to months), on [success] roll out the feature to all users; each test stage also has a [fail] branch.
  • 26. Design Considerations. Recommendations: personal, accurate, novel, diverse, fresh. Systems: scalable, responsive, resilient, efficient, flexible.
  • 28. Cloud Computing at Netflix: layered services; clusters: horizontal scaling; auto-scale with demand; plan for failure; replication; fail fast; state is bad; Simian Army: induce failures to ensure resiliency.
  • 29. System Overview: Blueprint for multiple personalization algorithm services (ranking, row selection, ratings, …). Recommendation involving multi-layered machine learning. [Figure: system architecture diagram with OFFLINE, NEARLINE, and ONLINE layers; labeled components: Netflix.Hermes, query results, offline data, model training, offline computation, models, nearline computation, Netflix.Manhattan, user event queue, event distribution, algorithm service, online data service, online computation, machine learning algorithm, UI client, member; labeled flows: “Play, Rate, Browse...” and “Recommendations”.]
  • 30. Event & Data Distribution: Collect actions (plays, browsing, searches, ratings, etc.). Events: small units; time sensitive. Data: dense information; processed for further use; saved. [Figure: system architecture diagram, with detail of the UI client, user event queue, and event distribution.]
  • 31. Computation Layers: Offline: process data. Nearline: process events. Online: process requests. [Figure: system architecture diagram.]
  • 32. Online Computation: Synchronous computation in response to a member request. Pros: access to most fresh data; knowledge of full request context; compute only what is necessary. Cons: strict Service Level Agreements (must respond quickly … in all cases; requires high availability); limited view of data. Good for: simple algorithms; model application; business logic; context-dependence; interactivity. [Figure: system architecture diagram, with detail of the online components serving www.netflix.com.]
  • 33. Offline Computation: Asynchronous computation done on a regular schedule. Pros: can handle large data; can do bulk processing; relaxed time constraints. Cons: cannot react quickly; results can become stale. Good for: batch learning; model training; complex algorithms; precomputing. [Figure: system architecture diagram, with detail of the offline components.]
  • 34. Nearline Computation: Asynchronous computation in response to a member event. Pros: can keep data fresh; can run moderate complexity algorithms; can average computational cost across users; change from actions. Cons: has some delay; done in event context. Good for: incremental learning; user-oriented algorithms; moderate complexity algorithms; keeping precomputed results fresh. [Figure: system architecture diagram, with detail of the nearline components.]
  • 35. Where to place components? Example: Matrix Factorization. Offline: collect sample of play data; run batch learning algorithm to produce factorization; publish item factors. Nearline: solve user factors; compute user-item products; combine. Online: presentation-context filtering; serve recommendations. [Figure: system architecture diagram annotated with X, X ≈ UV^T, V, A u_i = b, s_ij = u_i v_j, and s_ij > t.]
  • 36. Netflix Manhattan. Stan Lanning. Event-based precomputation framework; supports both nearline and offline computation modes; customer-centric events and data. [Figure: Play Service, Rating Service, … feeding an Event Queue; Event Handlers; Request Queue; Manager components (labels include Event and Rules); Algorithms; Cached User Data.]
  • 37. Signals & Models: Similar pattern across layers. Models: parameter files; trained offline. Data: previously processed and stored information. Signals: fresh data from live services; user-related or context-related. [Figure: system architecture diagram, with detail showing models, offline data, and signals (online service) feeding the machine learning algorithm in offline, nearline, and online computation.]
  • 38. Recommendation Results: Precomputed results: fetch from data store; post-process in context. Generated on the fly: collect signals, apply model. Combination: dynamically choose; fallbacks. [Figure: system architecture diagram, with detail of the algorithm service, online computation, UI client, and member.]
  • 40. Research Directions: personalized learning to rank; context awareness; presentation effects; social recommendation; full-page optimization; cold start.
  • 41. Take Aways: Behind-the-scenes peek at a real-world, industrial-scale recommender system. Recommendation is not just ratings. Scaling is not only about batch, offline algorithms. Use application domain advantages.
  • 42. We’re hiring. Thank You. Justin Basilico, @JustinBasilico.
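
The matrix factorization placement on slide 35 can be illustrated with a small sketch. This is not Netflix's implementation: the function names, the toy item factors, and the ridge-regression reading of "A u_i = b" (A = V_p^T V_p + reg·I and b = V_p^T 1 over the items a member played) are assumptions made for illustration. The offline step is represented only by its published output, the item factors V.

```python
# Sketch only: names, data, and the ridge formulation are illustrative assumptions.

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def solve_user_factors(item_factors, plays, reg=0.1):
    """Nearline: solve A u_i = b for one member, triggered by a play event."""
    k = len(next(iter(item_factors.values())))
    A = [[reg if r == c else 0.0 for c in range(k)] for r in range(k)]
    b = [0.0] * k
    for item in plays:
        v = item_factors[item]
        for r in range(k):
            b[r] += v[r]  # each played item contributes a target value of 1.0
            for c in range(k):
                A[r][c] += v[r] * v[c]
    return solve_linear(A, b)


def score_items(user_factors, item_factors):
    """Nearline: compute s_ij = u_i . v_j for every item j."""
    return {j: sum(u * x for u, x in zip(user_factors, v))
            for j, v in item_factors.items()}


def serve(scores, already_played, threshold):
    """Online: keep items with s_ij > t, drop played items, rank by score."""
    keep = [(s, j) for j, s in scores.items()
            if s > threshold and j not in already_played]
    return [j for s, j in sorted(keep, reverse=True)]


# Offline output, assumed already trained and published: item factors V.
V = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.8, 0.3]}
u = solve_user_factors(V, plays={"a"})
recs = serve(score_items(u, V), already_played={"a"}, threshold=0.5)
print(recs)
```

The split mirrors the slide: the expensive factorization runs in batch, the per-member solve is cheap enough to rerun whenever a play event arrives, and the online step only filters and ranks scores that already exist.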

Editor's Notes

  1. http://www.businessweek.com/articles/2013-05-09/netflix-reed-hastings-survive-missteps-to-join-silicon-valleys-elite#p5
  2. http://techblog.netflix.com/2013/03/system-architectures-for.html