Business Consulting Tech Talk
29th April 2021
www.abzooba.co
Netflix Recommender System - A Big Data Case Study
• Netflix Introduction
• Overview of Netflix Recommender System
• Data sources, size and challenges
• Why was this a Big Data problem?
• How did Netflix solve the Big Data problem?
• Technology Stack
• System Architecture & Computation
• Value added
NETFLIX?
• DVD
• Streaming service
• Streaming through a subscription model
• Wide variety of content
Overview and Key Statistics:
• Netflix 2020 revenue : $25 billion
• Users : 21.5m (Q3 2011) to 203.6m (Q4 2020)
• India users : 2.4m (Q2 2020)
paying subscriber count
Overview of Netflix Recommender System
Personalization
• Recommendation
• Titles
• Awareness
• Adapting to user's preferences
• Similarities
• Similar suggestions
Data Source
• Ratings
• Stream Data
• Queues
• Metadata
• Social Data
• External Data
• Demographics
Data Size
• ‘Cinematch’
• Netflix Prize dataset : 100,480,507 ratings that 480,189 users gave to 17,770 movies.
• Production dataset : 5 billion ratings
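For a sense of scale, the Netflix Prize figures quoted on this slide imply a very sparse user-by-movie matrix. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and not taken from the talk.

```python
# Figures quoted on the slide for the Netflix Prize dataset.
n_ratings = 100_480_507
n_users = 480_189
n_movies = 17_770

# Every (user, movie) pair is a rating that could exist.
possible_pairs = n_users * n_movies

# Share of those pairs that actually carry a rating.
density = n_ratings / possible_pairs

ratings_per_user = n_ratings / n_users
ratings_per_movie = n_ratings / n_movies

print(f"possible user/movie pairs: {possible_pairs:,}")
print(f"matrix density: {density:.2%}")
print(f"average ratings per user: {ratings_per_user:.0f}")
print(f"average ratings per movie: {ratings_per_movie:.0f}")
```

Only a little over 1% of the possible user/movie pairs carry a rating, which is one reason recommendation at this scale is treated as a sparse-data problem.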
Data Issues
Privacy Issues:
• 2007 - Researchers at the University of Texas at Austin were able to identify users in the anonymized Netflix dataset by matching their ratings against the Internet Movie Database.
• 2009 - Lawsuit against Netflix
• 2010 - Netflix canceled the planned follow-up competition.
Why is this a Big Data problem?
• Volume
• Velocity
• Variety
• Veracity
Volume:
• US Titles - 3,600 movies and 1,800 shows as of February 2021
• Migration to Cloud
• Videos : 105TB
• Dataset : 5b ratings
Velocity:
• Usage Statistics
• 2 million hours
Veracity:
• Anomalies
• Netflix Prize Challenge
Variety:
• Data Format
• Thumbnails
Technology Stack
System Architecture
Computation Layers:
• Online
Process requests
• Offline
Process data
• Nearline
Process events
Online Computation
• Synchronous computation in response to a member request
• Fresh Data
• Compute only what is necessary
• Good for:
• Simple Algorithms
• Interactivity
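As an illustration of the online layer, the sketch below re-ranks a small candidate list synchronously while the member waits. It is a toy under stated assumptions, not Netflix's implementation: the function name `rank_online`, the score dictionary and the title IDs are all hypothetical.

```python
from typing import Dict, List, Set


def rank_online(candidates: List[str],
                precomputed_scores: Dict[str, float],
                already_watched: Set[str],
                top_k: int = 3) -> List[str]:
    """Cheap, synchronous re-ranking done in response to one request."""
    # Fresh data: drop titles the member has already watched.
    fresh = [t for t in candidates if t not in already_watched]
    # Simple algorithm: sort by a score that was computed earlier
    # (for example by an offline job); unknown titles score 0.0.
    fresh.sort(key=lambda t: precomputed_scores.get(t, 0.0), reverse=True)
    # Compute only what is necessary: return just the rows to display.
    return fresh[:top_k]


# Hypothetical scores and titles for demonstration only.
scores = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.2}
result = rank_online(["A", "B", "C", "D"], scores,
                     already_watched={"A"}, top_k=2)
print(result)
```

The heavy work (producing the scores) is assumed to happen elsewhere, so the request path stays fast enough for interactive use.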
Offline Computation
• Asynchronous computation done on a regular schedule
• Large Data
• Bulk processing
• Good for:
• Batch learning
• Model training
• Complex algorithms
• Precomputing
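A scheduled offline job might, for example, precompute title-to-title similarities from the full ratings log. The sketch below uses plain cosine similarity on a toy list of ratings; the function name, the data and the choice of cosine similarity are assumptions for illustration, not the method described in the talk.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Batch job: cosine similarity between every pair of titles.

    `ratings` is an iterable of (user, title, stars) tuples.
    """
    # Group ratings by title: title -> {user: stars}.
    by_item = defaultdict(dict)
    for user, item, stars in ratings:
        by_item[item][user] = stars

    # Length of each title's rating vector.
    norms = {i: sqrt(sum(s * s for s in r.values()))
             for i, r in by_item.items()}

    sims = {}
    items = sorted(by_item)
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:
            # Only users who rated both titles contribute to the dot product.
            shared = by_item[a].keys() & by_item[b].keys()
            dot = sum(by_item[a][u] * by_item[b][u] for u in shared)
            sims[(a, b)] = dot / (norms[a] * norms[b]) if dot else 0.0
    return sims


# Toy data: X and Y are rated identically, Z shares no raters with them.
toy = [("u1", "X", 5), ("u1", "Y", 5),
       ("u2", "X", 4), ("u2", "Y", 4),
       ("u3", "Z", 3)]
sims = item_similarities(toy)
print(sims)
```

The all-pairs loop is quadratic in the number of titles, which is acceptable for a batch job but is exactly the kind of work that should not sit on the request path.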
Nearline Computation
• Asynchronous computation in response to a member event
• Fresh Data
• Average computation
• Change from actions
• Good for:
• Incremental learning
• Model training
• User-oriented algorithms
• Keeping precomputed results
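The nearline idea of incremental learning can be sketched as an event handler that updates a stored statistic without re-reading the full history. The class below keeps a running average rating per title; the class and method names are hypothetical and the example is a simplification, not the system described in the talk.

```python
class RunningTitleStats:
    """Keeps per-title rating averages fresh, one event at a time."""

    def __init__(self):
        self.count = {}  # title -> number of ratings seen so far
        self.mean = {}   # title -> running average rating

    def on_rating_event(self, title, stars):
        """Handle one member event asynchronously from the request path."""
        n = self.count.get(title, 0) + 1
        old_mean = self.mean.get(title, 0.0)
        self.count[title] = n
        # Incremental mean update: no need to store past ratings.
        self.mean[title] = old_mean + (stars - old_mean) / n
        return self.mean[title]


stats = RunningTitleStats()
for stars in (4, 5, 3):
    stats.on_rating_event("X", stars)
print(stats.count["X"], stats.mean["X"])
```

Each event costs constant time, so the precomputed value stays current between offline batch runs.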
Value generated
• User Engagement Rate
• Money Saved: $1 billion a year
• Content: 75%
• Member Satisfaction
• Specific actions taken
• Code Revamp: 5 billion ratings
• Scale: 100 million to 5 billion ratings
Thank you!
What did Netflix want to achieve?
• What questions did they want to answer?
• Data Sources, Size?
• Major challenge encountered?
Key Project Features
• Team/people involved
• Project duration
• Value added to the organization as a result
• Actions taken as a result of the project
Team:
• 800 Netflix Engineers in Silicon Valley HQ
"BellKor’s Pragmatic Chaos" consisted of:
• Andreas Töscher
• Michael Jahrer (BigChaos)
• Robert Bell
• Chris Volinsky (AT&T)
• Yehuda Koren (Yahoo)
• Martin Piotte
• Martin Chabbert (Pragmatic Theory)
Duration:
• 2000 Hours
• 107 algorithms
Appendix
Stakeholders
• Primary stakeholders : subscribers and viewers
• Secondary stakeholders : research team of Netflix
Competitors
• Amazon Prime Video
• Hulu
• Disney+
• Sony
• HBO Max
• Television Channels
• Cinema
• Piracy
References
• Vanderbilt, T. (2018, June 22). The Science Behind the Netflix Algorithms That Decide What You’ll Watch Next. Retrieved April 12, 2020.
• Maddodi, S., & K, K. P. (2019). Netflix Bigdata Analytics- The Emergence of Data Driven Recommendation. SSRN Electronic Journal. doi: 10.2139/ssrn.3473148
• Brodkin, J. (2016, February 11). Netflix finishes its massive migration to the Amazon cloud. Retrieved April 12, 2020, from https://arstechnica.com/information-technology/2016/02/netflix-finishes-its-massive-migration-to-the-amazon-cloud/
• Netflix. (n.d.). How Netflix’s Recommendations System Works. Retrieved April 12, 2020, from https://help.netflix.com/en/node/100639
