Netflix


  1. The Cinematch System: Operation, Scale, Coverage, Accuracy, Impact (Jim Bennett, 9/13/06)
  2. What Is Netflix?
     • “Connecting people to the movies they love”
     • Online DVD movie rental:
       – Users subscribe for a fixed fee per month
         • Plans define #movies out at once, #turns in a month
       – Find, then queue up movies on website
       – USPS delivers DVDs within 1 business day most areas
       – Keep as long as you want; no late fees
       – Return in pre-paid mailer when done
       – Next DVD on your queue sent automatically
     • Working on movie delivery over the net
     • Choice of 65,000 titles…which ones?
  3. Give Ratings, Get Recommendations
  4. Show Interest, Get Recommendations
  5. Netflix and Cinematch Scale
     • 5M active customers
       – Ship 1.4M disks per day from 40 locations
     • 1.4B ratings since 1997
       – 2M ratings per day
       – 1B predictions per day
     • Item-to-item analysis with many data-conditioning heuristics
     • 2 days to retrain on new ratings
     • Manual item setup for “coldstart” titles
       – Automatically retired
  6. Cinematch Operation
  7. Ratings distribution (chart; annotations: “Netflix starts DVD rentals”, “Wizard of Oz”, “Gone with the Wind”)
  8. Ratings distribution (chart; eras marked: Silent, B&W, Color)
  9. Predictive Coverage (chart; x-axis: Year, 1908 to 2001; y-axis: 0 to 8000; series: Total, Predictees; annotation: “20K predictees (30%)”)
  10. Predictable Films by Genre (chart; scale: 0 to 6000; series: Total, Popular, Predictable; genres: Music & Musicals, Foreign, Drama, Documentary, Children & Family, Comedy, Television, Classics, Sports, Action & Adventure, Horror, Special Interest, Thrillers, Anime & Animation, Sci-Fi & Fantasy, Romance, Independent, Gay & Lesbian)
      * Popular = top 10K by ratings
  11. Climbing Mount Predictable (chart: Predictable movies; x-axis: # user ratings, 0 to 10000; y-axis: # movies, 0 to 9000; labels: Shooting stars, 4 and 5 stars, Predictably bad (<3), Predictable)
  12. Prediction Accuracy (chart: Error as user ratings increase; x-axis buckets: <=5, <=10, <=20, <=50, <=100, <=200, <=300, <=500, >500; y-axis: +/- Stars, -0.2 to 1.4; series: RMSE, MAE, Bias)
  13. Error by Confidence (chart: Error as confidence increases; x-axis: Average, 0, 1, 2, 3; y-axis: +/- Stars, -0.2 to 1.2; series: RMSE, MAE, Bias)
  14. Does It Matter?
      • Absolutely critical to retaining users
        – As CM has improved and RMSE has fallen, the percentage of 4-5 star movies rented has increased
      • Important to users:
        – There are only so many new releases
        – Help jog memories about movies to see
        – CM reflects the collective memory of good movies
  15. Does It Matter? (chart; labels: Cinematch-based, User)
  16. What’s Next?
      • Anticipate scale of 20M subscribers in 2010-2012
        – Nearly 10B ratings, 10M/day
        – 5B predictions/day
      • Improved learning algorithms
        – Improve coverage, accuracy and learning speed
      • Help the non-rater
      • Explore getting movie tastes beyond ratings
      • Encode traits of movies that predict emotional response
      • Motivate a user to take an unknown but likely great movie
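
The Prediction Accuracy and Error by Confidence slides plot three error measures in stars: RMSE, MAE, and Bias. The deck does not define them, so the sketch below uses the standard textbook definitions and assumes bias means the average signed error (predicted minus actual). The function name `error_metrics` and the sample ratings are made up for illustration and are not taken from Cinematch.

```python
import math


def error_metrics(predicted, actual):
    """Return (rmse, mae, bias) for paired predicted and actual star ratings.

    Assumption: bias is the mean signed error, predicted minus actual,
    so a positive bias means predictions run high on average.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    bias = sum(errors) / n                            # mean signed error
    return rmse, mae, bias


# Hypothetical ratings, for illustration only.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 3]
rmse, mae, bias = error_metrics(predicted, actual)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  Bias={bias:+.3f}")
```

RMSE squares each error before averaging, so it penalizes a few large misses more heavily than MAE does. Bias can sit near zero even when both of the others are large, because over- and under-predictions cancel.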
