13. 1999 Started work on recommendations (DVDs)
2006-2009 Launched the Netflix Prize: a $1M prize to improve predicted ratings
2007-2011 Transitioned to streaming
2014 300+ people working on content discovery, a $150M-per-year investment.
14. The value of research?
● Business Value
● Consumer <-> Producer
15. How much is 0.1% worth?
83,000,000 members + modest growth
$10 * 12 months
---
~$5-100+ million a year
16. The value of research?
● Business Value
● Consumer <-> Producer
21. If you want to build a ship, don't
drum up people to collect wood and
don't assign them tasks and work, but
rather teach them to long for the
endless immensity of the sea.
Antoine de Saint-Exupéry
Telling stories has always been at the core of human nature. They provide us with a sense of community and let us communicate deeper truths.
Major technological breakthroughs have changed society in fundamental ways, and have allowed us to tell richer stories.
It’s not hard to imagine our ancestors coming together around a campfire to share and tell stories. And you can see how from that desire to share stories, symbolic representation developed into writing.
And then later the printing press,
And then later again the invention of the TV.
Whole new ways to express and understand ourselves through stories were possible.
Today, we’re lucky to be witnessing the changes brought about by the Internet. And like previous technological breakthroughs, the internet is also having a profound impact on how we tell stories.
Netflix lies at the crossroads of technology and entertainment. We’re inventing Internet TV.
In the world of linear TV, the job of the “content programmer” was to select which shows were on. And even with hundreds of cable TV channels, your choice is still limited.
The promise of Internet TV is that we can provide 80 million channels. Because each user is their own channel.
So producing a completely personalized experience is central to everything we do.
ML is used everywhere at Netflix. In fact, 80% of what is played comes from some form of recommendation system.
You’re probably aware that rows such as “Top Picks” are driven by machine learning.
But you might not have realized that most of the other rows,
The hero images at the top of the page,
What information (evidence) we show about a video,
And even how we combine all these elements onto a single page,
Is all driven by machine-learned algorithms that are optimized to provide you with a completely personalized experience.
Data Science has always been a core part of Netflix’s DNA.
History
So you might be wondering: why bother? That’s a large investment. Or you might be wondering how to convince your boss of that.
Well it’s easy to quantify the value to Netflix.
Let’s take some crude numbers.
If we improve retention of our members by one tenth of one percent, how much is that worth?
If we take our 83M members and assume some modest growth.
Then it’s easy to see that even a modest improvement in retention can be worth a lot.
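The back-of-the-envelope arithmetic can be sketched in a few lines. The member count and the $10/month figure come from the slide; the 0.1% retention lift is the hypothetical being valued, and this deliberately ignores growth, churn dynamics, and pricing tiers:

```python
# Back-of-the-envelope value of a 0.1% retention improvement.
# Inputs taken from the slide: 83M members, roughly $10/month.
members = 83_000_000
monthly_fee = 10                      # USD, crude average
annual_revenue_per_member = monthly_fee * 12

retained = members * 0.001            # 0.1% more members retained
value = retained * annual_revenue_per_member
print(f"~${value / 1e6:.0f}M per year")  # prints "~$10M per year"
```

Even this crude model lands around $10M/year, consistent with the slide’s wide "$5-100+ million" range once you vary the assumptions.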
Likewise, there’s also value to the content consumers and the show producers.
This chart is from the 2013 Sandvine report and shows the percentage of US downstream internet traffic spent on various activities. As you can see, Netflix accounts for a third of all US downstream traffic.
Even a modest improvement to our streaming and video encoding can have huge benefits for members in terms of the quality of their experience.
Or I think more profoundly, consider this:
One of the limitations of network TV is that it’s very hard to make niche content work economically.
Even with all the choices of cable TV, most viewing happens within a very limited window: 7-10pm on weeknights. And the vast majority of channels don’t get much viewership. This crunch means that cable and TV networks need to go for the content with the broadest possible audience. Anything else doesn’t work economically.
So if you’ve ever complained about your favorite TV show being cancelled, or about the lack of choice for what to watch tonight; or if you’re a content producer, frustrated that you can’t find anyone to make your show, even though you know you have an audience that wants to hear your story: this is part of it.
In contrast, Internet TV removes this crunch: there’s no restriction on audience size. As long as the economics work, we’re quite happy to have niche content with a small potential audience.
BUT this only works if we do a good job matching content to consumers. Or in other words, of finding that audience.
And I have some nice stories about this later on to share.
You may have heard about the Netflix culture. And in particular Freedom and Responsibility.
If you haven’t. It means that we give our employees a lot of freedom. But we expect big things from everyone who works at Netflix.
For research it means this: we give researchers the flexibility to develop and try out their own ideas. In fact, we encourage left-field thinking. If you have an idea that you think will turn everything on its head, then great: try it and see what it does.
But the catch is: We hold you responsible for your results.
Now in research that doesn’t necessarily mean rolling out new algorithms to production. Although improving the product is the end goal. Most of what we try doesn’t work, and that’s absolutely fine. A high quality test is one that maximizes our learnings. And that is the standard we hold each other to.
So you might be wondering whether this is complete chaos in practice. Well, there is a trick to it.
We only hire senior people, and we actively select during hiring for people who are self-motivated, self-directed, and in possession of good judgment. And once we’ve found them, we pay them top-of-market.
Another principle that guides our approach to research is: Context not Control.
We don’t have centralized planning for what gets researched, and there is minimal process around launching a test.
In many companies you’ll find a hierarchy, where the researchers who have been there since the early days decide what to test, and underneath them is an army of recent PhD grads and interns.
We keep the hierarchy flat at Netflix, and instead expect everyone to exercise good judgement, and make their own decisions.
But obviously we need to be aligned, and we need to provide a lot of context so that individuals can make good decisions.
And as a manager that’s where I spend a lot of my time. Not on controlling what my folks do, but on connecting the dots for them.
This all comes together and results in us testing around 500 algorithms a year.
Many algorithms look promising in offline metrics, but the online results don’t pan out. Often the only way to really see whether they work is to test them online.
We test against core business metrics, such as member retention and how much people are watching. This keeps our research grounded. If we get a win, it really does mean we’ve improved the business, and provided a better service to our members too.
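As an illustration of what comparing a retention metric across test cells can look like, here is a minimal two-proportion z-test sketch. The cell sizes and retention rates are invented for the example, and this is a generic statistical recipe, not a description of Netflix’s actual testing methodology:

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative cells: 100k members each, 95.0% vs 95.3% retained.
z, p = two_proportion_z(95_000, 100_000, 95_300, 100_000)
```

With cells this large, even a 0.3-point retention difference comes out clearly significant, which is one reason small retention movements are detectable, and valuable, at this scale.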
Or looking at it geometrically.
If we take the simplex formed by distributions over movies,
each topic is a point somewhere on this simplex, since a topic is itself a distribution over movies.
What our model is saying is that each user can be represented as a convex combination of these topics.
If you throw in a non-negativity constraint, and normalize the user and movie vectors, you can see the connection to matrix factorization.
http://mathurl.com/jb8dj9m
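The connection can be made concrete with a small NumPy sketch: topics are rows on the movie simplex, users are convex combinations of topics, and the product is a non-negative matrix factorization. The dimensions and random data here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, n_topics = 5, 8, 3

# Each topic is a distribution over movies: a point on the simplex.
H = rng.random((n_topics, n_movies))
H /= H.sum(axis=1, keepdims=True)      # rows sum to 1

# Each user is a convex combination of topics.
W = rng.random((n_users, n_topics))
W /= W.sum(axis=1, keepdims=True)      # rows sum to 1

# The model's user-movie affinities form a non-negative
# matrix factorization: R ~ W @ H.
R = W @ H

# A convex combination of simplex points stays on the simplex,
# so each user's implied movie distribution is also a distribution.
assert np.allclose(R.sum(axis=1), 1.0)
assert (R >= 0).all()
```

With both factors non-negative and row-normalized, this is exactly the topic-model reading of non-negative matrix factorization that the slide is gesturing at.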
Just to remind ourselves what we’re actually recommending:
A list of genres, top to bottom: an ordering. We need to pick which genre goes in each position.
Most recommender systems are conceptualized as recommending a single item.
Where each item is recommended independently of the others, with no interaction between them.
In the real world, though, most recommendations are presented together, as part of a list or a page.
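To make the contrast concrete, here is a toy sketch of the two views: ranking items independently by score, versus greedily building a list where items interact. The items, scores, genres, and the redundancy penalty are all invented for illustration; this is one simple way to model interactions, not the method the talk describes:

```python
# Toy contrast: independent top-k vs a greedy list builder that
# penalizes repeating a genre already placed in the list.
items = {
    "A": (0.9, "thriller"),
    "B": (0.8, "thriller"),
    "C": (0.7, "comedy"),
    "D": (0.6, "documentary"),
}

def top_k_independent(k):
    """Each item judged on its own score; no interactions."""
    return sorted(items, key=lambda i: items[i][0], reverse=True)[:k]

def top_k_greedy(k, penalty=0.3):
    """Build the list one slot at a time; repeated genres lose points."""
    chosen, used_genres = [], set()
    candidates = set(items)
    while len(chosen) < k and candidates:
        def adjusted(i):
            score, genre = items[i]
            return score - (penalty if genre in used_genres else 0)
        best = max(candidates, key=adjusted)
        chosen.append(best)
        used_genres.add(items[best][1])
        candidates.remove(best)
    return chosen

print(top_k_independent(3))  # ['A', 'B', 'C'] - two thrillers up top
print(top_k_greedy(3))       # ['A', 'C', 'D'] - a more varied list
```

The independent view stacks two thrillers at the top; once items are allowed to interact, the second thriller is demoted and the list becomes more varied, which is the point of reasoning about the whole list rather than one item at a time.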
Kuwait
From Kuwait, and we can see that it’s found a global audience.