13. 1999 Started work on recommendations (DVDs)
2006-2009 Launched the Netflix Prize: a $1M prize to improve predicted ratings
2007-2011 Transitioned to streaming
2014 300+ people working on content discovery, a $150M-per-year investment.
14. The value of research?
● Business Value
● Consumer <-> Producer
15. How much is 0.1% worth?
83,000,000 members + modest growth
$10 * 12 months
---
~$5-100+ million a year
16. The value of research?
● Business Value
● Consumer <-> Producer
21. If you want to build a ship, don't
drum up people to collect wood and
don't assign them tasks and work, but
rather teach them to long for the
endless immensity of the sea.
Antoine de Saint-Exupéry
Telling stories has always been at the core of human nature. They provide us with a sense of community and let us communicate deeper truths.
Major technological breakthroughs have changed society in fundamental ways, and have allowed us to tell richer stories.
It’s not hard to imagine our ancestors coming together around a campfire to share and tell stories. And you can see how from that desire to share stories, symbolic representation developed into writing.
And then later the printing press,
And then later again the invention of the TV.
Whole new ways to express and understand ourselves through stories were possible.
Today, we’re lucky to be witnessing the changes brought about by the Internet. And like previous technological breakthroughs, the internet is also having a profound impact on how we tell stories.
Netflix lies at the crossroads of technology and entertainment. We’re inventing Internet TV.
In the world of linear TV, the job of the “content programmer” was to select which shows were on. And even with hundreds of cable TV channels, your choice is still limited.
The promise of Internet TV is that we can provide 80 million channels. Because each user is their own channel.
So producing a completely personalized experience is central to everything we do.
ML is used everywhere at Netflix. In fact, 80% of what is played comes from some form of recommendation system.
You’re probably aware that rows such as “Top Picks” are driven by machine learning.
But you might not have realized that most of the other rows,
The hero images at the top of the page,
What information (evidence) we show about a video,
And even how we combine all these elements onto a single page,
Is all driven by machine-learned algorithms that are optimized to provide you with a completely personalized experience.
Data Science has always been a core part of Netflix’s DNA.
History
So you might be wondering: why bother? That’s a large investment. Or you might be wondering how to convince your boss of that.
Well it’s easy to quantify the value to Netflix.
Let’s take some crude numbers.
If we improve retention of our members by one tenth of one percent, how much is that worth?
If we take our 83M members and assume some modest growth.
Then it’s easy to see that even a modest improvement in retention can be worth a lot.
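The back-of-the-envelope arithmetic can be sketched in a few lines. The member count and the $10/month figure come from the slide; the 0.1% retention lift is the hypothetical being valued, and this deliberately ignores growth, churn dynamics, and pricing tiers:

```python
# Back-of-the-envelope value of a 0.1% retention improvement.
# Inputs taken from the slide: 83M members, roughly $10/month.
members = 83_000_000
monthly_fee = 10                      # USD, crude average
annual_revenue_per_member = monthly_fee * 12

retained = members * 0.001            # 0.1% more members retained
value = retained * annual_revenue_per_member
print(f"~${value / 1e6:.0f}M per year")  # prints "~$10M per year"
```

Even this crude model lands around $10M/year, consistent with the slide’s wide "$5-100+ million" range once you vary the assumptions.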
Likewise, there’s also value to the content consumers and the show producers.
This chart is from the 2013 Sandvine report and shows the percentage of US downstream internet traffic spent on various activities. As you can see, Netflix accounts for a third of all US downstream traffic.
Even a modest improvement to our streaming and video encoding can have huge benefits for members in terms of the quality of their experience.
Or I think more profoundly, consider this:
One of the limitations of network TV is that it’s very hard to make niche content work economically.
Even with all the choices of cable TV, most viewing happens within a very limited window: 7-10pm on weeknights. And the vast majority of channels don’t get much viewership. This crunch means that cable and TV networks need to go for the content with the broadest possible audience. Anything else doesn’t work economically.
So if you’ve ever complained about your favorite TV show being cancelled, or about the lack of choice for what to watch tonight; or if you’re a content producer, frustrated that you can’t find anyone to make your show, even though you know you have an audience that wants to hear your story: this is part of it.
In contrast, Internet TV removes this crunch: there’s no restriction on audience size. As long as the economics work, we’re quite happy to have niche content with a small potential audience.
BUT this only works if we do a good job matching content to consumers. Or in other words, of finding that audience.
And I have some nice stories about this later on to share.
You may have heard about the Netflix culture. And in particular Freedom and Responsibility.
If you haven’t. It means that we give our employees a lot of freedom. But we expect big things from everyone who works at Netflix.
For research it means this: we give researchers the flexibility to develop and try out their own ideas. In fact, we encourage left-field thinking. If you have an idea that you think will turn everything on its head, then great: try it and see what it does.
But the catch is: We hold you responsible for your results.
Now in research that doesn’t necessarily mean rolling out new algorithms to production. Although improving the product is the end goal. Most of what we try doesn’t work, and that’s absolutely fine. A high quality test is one that maximizes our learnings. And that is the standard we hold each other to.
So you might be wondering whether this is complete chaos in practice. Well, there is a trick to it.
We only hire senior people, and we actively select during hiring for people who are self-motivated, self-directed, and in possession of good judgment. And once we’ve found them, we pay them top-of-market.
Another principle that guides our approach to research is: Context not Control.
We don’t have centralized planning for what gets researched, and there is minimal process around launching a test.
In many companies you’ll find a hierarchy, where the researchers who have been there since the early days decide what to test, and underneath them is an army of recent PhD grads and interns.
We keep the hierarchy flat at Netflix, and instead expect everyone to exercise good judgement, and make their own decisions.
But obviously we need to be aligned, and we need to provide a lot of context so that individuals can make good decisions.
And as a manager that’s where I spend a lot of my time. Not on controlling what my folks do, but on connecting the dots for them.
This all comes together and results in us testing around 500 algorithms a year.
Many algorithms look promising in offline metrics, but the online results don’t pan out. Often the only way to really see whether they work is to test them online.
We test against core business metrics, such as member retention and how much people are watching. This keeps our research grounded. If we get a win, it really does mean we’ve improved the business, and provided a better service to our members too.
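As an illustration of what comparing a retention metric across test cells can look like, here is a minimal two-proportion z-test sketch. The cell sizes and retention rates are invented for the example, and this is a generic statistical recipe, not a description of Netflix’s actual testing methodology:

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative cells: 100k members each, 95.0% vs 95.3% retained.
z, p = two_proportion_z(95_000, 100_000, 95_300, 100_000)
```

With cells this large, even a 0.3-point retention difference comes out clearly significant, which is one reason small retention movements are detectable, and valuable, at this scale.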
Or looking at it geometrically.
If we take the simplex formed by distributions over movies,
each topic is a point somewhere on this simplex, since a topic is itself a distribution over movies.
What our model is saying is that each user can be represented as a convex combination of these topics.
If you throw in a non-negativity constraint, and normalize the user and movie vectors, you can see the connection to matrix factorization.
http://mathurl.com/jb8dj9m
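The connection can be made concrete with a small NumPy sketch: topics are rows on the movie simplex, users are convex combinations of topics, and the product is a non-negative matrix factorization. The dimensions and random data here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, n_topics = 5, 8, 3

# Each topic is a distribution over movies: a point on the simplex.
H = rng.random((n_topics, n_movies))
H /= H.sum(axis=1, keepdims=True)      # rows sum to 1

# Each user is a convex combination of topics.
W = rng.random((n_users, n_topics))
W /= W.sum(axis=1, keepdims=True)      # rows sum to 1

# The model's user-movie affinities form a non-negative
# matrix factorization: R ~ W @ H.
R = W @ H

# A convex combination of simplex points stays on the simplex,
# so each user's implied movie distribution is also a distribution.
assert np.allclose(R.sum(axis=1), 1.0)
assert (R >= 0).all()
```

With both factors non-negative and row-normalized, this is exactly the topic-model reading of non-negative matrix factorization that the slide is gesturing at.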
Just to remind ourselves what we’re actually recommending:
A list of genres, top to bottom: an ordering. We need to pick which genre goes in each position.
Most recommender systems are conceptualized as recommending a single item.
Where each item is recommended independently of the others, with no interaction between them.
In the real world, though, most recommendations are presented together, as part of a list or a page.
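To make the contrast concrete, here is a toy sketch of the two views: ranking items independently by score, versus greedily building a list where items interact. The items, scores, genres, and the redundancy penalty are all invented for illustration; this is one simple way to model interactions, not the method the talk describes:

```python
# Toy contrast: independent top-k vs a greedy list builder that
# penalizes repeating a genre already placed in the list.
items = {
    "A": (0.9, "thriller"),
    "B": (0.8, "thriller"),
    "C": (0.7, "comedy"),
    "D": (0.6, "documentary"),
}

def top_k_independent(k):
    """Each item judged on its own score; no interactions."""
    return sorted(items, key=lambda i: items[i][0], reverse=True)[:k]

def top_k_greedy(k, penalty=0.3):
    """Build the list one slot at a time; repeated genres lose points."""
    chosen, used_genres = [], set()
    candidates = set(items)
    while len(chosen) < k and candidates:
        def adjusted(i):
            score, genre = items[i]
            return score - (penalty if genre in used_genres else 0)
        best = max(candidates, key=adjusted)
        chosen.append(best)
        used_genres.add(items[best][1])
        candidates.remove(best)
    return chosen

print(top_k_independent(3))  # ['A', 'B', 'C'] - two thrillers up top
print(top_k_greedy(3))       # ['A', 'C', 'D'] - a more varied list
```

The independent view stacks two thrillers at the top; once items are allowed to interact, the second thriller is demoted and the list becomes more varied, which is the point of reasoning about the whole list rather than one item at a time.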
Kuwait
From Kuwait, and we can see that it’s found a global audience.