This is my presentation from the Business of APIs Conference in SF, held by Mashery (http://www.apiconference.com).
This talk briefly covers the history of the Netflix API, then goes into three main categories of scaling:
1. Using the cloud to scale in size and internationally
2. Using Webkit to scale application development in parallel to the flexibility afforded by the API
3. Redesigning the API to improve performance and to downscale the infrastructure as the system scales
When viewing these slides, please note that they are almost entirely image-based, so I have added notes for each slide to detail the talking points.
29. {"catalog_title": {"id":"http://api.netflix.com/catalog/titles/movies/60034967", "title":{"title_short":"Rosencrantz and Guildenstern Are Dead", "regular":"Rosencrantz and Guildenstern Are Dead"}, "maturity_level":60, "release_year":"1990", "average_rating":3.7, "box_art":{"284pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/ghd/60034967.jpg", "110pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/large/60034967.jpg", "38pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/tiny/60034967.jpg", "64pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/small/60034967.jpg", "150pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/150/60034967.jpg", "88pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/88/60034967.jpg", "124pix_w":"http://cdn-7.nflximg.com/en_US/boxshots/124/60034967.jpg"}, "language":"en", "web_page":"http://www.netflix.com/Movie/Rosencrantz_and_Guildenstern_Are_Dead/60034967", "tiny_url":"http://movi.es/ApUP9"}, "meta":{ "expand":["@directors","@bonus_materials","@cast","@awards","@short_synopsis","@synopsis","@box_art","@screen_formats","@"links":{"id":"http://api.netflix.com/catalog/titles/movies/60034967", "languages_and_audio":"http://api.netflix.com/catalog/titles/movies/60034967/languages_and_audio", "title":"http://api.netflix.com/catalog/titles/movies/60034967/title", "screen_formats":"http://api.netflix.com/catalog/titles/movies/60034967/screen_formats", "cast":"http://api.netflix.com/catalog/titles/movies/60034967/cast", "awards":"http://api.netflix.com/catalog/titles/movies/60034967/awards", "short_synopsis":"http://api.netflix.com/catalog/titles/movies/60034967/short_synopsis", "box_art":"http://api.netflix.com/catalog/titles/movies/60034967/box_art", "synopsis":"http://api.netflix.com/catalog/titles/movies/60034967/synopsis", "directors":"http://api.netflix.com/catalog/titles/movies/60034967/directors", "similars":"http://api.netflix.com/catalog/titles/movies/60034967/similars", 
"format_availability":"http://api.netflix.com/catalog/titles/movies/60034967/format_availability"} }}
30. Improve Efficiency of API Requests Could it have been 100 million requests per day? Or less? (Assuming everything else remained the same)
Editor's Notes
When the Netflix API launched in 2008, it was to “let 1,000 flowers bloom”. It was exclusively a public API.
Some of the many applications produced through the public API…
Then streaming started taking off for Netflix, first with computer-based streaming…
And then streaming devices began to increase over the years. At first, the devices did not draw from the API. Over time, however, newer devices began to consume the API and some of the older devices have been retrofitted. Now, the public developer community is just another consumer of the API.
During the growth of the device strategy and the increase in API adoption, the API slowly became ingrained in the DNA of the engineering culture. Now, the engineering organizational structure reflects this. There are many engineering teams internal to Netflix that produce and manage data and/or algorithmic output. There is a range of engineering teams internal to Netflix that create presentation layers on various devices. The API sits between those two groups, in the critical path for the Netflix streaming service. The API essentially brokers content from inside the firewall to outside.
With the emphasis of the API being in the devices, the public developers now represent less than 1% of the total API traffic.
As a result, the private, device-centric API is the emphasis of the Netflix API program going forward. The public API is still supported, but not the emphasis.
Looking back at this adoption rate, we see a tremendous growth in the API. Over an 18-month span, we have gone from under 1B requests per month to over 1B requests per day. With trendlines that look like this, one of the primary issues is scaling the API.
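To put those two figures on the same scale, here is a rough back-of-envelope sketch. It uses only the round numbers from the note above (1B requests per month before, 1B requests per day after); the 30-day month and the function names are assumptions made for illustration, and the per-second figure is an average that says nothing about peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a round 30-day month


def average_requests_per_second(requests_per_day: float) -> float:
    """Average rate implied by a daily request total (ignores peaks)."""
    return requests_per_day / SECONDS_PER_DAY


def monthly_growth_factor(old_requests_per_month: float,
                          new_requests_per_day: float) -> float:
    """How many times larger the new monthly volume is than the old one."""
    return new_requests_per_day * DAYS_PER_MONTH / old_requests_per_month


if __name__ == "__main__":
    # 1B requests per day, averaged over the day
    print(round(average_requests_per_second(1_000_000_000)))
    # 1B per month growing to 1B per day
    print(monthly_growth_factor(1_000_000_000, 1_000_000_000))
```

Since the note says "under 1B per month" and "over 1B per day", the growth factor computed this way is a lower bound rather than an exact figure.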
And our international expansion will only add complexity and more scaling issues. So, how are we addressing the scale issues?
The cloud! Enables rapid scaling with relative ease. Adding new servers, in new locations, takes minutes.
If our server farm looked like this in 2010, in terms of scale…
We would need a server farm like this to serve the increased API traffic. Ramping up this number of servers takes systems administrators to acquire and image new boxes, power considerations for data centers, etc. Moreover, adding these servers in data centers for expected spikes results in hardware that has been paid for and deployed, but is not being used.
So, instead of going into big server rooms like this one to scale our system…
We go into a web page like this one, which is part of our internal cloud management toolset to handle our EC2 infrastructure.
And as we continue to expand internationally, through EC2, the API can easily scale up in new regions, closer to the customer base that we are trying to serve, as long as Amazon has a location there.
The API has made it much easier to build new apps.
The API provides great ability to quickly build device apps. Cloud infrastructure helps those apps scale with the company. To enable more nimble development of the apps themselves, Netflix used Webkit.
The Netflix Android app is built from the same codebase as the iPhone app. There are key differences, but the iPhone codebase can be leveraged here in ways that a native app's codebase cannot.
We also need to improve the application.
The next phase of improvement is to redesign the API. In essence, while the current API is capable of serving us in the way we need, it is probably no longer the best tool for the job. We believe we can do much better with a new API that is designed for the future of Netflix.
We already talked about the tremendous growth in API requests…
And one billion requests a day sounds great, doesn’t it? For us, this number is a bit concerning…
In the web world, increasing request numbers mean increasing opportunity for ad impressions, which means increasing opportunity for generating revenue. And when you hit certain thresholds in impressions, CPMs start to rise, which means even more money.
That is why some media companies have stories spanning multiple pages, etc.
But for systems that yield output that looks like this…
And this… Ad impressions are not part of the game. As a result, the increase in requests doesn’t translate into more revenue. In fact, it translates into more expenses. That is, handling more requests requires more servers, more systems admins, a potentially different application architecture, etc.
So, we are challenging ourselves to redesign the API to see if those same one billion requests could have been 100 million or perhaps even fewer. Through more targeted API designs based on what we have learned through our metrics, we will be able to reduce our API traffic as Netflix’s overall traffic grows.
As we decrease overall traffic, our server count that currently looks something like this…
Could end up looking more like this. Lower server counts mean reduced costs, simpler implementations, etc.
The ultimate goal, however, is to help our device apps run as fast as possible. And reducing the requests with a less chatty API will improve the overall performance for the devices.
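As a purely illustrative sketch of how a tenfold reduction could arise, consider a client that follows resource links like those in the slide 29 response: one request for a list, one per title, and one per linked sub-resource (cast, synopsis, and so on). The counts below are hypothetical and are not taken from Netflix metrics; the function names are also made up for this example, and the single-request "targeted" design is an assumption about what a less chatty API might look like.

```python
def chatty_request_count(titles: int, subresources_per_title: int) -> int:
    """Requests needed when the client follows every link itself:
    one list request, then one request per title,
    plus one request per sub-resource of each title."""
    return 1 + titles * (1 + subresources_per_title)


def targeted_request_count() -> int:
    """Requests needed if a single call returns everything the screen needs
    (assumed design, for illustration only)."""
    return 1


def reduction_factor(titles: int, subresources_per_title: int) -> float:
    """How many times fewer requests the targeted design makes."""
    return chatty_request_count(titles, subresources_per_title) / targeted_request_count()


if __name__ == "__main__":
    # Hypothetical screen: 3 titles, each needing 2 linked sub-resources
    print(chatty_request_count(3, 2))
    print(reduction_factor(3, 2))
```

With these hypothetical inputs the chatty client makes ten requests where the targeted design makes one, which is the same ratio as one billion to 100 million.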
Which, in turn, will help keep our customers happy.