Maintaining the Front Door to Netflix
Ben Schmaus, @schmaus
APIcon SF - 2014

Talk about the Netflix API and how it serves as the front door for Netflix device UIs. Topics include: API design, resiliency patterns, scalability, and enabling fast dev/deploy cycles.

  1. Maintaining the Front Door to Netflix. Ben Schmaus, @schmaus. APIcon SF - 2014
  2. Global Streaming Video for TV Shows and Movies
  3. More than 48 Million Subscribers. More than 40 Countries
  4. Netflix Accounts for >34% of Peak Downstream Traffic in North America. Netflix subscribers are watching more than 1 billion hours a month
  5. Netflix Accounts for >6% of Peak Upstream Traffic in North America. Netflix subscribers are watching more than 1 billion hours a month
  6. Team Focus: Build the Best Global Streaming Product. Three aspects of the Streaming Product: • Non-Member • Discovery • Streaming
  7. Netflix API: Key Responsibilities • Broker data between services and devices • Provide features and business logic • Maintain a resilient front door • Scale the system • Maintain high velocity
  8. Netflix API: Key Responsibilities • Broker data between services and devices • Provide features and business logic • Maintain a resilient front door • Scale the system • Maintain high velocity
  9. APIs Do Lots of Things!
  10. APIs Do Lots of Things! Data Gathering, Data Formatting, Data Delivery, Security, Authorization, Authentication, System Scaling, Discoverability, Data Consistency, Translations, Throttling, Orchestration. These are some of the many things APIs do.
  11. APIs Do Lots of Things! Data Gathering, Data Formatting, Data Delivery, Security, Authorization, Authentication, System Scaling, Discoverability, Data Consistency, Translations, Throttling, Orchestration. The first three (Data Gathering, Data Formatting, Data Delivery) are at the core. All others ultimately support them.
  12. Definitions • Data Gathering – Retrieving the requested data from one or many local or remote data sources • Data Formatting – Preparing a structured payload for the requesting agent • Data Delivery – Delivering the structured payload to the requesting agent
  13. Meanwhile… There are two players in APIs
  14. API Provider
  15. API Provider, API Consumer
  16. Traditional API Interactions: API Provider PROVIDES; API Consumer CONSUMES
  17. Traditional API Interactions: API Provider PROVIDES EVERYTHING; API Consumer CONSUMES. "Everything" means the API Provider does: • Data Gathering • Data Formatting • Data Delivery • (among other things)
  18. Why do most API providers provide everything? • API design tends to be easier for teams closer to the source • Centralizing API functions makes them easier to support • Many APIs have a large set of unknown and external developers
  19. Why do most API providers provide everything? • API design tends to be easier for teams closer to the source • Centralizing API functions makes them easier to support • Many APIs have a large set of unknown and external developers
  20. Separation of Concerns (Data Gathering, Data Formatting, Data Delivery; API Consumer vs. API Provider). To be a better provider, the API should address the separation of concerns of the three core functions
  21. Separation of Concerns. Data Gathering: the API Consumer doesn’t care how data is gathered, as long as it is gathered; the API Provider cares a lot about how the data is gathered
  22. Separation of Concerns. Data Gathering: the API Consumer doesn’t care how data is gathered, as long as it is gathered; the API Provider cares a lot about how the data is gathered. Data Formatting: each consumer cares a lot about the format for that specific use; the provider only cares about the format to the extent it is easy to support
  23. Separation of Concerns. Data Gathering: the API Consumer doesn’t care how data is gathered, as long as it is gathered; the API Provider cares a lot about how the data is gathered. Data Formatting: each consumer cares a lot about the format for that specific use; the provider only cares about the format to the extent it is easy to support. Data Delivery: each consumer cares a lot about how the payload is delivered; the provider only cares about the delivery method to the extent it is easy to support
  24. LSUD (large set of unknown developers) vs. SSKD (small set of known developers)
  25. Should you consider alternatives to the one-size-fits-all API model? Ingredients: • Small number of targeted API consumers is top priority • Close relationships between these API consumers and the API team • Increasing divergence of needs across the top priority API consumers • Strong desire by the API consumers for more optimized interactions with the API • High value proposition for the company providing the API to make these API consumers as effective as possible
  26. http://thenextweb.com/dd/2013/12/17/future-api-design-orchestration-layer/
  27. Because of our separation of concerns, the Netflix API team is able to focus on different charters
  28. Brokering Data to 1,000+ Device Types
  29. Screen Real Estate
  30. Controller
  31. Technical Capabilities
  32. One-Size-Fits-All API: Request, Request, Request
  33. Courtesy of South Florida Classical Review
  34. Resource-Based API vs. Experience-Based API
  35. Resource-Based Requests • /users/<id>/ratings/title • /users/<id>/queues • /users/<id>/queues/instant • /users/<id>/recommendations • /catalog/titles/movie • /catalog/titles/series • /catalog/people
  36. REST API. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. Network Border, Network Border
  37. OSFA API. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. Network Border, Network Border. SERVER CODE; CLIENT CODE
  38. OSFA API. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. Network Border, Network Border. DATA GATHERING, FORMATTING, AND DELIVERY; USER INTERFACE RENDERING
  39. Experience-Based Requests • /ps3/homescreen
  40. JAVA API. Network Border, Network Border. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. Groovy Layer
  41. JAVA API. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. SERVER CODE; CLIENT CODE; CLIENT ADAPTER CODE (WRITTEN BY CLIENT TEAMS, DYNAMICALLY UPLOADED TO SERVER). Network Border, Network Border
  42. JAVA API. Services: RECOMMENDATIONS, MOVIE DATA, SIMILAR MOVIES, AUTH, MEMBER DATA, A/B TESTS, START-UP, RATINGS. DATA GATHERING; DATA FORMATTING AND DELIVERY; USER INTERFACE RENDERING. Network Border, Network Border
  43. Because we are no longer catering to LSUDs, we can now focus on building a business on which all of Netflix can operate
  44. Netflix API: Key Responsibilities • Broker data between services and devices • Provide features and business logic • Maintain a resilient front door • Scale the system • Maintain high velocity
  45. The bigger the ship… the slower it turns
  46. Distributed Architecture
  47. 1000+ Device Types
  48. Dozens of Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  49. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  50. Dependency Relationships
  51. 2,000,000,000 Requests Per Day to the Netflix API
  52. 30 Distinct Dependent Services for the Netflix API
  53. ~600 Dependency jars Slurped into the Netflix API
  54. 14,000,000,000 Netflix API Calls Per Day to those Dependent Services
  55. 0 Dependent Services with 100% SLA
  56. 99.99%^30 = 99.7%. 0.3% of 2B = 6M failures per day. 2+ Hours of Downtime Per Month
  57. 99.99%^30 = 99.7%. 0.3% of 2B = 6M failures per day. 2+ Hours of Downtime Per Month
  58. 99.9%^30 = 97%. 3% of 2B = 60M failures per day. 20+ Hours of Downtime Per Month
  59. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  60. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  61. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  62. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  63. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  64. Circuit Breaker Dashboard
  65. Call Volume and Health / Last 10 Seconds
  66. Call Volume / Last 2 Minutes
  67. Successful Requests
  68. Successful, But Slower Than Expected
  69. Short-Circuited Requests, Delivering Fallbacks
  70. Timeouts, Delivering Fallbacks
  71. Thread Pool & Task Queue Full, Delivering Fallbacks
  72. Exceptions, Delivering Fallbacks
  73. Error Rate: (# + # + # + #) / (# + # + # + # + #) = Error Rate
  74. Status of Fallback Circuit
  75. Requests per Second, Over Last 10 Seconds
  76. SLA Information
  77. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  78. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  79. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine
  80. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine. Fallback
  81. API. Dependencies: Personalization Engine, User Info, Movie Metadata, Movie Ratings, Similar Movies, Reviews, A/B Test Engine. Fallback
  82. Netflix API: Key Responsibilities • Broker data between services and devices • Provide features and business logic • Maintain a resilient front door • Scale the system • Maintain high velocity
  83. Scaling the Distributed System
  84. AWS Cloud
  85. Autoscaling
  86. Autoscaling
  87. Scryer: Predictive Auto Scaling. Not yet…
  88. Typical Traffic Patterns Over Five Days
  89. Predicted RPS Compared to Actual RPS
  90. More than 48 Million Subscribers. More than 40 Countries
  91. Zuul: Gatekeeper for the Netflix Streaming Application
  92. Zuul* • Multi-Region Resiliency • Insights • Stress Testing • Canary Testing • Dynamic Routing • Load Shedding • Security • Static Response Handling • Authentication. * Most closely resembles an API proxy
  93. All of these approaches are designed to prevent failures…
  94. But sometimes the best way to prevent failures is to force them!
  95. Chaos Monkey: "I randomly terminate instances in production to identify dormant failures."
  96. Netflix API: Key Responsibilities • Broker data between services and devices • Provide features and business logic • Maintain a resilient front door • Scale the system • Maintain high velocity
  97. Dependency Relationships
  98. Testing Philosophy: Act Fast, React Fast
  99. That Doesn’t Mean We Don’t Test
  100. Automated Delivery Pipeline
  101. Cloud-Based Deployment Techniques
  102. Current Code In Production. API Requests from the Internet
  103. Single Canary Instance To Test New Code with Production Traffic (around 1% or less of traffic). Current Code In Production. API Requests from the Internet
  104. Canary Analysis Automation
  105. Single Canary Instance To Test New Code with Production Traffic (around 1% or less of traffic). Current Code In Production. API Requests from the Internet. Error!
  106. Current Code In Production. API Requests from the Internet
  107. Current Code In Production. API Requests from the Internet
  108. Current Code In Production. API Requests from the Internet. Perfect!
  109. Stress Test with Zuul
  110. Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  111. Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  112. Error! Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  113. Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  114. Current Code In Production. API Requests from the Internet. Perfect!
  115. Stress Test with Zuul
  116. Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  117. Current Code In Production. API Requests from the Internet. New Code Getting Prepared for Production
  118. API Requests from the Internet. New Code Getting Prepared for Production
  119. https://www.github.com/Netflix
  120. Maintaining the Front Door to Netflix. Ben Schmaus, @schmaus. APIcon SF - 2014
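
Slides 35 to 42 contrast resource-based requests (many small calls such as /users/<id>/recommendations) with a single experience-based request such as /ps3/homescreen, where a device-specific adapter runs on the server and does the gathering and formatting for one screen. The talk does not show adapter code, so the following is only a rough Python sketch of the idea. The three service functions, their return shapes, and the payload layout are all hypothetical stand-ins, not Netflix APIs.

```python
# Hypothetical stand-ins for backend services behind the API (not real Netflix calls).
def get_recommendations(user_id):
    return [101, 102]


def get_titles(title_ids):
    catalog = {
        101: {"title": "Title A", "synopsis": "Long text A", "box_art": "a.jpg"},
        102: {"title": "Title B", "synopsis": "Long text B", "box_art": "b.jpg"},
    }
    return {tid: catalog[tid] for tid in title_ids}


def get_ratings(user_id, title_ids):
    known = {101: 4}
    return {tid: known.get(tid) for tid in title_ids}


def ps3_homescreen(user_id):
    """Serve one experience-based request for a single device screen.

    Data gathering happens server-side across several services; the
    adapter then formats only the fields this screen renders.
    """
    ids = get_recommendations(user_id)
    titles = get_titles(ids)
    ratings = get_ratings(user_id, ids)
    items = [
        {"id": tid, "title": titles[tid]["title"],
         "art": titles[tid]["box_art"], "rating": ratings[tid]}
        for tid in ids
    ]
    return {"rows": [{"name": "Recommended", "items": items}]}


payload = ps3_homescreen(user_id=42)
```

With the resource-based model the device would issue three requests across the network border and discard the fields it does not render; here it issues one and receives a payload shaped for that screen.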
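
The arithmetic on slides 51 to 58 can be checked with a few lines of Python. This sketch assumes, as the slides appear to, that every request depends on all 30 services, that their failures are independent, and that a month has 30 days; the function names are illustrative only.

```python
REQUESTS_PER_DAY = 2_000_000_000   # slide 51
DEPENDENCIES = 30                  # slide 52
HOURS_PER_MONTH = 30 * 24          # assumption: a 30-day month


def compound_availability(per_service: float, n: int = DEPENDENCIES) -> float:
    """Chance that all n independent dependencies succeed for one request."""
    return per_service ** n


def daily_failures(per_service: float) -> float:
    """Expected failed API requests per day at the compounded availability."""
    return (1 - compound_availability(per_service)) * REQUESTS_PER_DAY


def monthly_downtime_hours(per_service: float) -> float:
    """Equivalent hours of downtime per month at the compounded availability."""
    return (1 - compound_availability(per_service)) * HOURS_PER_MONTH


for sla in (0.9999, 0.999):
    print(f"{sla:.2%} per service -> {compound_availability(sla):.1%} overall, "
          f"{daily_failures(sla) / 1e6:.0f}M failures/day, "
          f"{monthly_downtime_hours(sla):.1f} h downtime/month")
```

Dropping each dependency from four nines to three moves the compounded figure from roughly 99.7% to roughly 97%, which is why the downtime estimate grows about tenfold.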
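
Slides 64 to 81 describe a circuit breaker that counts successes and several kinds of failure, short-circuits calls to an unhealthy dependency, and delivers a fallback instead. The following is a minimal Python sketch of that pattern, not the library Netflix uses. The class name, thresholds, and counters are assumptions; timeouts, rejections, and exceptions are lumped into one failure counter; and there is no rolling window or recovery probe, so an open circuit stays open.

```python
class CircuitBreaker:
    """Toy circuit breaker: trips on a high failure ratio, then serves fallbacks."""

    def __init__(self, error_threshold=0.5, min_calls=5):
        self.error_threshold = error_threshold  # failure ratio that opens the circuit
        self.min_calls = min_calls              # attempts required before it may open
        self.successes = 0
        self.failures = 0         # exceptions (timeouts and rejections would count here too)
        self.short_circuited = 0  # calls skipped because the circuit was open

    def is_open(self):
        attempted = self.successes + self.failures
        if attempted < self.min_calls:
            return False
        return self.failures / attempted >= self.error_threshold

    def error_rate(self):
        """Everything that ended in a fallback, divided by all requests."""
        total = self.successes + self.failures + self.short_circuited
        return (self.failures + self.short_circuited) / total if total else 0.0

    def call(self, fn, fallback):
        if self.is_open():
            self.short_circuited += 1
            return fallback()       # do not touch the unhealthy dependency
        try:
            result = fn()
        except Exception:
            self.failures += 1
            return fallback()       # degrade gracefully instead of failing the request
        self.successes += 1
        return result


attempts = []


def broken_dependency():
    attempts.append(1)
    raise RuntimeError("dependency down")


breaker = CircuitBreaker(error_threshold=0.5, min_calls=3)
results = [breaker.call(broken_dependency, lambda: "fallback") for _ in range(5)]
```

In this run the first three calls reach the failing dependency and the last two are short-circuited, so the caller always gets a response and the dependency is spared further load.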
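
Slides 103 to 108 show a single canary instance taking around 1% or less of production traffic, with automated analysis deciding whether the new code is healthy. The talk does not say which metrics or thresholds that analysis uses, so this Python sketch simply compares error rates. The `Metrics` type, the 1.5x ratio, the minimum sample size, and the error-rate floor are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_passes(baseline: Metrics, canary: Metrics,
                  max_ratio: float = 1.5, min_requests: int = 1000,
                  floor: float = 0.001) -> bool:
    """Return True if the canary's error rate is acceptably close to the baseline's.

    max_ratio, min_requests, and floor are hypothetical tuning knobs.
    """
    if canary.requests < min_requests:
        return False  # too little traffic to judge the new code
    allowed = max(baseline.error_rate * max_ratio, floor)
    return canary.error_rate <= allowed


baseline = Metrics(requests=1_000_000, errors=1_000)   # current code in production
healthy = Metrics(requests=10_000, errors=12)          # canary close to baseline
failing = Metrics(requests=10_000, errors=50)          # canary with elevated errors
```

A passing check corresponds to the "Perfect!" slide and a rollout of the new code; a failing check corresponds to the "Error!" slide, where the canary is removed and production keeps running the current code.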
