SlideShare a Scribd company logo
1 of 92
Scaling the
Netflix API
Daniel Jacobson
@daniel_jacobson
http://www.linkedin.com/in/danieljacobson
http://www.slideshare.net/danieljacobson
Please read the notes
associated with each slide for
the full context of the
presentation
What do I mean by “scale”?
Netflix API : Requests Per Month
-
5
10
15
20
25
30
35
RequestsinBillions
Netflix API : Requests Per Month
-
5
10
15
20
25
30
35
RequestsinBillions
But There Are Many Ways to Scale!
Organization
Systems
Devices
Development
Testing
Scaling…
Organization
Systems
Devices
Development
Testing
Streaming
More than 36 Million Subscribers
More than 40 Countries
Netflix Accounts for ~33% of Peak
Internet Traffic in North America
Netflix subscribers are watching more than 1 billion hours a month
In other words,
pretty big scale…
Organization Structure
Distributed Architecture
1000+ Device Types
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
Reviews
A/B Test
Engine
Dozens of Dependencies
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
http://www.slideshare.net/reed2001/culture-1798664
Scaling…
Organization
Systems
Devices
Development
Testing
Scaling the System
Growth of Netflix API Requests
0.6
20.7
41.7
-
5
10
15
20
25
30
35
40
45
Jan-10 Jan-11 Jan-12
RequestinBillions
70x growth in two years!
Growth of Netflix API Requests
2+ billion requests per day
Exploding out to 14 billion dependency calls per day
AWS Cloud
Autoscaling
Autoscaling
System Resiliency
Dependency Relationships
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Circuit Breaker Dashboard
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Fallback
Personaliz
ation
Engine
User Info
Movie
Metadata
Movie
Ratings
Similar
Movies
API
Reviews
A/B Test
Engine
Fallback
Forced Failure
Global System
More than 36 Million Subscribers
More than 40 Countries
Zuul
Gatekeeper for the Netflix Streaming Application
Zuul
• Multi-Region
Resiliency
• Insights
• Stress Testing
• Canary Testing
• Dynamic Routing
• Load Shedding
• Security
• Static Response
Handling
• Authentication
Isthmus
Scaling…
Organization
Systems
Devices
Development
Testing
Screen Real Estate
Controller
Technical Capabilities
One-Size-Fits-All
API
Request
Request
Request
Scaling…
Organization
Systems
Devices
Development
Testing
Courtesy of South Florida Classical Review
Resource-Based API
vs.
Experience-Based API
Resource-Based Requests
• /users/<id>/ratings/title
• /users/<id>/queues
• /users/<id>/queues/instant
• /users/<id>/recommendations
• /catalog/titles/movie
• /catalog/titles/series
• /catalog/people
REST API
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Network Border Network Border
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
OSFA API
Network Border Network Border
SERVER CODE
CLIENT CODE
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
OSFA API
Network Border Network Border
DATA GATHERING,
FORMATTING,
AND DELIVERY
USER INTERFACE
RENDERING
Experience-Based Requests
• /ps3/homescreen
JAVA API
Network Border Network Border
RECOMME
NDATIONS
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
Groovy Layer
RECOMME
NDATIONSA
ZXSXX C
CCC
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
JAVA API
SERVER CODE
CLIENT CODE
CLIENT ADAPTER CODE
(WRITTEN BY CLIENT TEAMS, DYNAMICALLY UPLOADED TO SERVER)
Network Border Network Border
RECOMME
NDATIONSA
ZXSXX C
CCC
MOVIE
DATA
SIMILAR
MOVIES
AUTH
MEMBER
DATA
A/B
TESTS
START-
UP
RATINGS
JAVA API
DATA GATHERING
DATA FORMATTING
AND DELIVERY
USER INTERFACE
RENDERING
Network Border Network Border
Scaling…
Organization
Systems
Devices
Development
Testing
Dependency Relationships
API alone includes more than 500 client jars
Testing Philosophy:
Act Fast, React Fast
That Doesn’t Mean We Don’t Test
• Unit tests
• Functional tests
• Regression scripts
• Continuous integration
• Capacity planning
• Load / Performance tests
Cloud-Based Deployment Techniques
Current Code
In Production
API Requests from
the Internet
Single Canary Instance
To Test New Code with Production Traffic
(around 1% or less of traffic)
Current Code
In Production
API Requests from
the Internet
Error!
Current Code
In Production
API Requests from
the Internet
Current Code
In Production
API Requests from
the Internet
Perfect!
Current Code
In Production
API Requests from
the Internet
New Code
Getting Prepared for Production
Error!
Current Code
In Production
API Requests from
the Internet
New Code
Getting Prepared for Production
Current Code
In Production
API Requests from
the Internet
New Code
Getting Prepared for Production
Current Code
In Production
API Requests from
the Internet
Perfect!
Current Code
In Production
API Requests from
the Internet
New Code
Getting Prepared for Production
Current Code
In Production
API Requests from
the Internet
New Code
Getting Prepared for Production
API Requests from
the Internet
New Code
Getting Prepared for Production
https://www.github.com/Netflix
Scaling the
Netflix API
Daniel Jacobson
@daniel_jacobson
http://www.linkedin.com/in/danieljacobson
http://www.slideshare.net/danieljacobson

More Related Content

What's hot

Design Patterns using Amazon DynamoDB
 Design Patterns using Amazon DynamoDB Design Patterns using Amazon DynamoDB
Design Patterns using Amazon DynamoDBAmazon Web Services
 
Making the Product Strategy Effective by Spotify Sr PM
Making the Product Strategy Effective by Spotify Sr PMMaking the Product Strategy Effective by Spotify Sr PM
Making the Product Strategy Effective by Spotify Sr PMProduct School
 
Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...LazarinaStoyanova
 
How to be a Successful Data PM by Zillow Product Leaders
How to be a Successful Data PM by Zillow Product LeadersHow to be a Successful Data PM by Zillow Product Leaders
How to be a Successful Data PM by Zillow Product LeadersProduct School
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Amazon Web Services
 
REST vs GraphQL
REST vs GraphQLREST vs GraphQL
REST vs GraphQLSquareboat
 
SEO, Marketing & AI: Your Questions Answered By Our Experts
SEO, Marketing & AI: Your Questions Answered By Our ExpertsSEO, Marketing & AI: Your Questions Answered By Our Experts
SEO, Marketing & AI: Your Questions Answered By Our ExpertsSearch Engine Journal
 
트위터의 추천 시스템 파헤치기
트위터의 추천 시스템 파헤치기트위터의 추천 시스템 파헤치기
트위터의 추천 시스템 파헤치기Yan So
 
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202Christopher Gutknecht
 
Test-Driven Development in React with Cypress
Test-Driven Development in React with CypressTest-Driven Development in React with Cypress
Test-Driven Development in React with CypressJosh Justice
 
System design for recommendations and search
System design for recommendations and searchSystem design for recommendations and search
System design for recommendations and searchEugene Yan Ziyou
 
Automation With Appium
Automation With AppiumAutomation With Appium
Automation With AppiumKnoldus Inc.
 
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...Amazon Web Services
 
AI in Action The New Age of Intelligent Products and Sales Automation
AI in Action The New Age of Intelligent Products and Sales AutomationAI in Action The New Age of Intelligent Products and Sales Automation
AI in Action The New Age of Intelligent Products and Sales AutomationProduct School
 
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...luisw19
 
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...BigDataExpo
 
ASO : the game has changed
ASO : the game has changedASO : the game has changed
ASO : the game has changedThomasBCN
 
Local Search in 2023 - Must-Know and Must-Do Tactics
Local Search in 2023 - Must-Know and Must-Do TacticsLocal Search in 2023 - Must-Know and Must-Do Tactics
Local Search in 2023 - Must-Know and Must-Do TacticsBenu Aggarwal
 

What's hot (20)

Cypress
CypressCypress
Cypress
 
Design Patterns using Amazon DynamoDB
 Design Patterns using Amazon DynamoDB Design Patterns using Amazon DynamoDB
Design Patterns using Amazon DynamoDB
 
Becoming an IAM Policy Ninja
Becoming an IAM Policy NinjaBecoming an IAM Policy Ninja
Becoming an IAM Policy Ninja
 
Making the Product Strategy Effective by Spotify Sr PM
Making the Product Strategy Effective by Spotify Sr PMMaking the Product Strategy Effective by Spotify Sr PM
Making the Product Strategy Effective by Spotify Sr PM
 
Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...Intent-Based International Keyword Research - International Search Summit, Ba...
Intent-Based International Keyword Research - International Search Summit, Ba...
 
How to be a Successful Data PM by Zillow Product Leaders
How to be a Successful Data PM by Zillow Product LeadersHow to be a Successful Data PM by Zillow Product Leaders
How to be a Successful Data PM by Zillow Product Leaders
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
 
REST vs GraphQL
REST vs GraphQLREST vs GraphQL
REST vs GraphQL
 
SEO, Marketing & AI: Your Questions Answered By Our Experts
SEO, Marketing & AI: Your Questions Answered By Our ExpertsSEO, Marketing & AI: Your Questions Answered By Our Experts
SEO, Marketing & AI: Your Questions Answered By Our Experts
 
트위터의 추천 시스템 파헤치기
트위터의 추천 시스템 파헤치기트위터의 추천 시스템 파헤치기
트위터의 추천 시스템 파헤치기
 
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
 
Test-Driven Development in React with Cypress
Test-Driven Development in React with CypressTest-Driven Development in React with Cypress
Test-Driven Development in React with Cypress
 
System design for recommendations and search
System design for recommendations and searchSystem design for recommendations and search
System design for recommendations and search
 
Automation With Appium
Automation With AppiumAutomation With Appium
Automation With Appium
 
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
NEW LAUNCH! Amazon Neptune Overview and Customer Use Cases - DAT319 - re:Inve...
 
AI in Action The New Age of Intelligent Products and Sales Automation
AI in Action The New Age of Intelligent Products and Sales AutomationAI in Action The New Age of Intelligent Products and Sales Automation
AI in Action The New Age of Intelligent Products and Sales Automation
 
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...
GraphQL as an alternative approach to REST (as presented at Java2Days/CodeMon...
 
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
 
ASO : the game has changed
ASO : the game has changedASO : the game has changed
ASO : the game has changed
 
Local Search in 2023 - Must-Know and Must-Do Tactics
Local Search in 2023 - Must-Know and Must-Do TacticsLocal Search in 2023 - Must-Know and Must-Do Tactics
Local Search in 2023 - Must-Know and Must-Do Tactics
 

Viewers also liked

Scaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenScaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenDaniel Jacobson
 
Netflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceNetflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceDaniel Jacobson
 
Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Daniel Jacobson
 
Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIDaniel Jacobson
 
Scaling Twitter - Railsconf 2007
Scaling Twitter - Railsconf 2007Scaling Twitter - Railsconf 2007
Scaling Twitter - Railsconf 2007Alex Payne
 
Scaling the Netflix API - OSCON
Scaling the Netflix API - OSCONScaling the Netflix API - OSCON
Scaling the Netflix API - OSCONDaniel Jacobson
 
APIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceAPIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceDaniel Jacobson
 
Netflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFNetflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFDaniel Jacobson
 
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupMaintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupDaniel Jacobson
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionDaniel Jacobson
 
Presentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIPresentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIDaniel Jacobson
 
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog Datadog
 
Netflix API - Separation of Concerns
Netflix API - Separation of ConcernsNetflix API - Separation of Concerns
Netflix API - Separation of ConcernsDaniel Jacobson
 
Canary Analyze All the Things
Canary Analyze All the ThingsCanary Analyze All the Things
Canary Analyze All the Thingsroyrapoport
 
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014Daniel Jacobson
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling TwitterBlaine
 
MicroServices at Netflix - challenges of scale
MicroServices at Netflix - challenges of scaleMicroServices at Netflix - challenges of scale
MicroServices at Netflix - challenges of scaleSudhir Tonse
 
Techniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFTechniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFDaniel Jacobson
 

Viewers also liked (20)

Scaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev DenScaling the Netflix API - From Atlassian Dev Den
Scaling the Netflix API - From Atlassian Dev Den
 
Netflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech ConferenceNetflix API: Keynote at Disney Tech Conference
Netflix API: Keynote at Disney Tech Conference
 
Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016
 
Maintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix APIMaintaining the Front Door to Netflix : The Netflix API
Maintaining the Front Door to Netflix : The Netflix API
 
Culture
CultureCulture
Culture
 
Scaling Twitter - Railsconf 2007
Scaling Twitter - Railsconf 2007Scaling Twitter - Railsconf 2007
Scaling Twitter - Railsconf 2007
 
Scaling the Netflix API - OSCON
Scaling the Netflix API - OSCONScaling the Netflix API - OSCON
Scaling the Netflix API - OSCON
 
APIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev ConferenceAPIs for Internal Audiences - Netflix - App Dev Conference
APIs for Internal Audiences - Netflix - App Dev Conference
 
Netflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SFNetflix API : BAPI 2011 Presentation : SF
Netflix API : BAPI 2011 Presentation : SF
 
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit MeetupMaintaining the Netflix Front Door - Presentation at Intuit Meetup
Maintaining the Netflix Front Door - Presentation at Intuit Meetup
 
History and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of DistributionHistory and Future of the Netflix API - Mashery Evolution of Distribution
History and Future of the Netflix API - Mashery Evolution of Distribution
 
Presentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix APIPresentation to ESPN about the Netflix API
Presentation to ESPN about the Netflix API
 
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog PyData NYC 2015 - Automatically Detecting Outliers with Datadog
PyData NYC 2015 - Automatically Detecting Outliers with Datadog
 
Netflix API - Separation of Concerns
Netflix API - Separation of ConcernsNetflix API - Separation of Concerns
Netflix API - Separation of Concerns
 
Canary Analyze All the Things
Canary Analyze All the ThingsCanary Analyze All the Things
Canary Analyze All the Things
 
Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014Top 10 Lessons Learned from the Netflix API - OSCON 2014
Top 10 Lessons Learned from the Netflix API - OSCON 2014
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling Twitter
 
From SOA to MSA
From SOA to MSAFrom SOA to MSA
From SOA to MSA
 
MicroServices at Netflix - challenges of scale
MicroServices at Netflix - challenges of scaleMicroServices at Netflix - challenges of scale
MicroServices at Netflix - challenges of scale
 
Techniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SFTechniques for Scaling the Netflix API - QCon SF
Techniques for Scaling the Netflix API - QCon SF
 

Similar to Scaling the Netflix API

Maintaining the Front Door to Netflix
Maintaining the Front Door to NetflixMaintaining the Front Door to Netflix
Maintaining the Front Door to NetflixBenjamin Schmaus
 
Set Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRSet Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRDaniel Jacobson
 
Netflix API - Presentation to PayPal
Netflix API - Presentation to PayPalNetflix API - Presentation to PayPal
Netflix API - Presentation to PayPalDaniel Jacobson
 
API Strategy Evolution at Netflix
API Strategy Evolution at NetflixAPI Strategy Evolution at Netflix
API Strategy Evolution at NetflixMichael Hart
 
Redesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONRedesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONDaniel Jacobson
 
API Design - When to buck the trend (Webcast)
API Design - When to buck the trend (Webcast)API Design - When to buck the trend (Webcast)
API Design - When to buck the trend (Webcast)Apigee | Google Cloud
 
Oscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedOscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedSangeeta Narayanan
 
API World 2013 - Transforming the Netflix API
API World 2013 - Transforming the Netflix APIAPI World 2013 - Transforming the Netflix API
API World 2013 - Transforming the Netflix APIBenjamin Schmaus
 
The future-of-netflix-api
The future-of-netflix-apiThe future-of-netflix-api
The future-of-netflix-apiDaniel Jacobson
 
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Atlassian
 
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Atlassian
 
Gluecon 2013 netflix api crash course
Gluecon 2013   netflix api crash courseGluecon 2013   netflix api crash course
Gluecon 2013 netflix api crash courseBenjamin Schmaus
 
Stranger Things: The Forces that Disrupt Netflix
Stranger Things: The Forces that Disrupt NetflixStranger Things: The Forces that Disrupt Netflix
Stranger Things: The Forces that Disrupt NetflixC4Media
 
Netapp Michael Galpin
Netapp Michael GalpinNetapp Michael Galpin
Netapp Michael Galpinrajivmordani
 
Move Fast;Stay Safe:Developing & Deploying the Netflix API
Move Fast;Stay Safe:Developing & Deploying the Netflix APIMove Fast;Stay Safe:Developing & Deploying the Netflix API
Move Fast;Stay Safe:Developing & Deploying the Netflix APISangeeta Narayanan
 
Why your next serverless project should use AWS AppSync
Why your next serverless project should use AWS AppSyncWhy your next serverless project should use AWS AppSync
Why your next serverless project should use AWS AppSyncYan Cui
 
Pros and Cons of a MicroServices Architecture talk at AWS ReInvent
Pros and Cons of a MicroServices Architecture talk at AWS ReInventPros and Cons of a MicroServices Architecture talk at AWS ReInvent
Pros and Cons of a MicroServices Architecture talk at AWS ReInventSudhir Tonse
 
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...DataStax
 
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017Amazon Web Services
 
Evolution of the Netflix API
Evolution of the Netflix APIEvolution of the Netflix API
Evolution of the Netflix APIC4Media
 

Similar to Scaling the Netflix API (20)

Maintaining the Front Door to Netflix
Maintaining the Front Door to NetflixMaintaining the Front Door to Netflix
Maintaining the Front Door to Netflix
 
Set Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPRSet Your Content Free! : Case Studies from Netflix and NPR
Set Your Content Free! : Case Studies from Netflix and NPR
 
Netflix API - Presentation to PayPal
Netflix API - Presentation to PayPalNetflix API - Presentation to PayPal
Netflix API - Presentation to PayPal
 
API Strategy Evolution at Netflix
API Strategy Evolution at NetflixAPI Strategy Evolution at Netflix
API Strategy Evolution at Netflix
 
Redesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCONRedesigning the Netflix API - OSCON
Redesigning the Netflix API - OSCON
 
API Design - When to buck the trend (Webcast)
API Design - When to buck the trend (Webcast)API Design - When to buck the trend (Webcast)
API Design - When to buck the trend (Webcast)
 
Oscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons LearnedOscon2014 Netflix API - Top 10 Lessons Learned
Oscon2014 Netflix API - Top 10 Lessons Learned
 
API World 2013 - Transforming the Netflix API
API World 2013 - Transforming the Netflix APIAPI World 2013 - Transforming the Netflix API
API World 2013 - Transforming the Netflix API
 
The future-of-netflix-api
The future-of-netflix-apiThe future-of-netflix-api
The future-of-netflix-api
 
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
 
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
Extend Your Use of JIRA by Solving Your Unique Concerns: An Exposé of the New...
 
Gluecon 2013 netflix api crash course
Gluecon 2013   netflix api crash courseGluecon 2013   netflix api crash course
Gluecon 2013 netflix api crash course
 
Stranger Things: The Forces that Disrupt Netflix
Stranger Things: The Forces that Disrupt NetflixStranger Things: The Forces that Disrupt Netflix
Stranger Things: The Forces that Disrupt Netflix
 
Netapp Michael Galpin
Netapp Michael GalpinNetapp Michael Galpin
Netapp Michael Galpin
 
Move Fast;Stay Safe:Developing & Deploying the Netflix API
Move Fast;Stay Safe:Developing & Deploying the Netflix APIMove Fast;Stay Safe:Developing & Deploying the Netflix API
Move Fast;Stay Safe:Developing & Deploying the Netflix API
 
Why your next serverless project should use AWS AppSync
Why your next serverless project should use AWS AppSyncWhy your next serverless project should use AWS AppSync
Why your next serverless project should use AWS AppSync
 
Pros and Cons of a MicroServices Architecture talk at AWS ReInvent
Pros and Cons of a MicroServices Architecture talk at AWS ReInventPros and Cons of a MicroServices Architecture talk at AWS ReInvent
Pros and Cons of a MicroServices Architecture talk at AWS ReInvent
 
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
Netflix Recommendations Using Spark + Cassandra (Prasanna Padmanabhan & Roopa...
 
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017
NEW LAUNCH! Introducing Amazon Kinesis Video Streams - ABD216 - re:Invent 2017
 
Evolution of the Netflix API
Evolution of the Netflix APIEvolution of the Netflix API
Evolution of the Netflix API
 

More from Daniel Jacobson

Why API? - Business of APIs Conference
Why API? - Business of APIs ConferenceWhy API? - Business of APIs Conference
Why API? - Business of APIs ConferenceDaniel Jacobson
 
API Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignAPI Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignDaniel Jacobson
 
NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010Daniel Jacobson
 
NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010Daniel Jacobson
 
NPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyNPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyDaniel Jacobson
 
NPR API Usage and Metrics
NPR API Usage and MetricsNPR API Usage and Metrics
NPR API Usage and MetricsDaniel Jacobson
 
OpenID Adoption UX Summit
OpenID Adoption UX SummitOpenID Adoption UX Summit
OpenID Adoption UX SummitDaniel Jacobson
 

More from Daniel Jacobson (9)

Why API? - Business of APIs Conference
Why API? - Business of APIs ConferenceWhy API? - Business of APIs Conference
Why API? - Business of APIs Conference
 
Netflix API
Netflix APINetflix API
Netflix API
 
API Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API RedesignAPI Revolutions : Netflix's API Redesign
API Revolutions : Netflix's API Redesign
 
NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010NPR Presentation at Wolfram Data Summit 2010
NPR Presentation at Wolfram Data Summit 2010
 
NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010NPR: Digital Distribution Strategy: OSCON2010
NPR: Digital Distribution Strategy: OSCON2010
 
NPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile StrategyNPR's Digital Distribution and Mobile Strategy
NPR's Digital Distribution and Mobile Strategy
 
NPR API Usage and Metrics
NPR API Usage and MetricsNPR API Usage and Metrics
NPR API Usage and Metrics
 
OpenID Adoption UX Summit
OpenID Adoption UX SummitOpenID Adoption UX Summit
OpenID Adoption UX Summit
 
NPR : Examples of COPE
NPR : Examples of COPENPR : Examples of COPE
NPR : Examples of COPE
 

Recently uploaded

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Scaling the Netflix API

Editor's Notes

  1. For Netflix, our API strategy can be discussed in the form of an iceberg. The public API strategy is the prominent, highly visible part of the iceberg that is above water. It is also the smallest part of the iceberg, in terms of mass. Meanwhile, the large mass of ice underwater that you cannot see is the most critical and biggest part of the iceberg. The visible part of the iceberg represents the public API and the part underwater is the internal strategy. Netflix’s strategy over the years has shifted from public to internal.
  2. There are many ways to think of “scale”…
  3. People generally think of scaling as growth in traffic…
  4. And traffic growth typically equals server growth. System scaling requires a balance between need and capacity.
  5. As a result, as traffic grows over time, scaling effectively requires adding capacity to support it.
  6. To have an effective engineering organization, you need to scale in a variety of ways, not just in your systems. This presentation discusses these scaling needs. Of course, I will focus a bit on systems, but that is not the only area that requires focus to be successful.
  7. Netflix’s focus is on providing the best global streaming video experience for TV shows and movies.
  8. We now have more than 36 million global subscribers in more than 40 countries.
  9. Those subscribers consume more than a billion hours of streaming video a month which accounts for about 33% of the peak Internet traffic in North America.
  10. We are also on more than 1,000 different device types.
  11. With that scale, it is critical to have the right organizational structure to support it.
  12. Our organization, and therefore our system, is set up to support a distributed architecture.
  13. I like to think of the Netflix product engineering teams that support development and innovation as being shaped like an hourglass…
  14. In the top end of the hourglass, we have our device and UI teams who build out great user experiences on Netflix-branded devices. To put that into perspective, there are a few hundred more device types that we support than engineers at Netflix.
  15. At the bottom end of the hourglass, there are several dozen dependency teams who focus on things like metadata, algorithms, authentication services, A/B test engines, etc.
  16. The API is at the center of the hourglass, acting as a broker of data.
  17. This hourglass architecture allows us to scale horizontal easily with our device integrations and our backend dependencies.
  18. The glue that helps all of this work as effectively as it does is our engineering culture. We hire great, seasoned engineers with excellent judgment and engineering acumen and enable them to build systems quickly by giving them the right context and helping them work together in a highly aligned way.
  19. With the adoption of the devices, API traffic took off! We went from about 600 million requests per month to about 42 BILLION requests in just two years.
  20. Fast forward, those numbers have grown to more than 2B incoming requests per day, which translates into more than 14B outbound calls to our dependencies per day.
  21. Rather than relying on data centers, we have moved everything to the cloud! Enables rapid scaling with relative ease. Adding new servers, in new locations, take minutes. And this is critical when the service needs to grow from 1B requests a month to 2B requests a day in a relatively short period of time.
  22. That is much more preferable for us than spending our time, money and energy in data centers, adding servers, dealing with power supplies, etc.
  23. Instead, we spend time in tools such as Asgard, created by Netflix staff, to help us manage our instance types and counts in AWS. Asgard is available in our open source repository at github.
  24. Another feature afforded to us through AWS to help us scale is Autoscaling. This is the Netflix API request rates over a span of time. The red line represents a potential capacity needed in a data center to ensure that the spikes could be handled without spending a ton more than is needed for the really unlikely scenarios.
  25. Through autoscaling, instead of buying new servers based on projected spikes in traffic and having systems administrators add them to the farm, the cloud can dynamically and automatically add and remove servers based on need.
  26. Our distributed architecture, with the number of systems involved, can get quite complicated. Each of these systems talks to a large number of other systems within our architecture.
  27. Because of this complexity, the API can play a key role in ensuring overall system resiliency. None of our dependency services have SLAs of 100%. Given our unique position in the hourglass as being the last point just before delivery to our users, the API can serve a critical role in protecting our customers from various failures throughout the system.
  28. In the old world, the system was vulnerable to such failures. For example, if one of our dependency services fails…
  29. Such a failure could have resulted in an outage in the API.
  30. And that outage likely would have cascaded to have some kind of substantive impact on the devices.
  31. The challenge for the API team is to be resilient against dependency outages, to ultimately insulate Netflix customers from low level system problems and to keep them happy.
  32. To solve this problem, we created Hystrix, as wrapping technology that provides fault tolerance in a distributed environment. Hystrix is also open source and available at our github repository.
  33. To achieve this, we implemented a series of circuit breakers for each library that we depend on. Each circuit breaker controls the interaction between the API and that dependency. This image is a view of the dependency monitor that allows us to view the health and activity of each dependency. This dashboard is designed to give a real-time view of what is happening with these dependencies (over the last two minutes). We have other dashboards that provide insight into longer-term trends, day-over-day views, etc.
  34. So, going back to the engineering diagram…
  35. If that same service fails today…
  36. We simply disconnect from that service.
  37. And replace it with an appropriate fallback. The fallback, ideally is a slightly degrade, but useful offering. If we cannot get that, however, we will quickly provide a 5xx response which will help the systems shed load rather than queue things up (which could eventually cause the system as a whole to tip over).
  38. This will keep ourcustomers happy, even if the experience may be slightly degraded. It is important to note that different dependency libraries have different fallback scenarios. And some are more resilient than others. But the overall sentiment here is accurate at a high level.
  39. Hystrix and other techniques throughout our engineering organization help keep things resilient. We also have an army of tools that introduce failures to the system which will help us identify problems before they become really big problems.
  40. The army is the Simian Army, which is a fleet of monkeys who are designed to do a variety of things, in an automated way, in our cloud implementation. Chaos Monkey, for example, periodically terminates AWS instances in production to see how the system as a whole will respond once that server disappears. Latency Monkey introduces latencies and errors into a system to see how it responds. The system is too complex to know how things will respond in various circumstances, so the monkeys expose that information to us in a variety of ways. The monkeys are also available in our open source github repository.
  41. Going global has a different set of scaling challenges. AWS enables us to add instances in new regions that are closer to our customers.
  42. To help us manage our traffic across regions, as well as within given regions, we created Zuul. Zuul is open source in our github repository.
  43. Zuul does a variety of things for us. Zuul fronts our entire streaming application as well as a range of other services within our system.
  44. Moreover, Zuul is the routing engine that we use for Isthmus, which is designed to marshall traffic between regions, for failover, performance or other reasons.
  45. Again, we have more than 1,000 different device types that we support. Across those devices, there is a high degree of variability. As a result, we have seen inefficiencies and problems emerge across our implementations. Those issues also translate into issues with the API interaction.
  46. For example, screen size could significantly affect what the API should deliver to the UI. TVs with bigger screens that can potentially fit more titles and more metadata per title than a mobile phone. Do we need to send all of the extra bits for fields or items that are not needed, requiring the device itself to drop items on the floor? Or can we optimize the deliver of those bits on a per-device basis?
  47. Different devices have different controlling functions as well. For devices with swipe technologies, such as the iPad, do we need to pre-load a lot of extra titles in case a user swipes the row quickly to see the last of 500 titles in their queue? Or for up-down-left-right controllers, would devices be more optimized by fetching a few items at a time when they are needed? Other devices support voice or hand gestures or pointer technologies. How might those impact the user experience and therefore the metadata needed to support them?
  48. The technical specs on these devices differ greatly. Some have significant memory space while others do not, impacting how much data can be handled at a given time. Processing power and hard-drive space could also play a role in how the UI performs, in turn potentially influencing the optimal way for fetching content from the API. All of these differences could result in different potential optimizations across these devices.
  49. Many UI teams needing metadata means many requests to the API team. In the one-size-fits-all API world, we essentially needed to funnel these requests and then prioritize them. That means that some teams would need to wait for API work to be done. It also meant that, because they all shared the same endpoints, we were often adding variations to the endpoints resulting in a more complex system as well as a lot of spaghetti code. Make teams wait due to prioritization was exacerbated by the fact that tasks took longer because the technical debt was increasing, causing time to build and test to increase. Moreover, many of the incoming requests were asking us to do more of the same kinds of customizations. This created a spiral that would be very difficult to break out of…
  50. That variability ultimately caused us to do some introspection on our API layer.
  51. Many other companies have seen similar issues and have introduced orchestration layers that enable more flexible interaction models.
  52. Odata, HYQL, ql.io, rest.li and others are examples of orchestration layers. They address the same problems that we have seen, but we have approached the solution in a very different way.
  53. We evolved our discussion towards what ultimately became a discussion between resource-based APIs and experience-based APIs.
  54. The original OSFA API was very resource oriented with granular requests for specific data, delivering specific documents in specific formats.
  55. The interaction model looked basically like this, with (in this example) the PS3 making many calls across the network to the OSFA API. The API ultimately called back to dependent services to get the corresponding data needed to satisfy the requests.
  56. In this mode, there is a very clear divide between the Client Code and the Server Code. That divide is the network border.
  57. And the responsibilities have the same distribution as well. The Client Code handles the rendering of the interface (as well as asking the server for data). The Server Code is responsible of gathering, formatting and delivering the data to the UIs.
  58. And ultimately, it works. The PS3 interface looks like this and was populated by this interaction model.
  59. But we believe this is not the optimal way to handle it. In fact, assembling a UI through many resource-based API calls is akin to pointillism paintings. The picture looks great when fully assembled, but it is done by assembling many points put together in the right way.
  60. We have decided to pursue an experience-based approach instead. Rather than making many API requests to assemble the PS3 home screen, the PS3 will potentially make a single request to a custom, optimized endpoint.
  61. In an experience-based interaction, the PS3 can potentially make asingle request across the network border to a scripting layer (currently Groovy), in this example to provide the data for the PS3 home screen. The call goes to a very specific, custom endpoint for the PS3 or for a shared UI. The Groovy script then interprets what is needed for the PS3 home screen and triggers a series of calls to the Java API running in the same JVM as the Groovy scripts. The Java API is essentially a series of methods that individually know how to gather the corresponding data from the dependent services. The Java API then returns the data to the Groovy script who then formats and delivers the very specific data back to the PS3.
  62. We also introduced RxJava into this layer to improve our ability to handle concurrency and callbacks. RxJava is open source in our github repository.
  63. In this model, the border between Client Code and Server Code is no longer the network border. It is now back on the server. The Groovy is essentially a client adapter written by the client teams.
  64. And the distribution of work changes as well. The client teams continue to handle UI rendering, but now are also responsible for the formatting and delivery of content. The API team, in terms of the data side of things, is responsible for the data gathering and hand-off to the client adapters. Of course, the API team does many other things, including resiliency, scaling, dependency interactions, etc. This model is essentially a platform for API development.
  65. If resource-based APIs assemble data like pointillism, experience-based APIs assemble data like a photograph. The experience-based approach captures and delivers it all at once.
  66. Again, the dependency chains in our system are quite complicated. The API alone includes more than 500 client jars from dependencies when building the application. That complexity makes it virtually impossible to maintain a high degree of nimbleness while having a very high confidence in testing and the code that will be deployed.
  67. As a result, our philosophy is to act fast (ie. get code into production as quickly as possible), then react fast (ie. response to issues quickly as they arise).
  68. That said, we do spend a lot of time testing. We just don’t intend to make the system bullet-proof before deploying. Instead, we have employed some techniques to help us learn more about what the new code will look like in production.
  69. Two such examples are canary deployments and what we call red/black deployments.
  70. The canary deployments are comparable to canaries in coal mines. We have many servers in production running the current codebase. We will then introduce a single (or perhaps a few) new server(s) into production running new code. Monitoring the canary servers will show what the new code will look like in production.
  71. If the canary encounters problems, it will register in any number of ways. The problems will be determined based on a comprehensive set of tools that will automatically perform health analysis on the canary.
  72. If the canary shows errors, we pull it/them down, re-evaluate the new code, debug it, etc.
  73. We will then repeat the process until the analysis of canary servers look good.
  74. If the new code looks good in the canary, we can then use a technique that we call red/black deployments to launch the code. Start with red, where production code is running. Fire up a new set of servers (black) equal to the count in red with the new code.
  75. Then switch the pointer to have external requests point to the black servers. Sometimes, however, we may find an error in the black cluster that was not detected by the canary. For example, some issues can only be seen with full load.
  76. If a problem is encountered from the black servers, it is easy to rollback quickly by switching the pointer back to red. We will then re-evaluate the new code, debug it, etc.
  77. Once we have debugged the code, we will put another canary up to evaluate the new changes in production.
  78. If the new code looks good in the canary, we can then bring up another set of servers with the new code.
  79. Then we will switch production traffic to the new code.
  80. If everything still looks good, we disable the red servers and the new code becomes the new red servers.
  81. All of the open source components discussed here, as well as many others, can be found at the Netflix github repository.