Daniel Jacobson
• @daniel_jacobson
• http://www.linkedin.com/in/danieljacobson
• http://www.slideshare.net/danieljacobson
Sangeeta Narayanan
• @sangeetan
• http://www.linkedin.com/in/sangeetanarayanan/
Strategy
Lessons
Implementation
Lessons
Know Your Audience
Target Audience Dictates Everything Else
The target audience should be the single biggest influence on your API design
Small Set of Known Developers
SSKDs
Large Set of Unknown Developers
LSUDs
Both
SSKDs and LSUDs
No matter what…
Figure this out first!
Target Audience Influence
• Team Identity
• Staffing Decisions
• System Architecture
• SLAs
• Development Velocity
• Security Needs
Netflix API : Key Responsibilities
2008
• Broker data between internal services and
public developers
• Grow community of public developers
• Optimize design for reusability
• Evangelists
• Partner Engagement and Support
• API Engineers
• Technical Writer
• QA Specialists
Private API vs. Public API
Public API: < 0.3% of total API traffic *
* 11 years' worth of public API requests = one day of private API requests
Netflix API : Key Responsibilities
Today
• Broker data between services and devices
• System resiliency
• Scaling the system
• High velocity development
• Insights
The consumers of the API are now
Netflix subscribers
We are now responsible for ensuring
subscribers can stream
• Application Engineers
• Platform Engineers
• Technical Writer
• Tools and Automation Engineers
Team is now 6x its 2010 size
Separation of Concerns
Primary Responsibilities of APIs
• Data Gathering
– Retrieving the requested data from one or many local or remote data sources
• Data Formatting
– Preparing a structured payload for the requesting agent
• Data Delivery
– Delivering the structured payload to the requesting agent
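The three responsibilities above can be sketched as three separate functions. This is a minimal illustration, not the Netflix implementation: the source names, the function names (gather, format_payload, deliver) and the sample data are all hypothetical.

```python
import json
from typing import Callable, Dict

# Hypothetical in-memory data sources standing in for local or remote services.
SOURCES: Dict[str, Callable[[str], dict]] = {
    "ratings": lambda user_id: {"title_42": 5},
    "queue": lambda user_id: {"items": ["title_42", "title_7"]},
}


def gather(user_id: str) -> dict:
    """Data gathering: collect raw data from every source."""
    return {name: fetch(user_id) for name, fetch in SOURCES.items()}


def format_payload(raw: dict) -> dict:
    """Data formatting: shape the raw data into a structured payload."""
    return {
        "queue": raw["queue"]["items"],
        "ratings": [{"title": t, "stars": s} for t, s in raw["ratings"].items()],
    }


def deliver(payload: dict) -> str:
    """Data delivery: serialize the payload for the requesting agent."""
    return json.dumps(payload, sort_keys=True)


def handle_request(user_id: str) -> str:
    """Compose the three steps; each one can be changed independently."""
    return deliver(format_payload(gather(user_id)))
```

Keeping the steps apart means a different format or delivery method can be swapped in without touching how the data is gathered.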
There are two players in APIs
API Provider API Consumer
API Provider
PROVIDES
API Consumer
CONSUMES
Traditional API Interactions
API Provider
PROVIDES
EVERYTHING
API Consumer
CONSUMES
Everything means the API Provider does:
• Data Gathering
• Data Formatting
• Data Delivery
• (among other things)
Traditional API Interactions
Why do most API providers provide everything?
• Many APIs have a large set of unknown and external developers
• Generic API design tends to be easier for teams closer to the source
• Centralizing API functions makes them easier to support
API Consumer
• Data Gathering: Doesn’t care how data is gathered, as long as it is gathered
• Data Formatting: Each consumer cares a lot about the format for that specific use
• Data Delivery: Each consumer cares a lot about how the payload is delivered
API Provider
• Data Gathering: Cares a lot about how the data is gathered
• Data Formatting: Only cares about the format to the extent it is easy to support
• Data Delivery: Only cares that the delivery method is easy to support
Separation of Concerns
To be a better provider, the API should address the
separation of concerns of the three core functions
One Size Doesn’t Fit All
Embrace the Differences
Separation of Concerns
Screen Real Estate
Controllers
Technical Capabilities
Resource-Based API
vs.
Experience-Based API
Resource-Based Requests
• /users/<id>/ratings/title
• /users/<id>/queues
• /users/<id>/queues/instant
• /users/<id>/recommendations
• /catalog/titles/movie
• /catalog/titles/series
• /catalog/people
[Diagram: REST API, network border, and services: Recommendations, Movie Data, Similar Movies, Auth, Member Data, A/B Tests, Start-up, Ratings]
Experience-Based Requests
• /ps3/homescreen
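The difference between the two request styles can be sketched by counting round trips across the network border. This is an illustration under stated assumptions: the handlers, the returned titles and the function names are hypothetical, and only two of the resource paths listed earlier are modeled.

```python
from typing import Callable, Dict

# Counts simulated round trips between a device and the API.
calls = {"network": 0}

# Hypothetical resource handlers keyed by resource path.
RESOURCES: Dict[str, Callable[[str], list]] = {
    "/users/<id>/queues/instant": lambda user_id: ["title_7"],
    "/users/<id>/recommendations": lambda user_id: ["title_42", "title_9"],
}


def fetch_resource(path: str, user_id: str) -> list:
    """Look up one resource inside the API (no device round trip)."""
    return RESOURCES[path](user_id)


def resource_request(path: str, user_id: str) -> list:
    """Resource-based style: one device round trip per resource."""
    calls["network"] += 1
    return fetch_resource(path, user_id)


def ps3_homescreen(user_id: str) -> dict:
    """Experience-based style: one round trip, fan-out happens server side."""
    calls["network"] += 1
    return {
        "instant_queue": fetch_resource("/users/<id>/queues/instant", user_id),
        "recommendations": fetch_resource("/users/<id>/recommendations", user_id),
    }
```

With the resource-based style the device pays one round trip per resource it needs; the experience-based endpoint returns the whole screen in a single round trip.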
[Diagram: Java API with client adapter code, network border, and services: Recommendations, Movie Data, Similar Movies, Auth, Member Data, A/B Tests, Start-up, Ratings]
Be Pragmatic, Not Dogmatic
Common API Debates
• XML / JSON
• REST / SOAP
• OAuth / Other
• Versioning
• Hypermedia
Who Cares!?!?
Just Solve Problems for Your Audience
Embrace Change
Impermanence and Versionless APIs
v1.0
v1.5
v2.0
Versioning for APIs
[Timeline, 2008 to 2020: versions 1.0, 1.5, 2.0, then 3.0? 4.0? 5.0?]
Eliminate Versioning?
[Timeline, 2008 to 2020: versions 1.0, 1.5, 2.0, then New Architecture]
[Diagram: Java API, network border, and services: Recommendations, Movie Data, Similar Movies, Auth, Member Data, A/B Tests, Start-up, Ratings]
Act Fast, React Fast
Favor Velocity Over Completeness
Delivery Using Buckets
Testing
Production Traffic
Old Code (Baseline) New Code (Canary)
~1% Traffic
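One way to send about 1% of production traffic to a canary is to hash each request into a fixed number of buckets. The slides do not say how the split is made, so the hashing scheme, the bucket count and the function names below are assumptions for illustration only.

```python
import hashlib


def bucket_for(request_id: str, buckets: int = 100) -> int:
    """Map a request id to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def route(request_id: str, canary_buckets: int = 1) -> str:
    """Send roughly canary_buckets out of 100 buckets to the new code."""
    if bucket_for(request_id) < canary_buckets:
        return "canary"
    return "baseline"
```

Because the bucket depends only on the request id, the same id always lands on the same side, and raising canary_buckets widens the rollout without reshuffling earlier assignments.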
Deployments
Old Code New Code
Production Traffic
Enable Others to
Act Fast, React Fast
[Diagram: Java API, network border, and services: Recommendations, Movie Data, Similar Movies, Auth, Member Data, A/B Tests, Start-up, Ratings. Endpoints are dynamically deployed; libraries are statically deployed.]
Dependency Canaries
Personalization Service
• Build
• Test
• Deploy Service
• Release Lib (Pers. Lib)
API
• Integrate Lib
• Build
• Test
• Deploy Service
UI Script: Access Data
Iterations in Hours or Days
Personalization Service
• Build
• Test
• Deploy Service
• Release Lib (Pers. Lib)
• Publish to API
API
• Integrate Lib
• Build
• Test
• Deploy Service
UI Script: Access Data
Iterations in Minutes?
Internal Developers Need
Engagement Too
Documentation
Tools
REPL
Trainings
Failure is Inevitable
~5,000,000,000
Requests per day
~35
Dependencies
~600
Libraries
Things will break!
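A rough calculation shows why failure is inevitable at this scale. The 99.99% per-dependency availability is an assumed figure, not one from the slides, and the sketch also assumes that failures are independent and that every request touches all ~35 dependencies.

```python
def combined_availability(per_dependency: float, dependencies: int) -> float:
    """Probability that all dependencies are up, assuming independent failures."""
    return per_dependency ** dependencies


# Assumed: each of the ~35 dependencies is available 99.99% of the time.
availability = combined_availability(0.9999, 35)

# Requests per day that would hit at least one failing dependency.
failed_per_day = 5_000_000_000 * (1 - availability)
```

Under these assumptions the combined availability drops to roughly 99.65%, which at ~5 billion requests per day would mean millions of affected requests daily unless the API isolates failures.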
Scale at All Costs
[Chart: API Requests Per Month, in billions of requests (axis from 0 to 60), June 2010 to June 2012]
Incoming Traffic
Predictive Auto Scaling
Predicted vs. Actual RPS
Reactive + Predictive Autoscaling
No. of instances
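Combining reactive and predictive scaling can be sketched as taking the larger of two instance counts: one from the predicted RPS and one from the actual RPS. This is a simplified illustration, not the Netflix algorithm; the per-instance capacity, the headroom factor and the function names are assumptions.

```python
import math


def instances_needed(rps: float, rps_per_instance: float) -> int:
    """Smallest instance count that can serve the given RPS (at least one)."""
    return max(1, math.ceil(rps / rps_per_instance))


def desired_instances(
    predicted_rps: float,
    actual_rps: float,
    rps_per_instance: float,
    headroom: float = 1.2,
) -> int:
    """Scale to whichever signal asks for more capacity."""
    predictive = instances_needed(predicted_rps * headroom, rps_per_instance)
    reactive = instances_needed(actual_rps * headroom, rps_per_instance)
    return max(predictive, reactive)
```

The predictive term adds capacity ahead of expected traffic, while the reactive term covers the case where actual traffic exceeds the prediction.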
Strategy Lessons
1. Know Your Audience
2. Separation of Concerns
3. One Size Doesn’t Fit All
4. Be Pragmatic, Not Dogmatic
5. Embrace Change
Implementation Lessons
1. Act Fast, React Fast
2. Enable Others to Act Fast, React Fast
3. Internal Developers Need Engagement Too
4. Failure is Inevitable
5. Scale at All Costs
http://github.com/Netflix

Oscon2014 Netflix API - Top 10 Lessons Learned

Editor's Notes

  1. The lessons that we discuss in these slides fall into two buckets: strategy and implementation.
  2. In some cases, the audience will be a small set of known developers (SSKDs). These developers are generally engineers within your company or at a company with which you are partnering.
  3. In other cases, the audience may be a large set of unknown developers. This audience is typically associated with public APIs.
  4. And in some cases, the API will target both audience types.
  5. This is a short list of the things that the target audience will influence.
  6. For Netflix, we started out with a public API, with the audience being a large set of unknown developers. There were no internal use cases at launch.
  7. Based on the target audience of unknown developers, we staffed accordingly. The team was relatively small, with skills around development, evangelism, partnering, testing and documentation.
  8. As streaming became more critical to the company, we started having devices use the API. Our first mistake was that we were probably too late to pivot our architecture based on our change in target audience. At the time, we had many devices call into our REST API, the same one that we used for the unknown developers.
  9. But eventually, the data demonstrated that the architectural change was needed. This chart shows that the private API completely dwarfs the public API in terms of requests. The private API does about five billion requests per day while the public API does between one and two million. This disparity clearly demonstrates the need for us to target the API to the small set of known developers – Netflix’s UI engineers – who build the vast majority of the experiences on Netflix devices.
  10. Given the shift in responsibilities, we positioned the team accordingly, hiring for skills mostly around engineering.
  11. And the team size grew by about 6x in the last few years. If the target audience was still the public API, it is likely that the team size would have grown, but less significantly (perhaps 2x) in that time frame.
  12. API consumers care a lot about data formatting and delivery, but each consumer, in such a diverse ecosystem, cares about them differently. Some devices may want an XML payload delivered as a complete document, while others may need JSON, protocol buffers, or some other format, potentially delivered as streamed bits. Because of these diverse needs, we need to separate out the concerns to better enable the consumers to get what they need.
  13. Most companies focus on a small handful of device implementations, most notably Android and iOS devices.
  14. At Netflix, we have more than 1,000 different device types that we support. Across those devices, there is a high degree of variability. As a result, we have seen inefficiencies and problems emerge across our implementations. Those issues also translate into issues with the API interaction.
  15. For example, screen size could significantly affect what the API should deliver to the UI. TVs with bigger screens can potentially fit more titles and more metadata per title than a mobile phone. Do we need to send all of the extra bits for fields or items that are not needed, requiring the device itself to drop items on the floor? Or can we optimize the delivery of those bits on a per-device basis? Different devices have different controllers as well. Some, like the iPad, allow for fast swipe interactions, so the content needs to be there for the entire row. Other devices, like smart TVs or some game consoles, have LRUD controllers, which at least gives the opportunity to fetch the data as the row gets navigated. And the technical capabilities of the devices will influence the interactions as well. Some have more computing power or memory, which will influence how much data you can process on the device vs. how much needs to be gathered in real time.
  16. We evolved our discussion towards what ultimately became a discussion between resource-based APIs and experience-based APIs.
  17. The original one-size-fits-all API was very resource oriented with granular requests for specific data, delivering specific documents in specific formats.
  18. The interaction model looked basically like this, with (in this example) the PS3 making many calls across the network to the OSFA API. The API ultimately called back to dependent services to get the corresponding data needed to satisfy the requests.
  19. We have decided to pursue an experience-based approach instead. Rather than making many API requests to assemble the PS3 home screen, the PS3 will potentially make a single request to a custom, optimized endpoint.
  20. In an experience-based interaction, the PS3 can potentially make a single request across the network border to a scripting layer (currently Groovy), in this example to provide the data for the PS3 home screen. The call goes to a very specific, custom endpoint for the PS3 or for a shared UI. The Groovy script then interprets what is needed for the PS3 home screen and triggers a series of calls to the Java API running in the same JVM as the Groovy scripts. The Java API is essentially a series of methods that individually know how to gather the corresponding data from the dependent services. The Java API then returns the data to the Groovy script, which then formats and delivers the very specific data back to the PS3.
  21. Our original REST API had granular endpoints and generic interaction models. This leads to different versions when significant changes are made. The REST API had three primary versions before our move to the experience-based API.
  22. If we had persisted with the REST API, we very likely would have continued to add versions while needing to support the old ones. The need to support prior versions stems from older device implementations that may not be able to be updated or retired, thus forcing us to maintain these endpoints for a long time (perhaps as long as 10 years).
  23. Our target with the experience-based API was to build an architecture that allowed us to be versionless. Through SSKDs, separation of concerns, abstraction layers, and interaction optimizations, we are able to move to a deprecation model.
  24. The primary goal is to limit versioning in the device-to-server interaction. Ideally, we can deprecate effectively in the server interactions as well, but that is sometimes more difficult. Back to our architecture view, the data can now flow from the services into the Java APIs. We expose granular methods (think data elements rather than resources) to the scripting tier. If a method needs to change, we can add a new method and then work closely with the SSKDs to migrate the calling scripts, enabling us to deprecate the old method. If we are not able to move the scripts, we can insulate the devices from the change either in the Java layer or in the scripting tier.
  25. Several years ago, we were deploying changes roughly every two weeks. We would accumulate changes over that time and then drop them into production all at once. Think of it as gathering water in a bucket.
  26. What we found was that our releases were unpredictable, sometimes resulting in outages, broken functionality, or incomplete work. Accordingly, we decided to slow down, changing our release cycles to three weeks. We figured that would give us more time to test our work. In other words, we got a larger bucket.
  27. Over time, however, we learned that the longer release cycle didn’t improve predictability or quality. Instead, it just slowed us down. In response, we moved aggressively towards continuous delivery. Instead of delivering water in buckets, we had a steady stream of water from a hose. This enabled us to have smaller changes, more isolated and testable, pushed to production instead of having bigger releases with more complexity.
  28. This is how code flows through the system. We have multiple canary releases per day. Internal envs are deployed ~8 times/day in 3 AWS regions. Prod deployments happen 2-3 times/week and can be triggered on demand.
  29. This dashboard lets us track the status of our master branch at any time. Builds that fail at any step in the pipeline are stopped from going further.
  30. A quick word on Testing. We follow the ‘Operate what you Build’ model where developers are responsible for shepherding their changes all the way through to production. We provide them with the tools necessary to help them gain confidence in the quality of their code. One such tool is the automated Canary Analyzer.
  31. Canary Analysis is the process wherein a small percent of traffic is routed to the new code and its performance is compared against the old code based on 1000s of metrics.
  32. A detailed report gives further insight into potential problem areas. In this case, our canary gives a score of 87%, which means it is likely not ready for release.
  33. In tandem with canaries, we use Red/Black deployments as well.
  34. The Red/Black process allows us to run production code in one cluster while we spin up the new code in a second one. As the new code proves itself, we can route all traffic to it and eventually shut down the old. It also allows us to have a fast, automated rollback in the event that the new code is seeing problems.
  35. Our architecture enables us to move faster because of the scripting tier. But this also put us in position to help our consuming teams and dependency teams to move faster as well.
  36. Let’s peek under the hood of the API Server. Client teams deploy endpoints dynamically based on their own schedule. Their cycles are completely asynchronous of server deployments. Newly deployed endpoints are live and ready to take traffic within minutes.
  37. Endpoint Activity Dashboard shows recent deployment activity. Rollbacks can be performed in a matter of minutes as well.
  38. Our dependent services provide us with client libraries that get compiled into our JVM upon deployment. These libraries typically expose static interfaces, which means changes to the interfaces require coding and deployments within our container. Similar to the dynamic endpoints, we also have an opportunity to improve the nimbleness and velocity around these libraries.
  39. One such improvement is dependency canaries, where we evaluate our new code against the dependencies. This is a dashboard that provides insights into these canaries.
  40. Making the interaction with the consumers of the API dynamic has led to increased agility on the UI side. We are also exploring ways to increase the speed of iteration on the dependencies side. The current interaction model uses static domain models and client libraries to handle the data flow through the API. This results in long iteration cycles for even the simplest of use cases. We are actively pursuing an approach where our dependencies will be able to expose new data through a dynamic pass-through model based on a dictionary of key-value pairs.
  41. The idea is that this model will avoid the static update cycle on the API end, thereby resulting in shorter iteration cycles. This will require investment in things like safety checks and discoverability of the API. We are instrumenting the API layer to inspect traffic at runtime and provide insights into API usage.
  42. One of the early mistakes that we made in this new architecture was not treating internal developers like we did public developers. We don’t need the same degree of evangelism, but we do need to maintain strong communications with the client teams while providing robust tools and systems to help them be better developers in our system. An example of us being late to this is represented by our endpoint dashboard. One of our teams went from having about 30 scripts to about 500 in a matter of weeks. Each of these scripts is dynamically compiled into the JVM, occupying permgen space. As the script count shot up, we hit limits in our permgen, which resulted in an outage. And an outage in our layer means people cannot stream Netflix. Of course, there is nothing like an outage to kickstart new behaviors. As a result, we immediately set up alerts and then focused more heavily on building tools to support the developers.
  43. Included in that effort is comprehensive documentation.
  44. We built an array of tools as well, including this REPL.
  45. And prepared frequent trainings and videos.
  46. Nobody has a 100% SLA, so things will fail.
  47. In fact, a few years ago, we had many failures on a routine basis.
  48. Many of those failures were a result of failures in a dependent service that we did a poor job of protecting against. Because we are the last step before delivering content to the customers, we have a unique opportunity to help protect customers from such failures.
  49. Hystrix allows us to be resilient to failure by implementing the bulkhead and circuit breaker patterns. Hystrix is open source and available at our GitHub repository.
  50. Failure Simulation and Game Day exercises are a key part of the overall story. The Simian Army is a fleet of monkeys that simulate failures and alert us to non-conformities in an automated manner. Chaos Monkey periodically terminates AWS instances in production to see how the system responds to the instance disappearing. Latency Monkey introduces latencies and errors into a service to see how it responds and lets us assess the customer quality of experience. Conformity Monkey alerts us to variations in application versions across regions. The monkeys are also available in our open source GitHub repository.
  51. Because of our pivot to the private API and the explosion of devices consuming it, our traffic grew tremendously in a few years (and continues to grow at very fast rates). Scaling our systems to support this growth is absolutely critical to the success of the company. Techniques such as throttling are not an option because they only serve to limit the interactions from our streaming subscribers. Instead, we need to be able to handle any load that our devices throw at us. This manifests in many ways, but the following is a detail on one of them – instance scaling.
  52. Let’s go back to the traffic chart. The pattern is predictable, with higher peaks on the weekends.
  53. To offset these limitations, we created Scryer (not yet open sourced, but in production at Netflix). Scryer evaluates needs based on historical data (week-over-week and month-over-month metrics), adjusts instance minimums based on algorithms, and relies on Amazon Auto Scaling for unpredicted events.
  54. This graph shows that Scryer’s predictions are in line with actual RPS. In production, Scryer allows us to get instances into production prior to the need. This differs from Amazon’s reactive autoscaling engine, which triggers the ramp-up based on immediate need and then has to wait until server start-up is complete. Because the instances are there in advance, Scryer smooths out load averages and response times, which in turn improves the customer experience.
  55. This is an example of what Scryer looks like during an outage. When actual traffic dropped because of an outage, the reactive autoscaling engine would have downsized the farm. In this case, Scryer kept the farm sized correctly so that we were able to deal with the traffic spike after the recovery.
  56. As a side benefit (not the initial intent), Scryer also allows us to be more precise with our instance counts, reducing inefficiencies.
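
Notes 19 and 20 describe the experience-based interaction: one request to a device-specific endpoint, which calls granular data-gathering methods in the same process and returns only what that device needs. The following is a minimal sketch of that idea in Python rather than Groovy and Java. Every name in it (`get_user_lists`, `get_titles`, `DeviceProfile`, `home_screen`) is hypothetical and the data is canned; it illustrates the shape of the interaction, not Netflix's actual code.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the granular data-gathering methods.
# Real versions would call dependent services; these return canned data.
def get_user_lists(user_id: str) -> list[str]:
    return ["Top Picks", "Recently Watched"]

def get_titles(list_name: str) -> list[dict]:
    return [
        {"id": f"{list_name}-{i}", "title": f"Title {i}",
         "synopsis": "Synopsis text", "rating": 4.0}
        for i in range(40)
    ]

@dataclass(frozen=True)
class DeviceProfile:
    """Assumed per-device needs: row length and the fields the UI renders."""
    name: str
    titles_per_row: int
    fields: tuple[str, ...]

PS3 = DeviceProfile("ps3", 20, ("id", "title", "synopsis", "rating"))
PHONE = DeviceProfile("phone", 6, ("id", "title"))

def home_screen(user_id: str, device: DeviceProfile) -> dict:
    """One endpoint call: gather, trim, and format for a single device."""
    rows = []
    for list_name in get_user_lists(user_id):
        # Gather in-process, then keep only what this device can display.
        titles = get_titles(list_name)[: device.titles_per_row]
        rows.append({
            "row": list_name,
            "titles": [{k: t[k] for k in device.fields} for t in titles],
        })
    return {"device": device.name, "rows": rows}
```

Calling `home_screen("u1", PHONE)` yields shorter rows with fewer fields than `home_screen("u1", PS3)`, so the device never has to drop unneeded items on the floor.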
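
Note 49 names the circuit breaker pattern. As a rough illustration of the pattern itself (this is not Hystrix's API, and it leaves out bulkheading entirely), the sketch below stops calling a failing dependency after a number of consecutive failures and serves a fallback until a timeout elapses. The class name, the thresholds, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker sketch; thresholds are illustrative."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: skip the dependency and answer from the fallback.
                return fallback()
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each dependency call, for example `breaker.call(lambda: fetch_ratings(user), lambda: [])`, where `fetch_ratings` is a hypothetical client method and the empty list is a degraded but usable response.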
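
Notes 53 to 55 describe Scryer as setting instance minimums from historical traffic while leaving unpredicted events to reactive autoscaling. The notes do not say which algorithms Scryer uses, so the sketch below substitutes a deliberately simple one: average the same hour of the week across past weeks, add headroom, and convert to an instance count. The function names, the 168-slot weekly layout, the headroom factor, and the per-instance capacity are all assumptions.

```python
import math
from statistics import mean

def predict_rps(history, hour_of_week):
    """Average RPS for one hour-of-week slot across past weeks.

    history: list of weeks, each a list of 168 hourly RPS values (assumed layout).
    """
    return mean(week[hour_of_week] for week in history)

def instance_minimum(predicted_rps, rps_per_instance, headroom=1.2):
    """Instances to have running before the predicted load arrives."""
    return max(1, math.ceil(predicted_rps * headroom / rps_per_instance))

def desired_capacity(predicted_minimum, reactive_count):
    """Prediction sets the floor; the reactive engine can only add to it."""
    return max(predicted_minimum, reactive_count)
```

Taking the maximum of the two counts reflects the outage case in note 55: when live traffic collapses, a reactive engine alone would shrink the farm, while the history-based floor keeps capacity in place for the recovery spike.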