Transforming the Netflix API
Ben Schmaus, Netflix
October 2013, API World
bschmaus@netflix.com || @schmaus
Streaming TV Shows & Movies Globally
> 1000 Devices
1/3 of
Internet
at peak
Exclusive
Content
Almost 38 million
subscribers in over
40 countries
Personalization
Engine
User Info
Movie
Metadata
Ratings
Similar
Movies
Instant
Queue
A/B Test
Engine
API
Personalization
Engine
User Info
Movie
Metadata
Ratings
Similar
Movies
Instant
Queue
A/B Test
Engine
API
Enable UX
Innovation
Insulate from Failure
It wasn’t always this way
Foster
Innovation
from
Outside
REST for Easy Interop, Integration
RESTful
API
REST for Easy Interop, Integration
RESTful
API
Model resources like users, movies, series,
ratings, etc
REST for Easy Interop, Integration
RESTful
API
/catalog/titles
/catalog/titles/movies
/catalog/titles/series
/catalog/titles/series/episodes
REST for Easy Interop, Integration
RESTful
API
/users/lists
/users/title_states
/users/ratings
Enter
Streaming
Devices
Lots of devices, lots of variety
across platforms
External apps
1 /apps/{app_id}/config
1 /users/{user_id}
1 /users/{user_id}/queues/instant/available
1 /users/{user_id}/lists
26 /catalog/titles?...
Doesn’t include JS, CSS, images,
etc.
1 /apps/{app_id}/config
1 /users/{user_id}
1 /users/{user_id}/queues/instant/available
1 /users/{user_id}/lists
26 /catalog/titles?...
Device variance: interaction models
Device variance: interaction models
& form factors
A/B Tests Add to the Challenge
“Bolt on” functionality
expand=@queue_item@title,@box_art_exists,
@short_synopsis,@directors,@cast,@episodes,@formats,
@format_availability,@subtitle_languages,
@languages_and_audio,@maturity_ratings
Resource-based API wasn’t scaling
Model Resources
/catalog, /users
Model Experiences
/tv/home
Reduce network chattiness
Support device optimizations
Enable fast dev iterations
GET
/users/{user_id}/lists
apiGateway.getLists(userId)
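The slide above pairs an HTTP resource call with an in-process method call. A minimal sketch of that idea follows, written in Python rather than the Java and Groovy stack the deck describes. The `ApiGateway` class, its `get_lists` method, and the `handle_tv_home` handler are hypothetical names chosen for illustration, not the actual Netflix interfaces.

```python
class ApiGateway:
    """Hypothetical in-process facade over mid-tier services."""

    def __init__(self, lists_by_user):
        # Stand-in data source; a real gateway would call backend services.
        self._lists_by_user = lists_by_user

    def get_lists(self, user_id):
        """In-process analogue of GET /users/{user_id}/lists."""
        return self._lists_by_user.get(user_id, [])


def handle_tv_home(gateway, user_id):
    """Hypothetical handler for /tv/home that assembles a device-ready payload."""
    lists = gateway.get_lists(user_id)
    return {"user": user_id, "rows": [{"title": name} for name in lists]}
```

The point of the sketch is only that the handler runs on the server and calls the gateway as an ordinary method, so no extra network hop is needed per resource.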
Discrete HTTP requests pay WAN tax repeatedly
Single, optimized request; pay WAN tax once
Single, optimized request; pay WAN tax once
Client data assembly
logic pushed to server
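A rough back-of-the-envelope model of the "WAN tax" can make the difference concrete. The request counts below come from the external-app slide earlier in the deck (30 requests in total). The 100 ms WAN round trip and 5 ms of server time per call are assumed numbers for illustration only, and the model assumes the chatty client issues its requests one after another.

```python
# Request counts per endpoint, as listed on the external-app slide.
REQUESTS = {
    "/apps/{app_id}/config": 1,
    "/users/{user_id}": 1,
    "/users/{user_id}/queues/instant/available": 1,
    "/users/{user_id}/lists": 1,
    "/catalog/titles?...": 26,
}


def chatty_cost_ms(request_count, wan_rtt_ms, server_ms):
    """Every request pays one WAN round trip plus its server time."""
    return request_count * (wan_rtt_ms + server_ms)


def single_request_cost_ms(request_count, wan_rtt_ms, server_ms):
    """One WAN round trip; the server performs the same calls in-process."""
    return wan_rtt_ms + request_count * server_ms


total_requests = sum(REQUESTS.values())  # 30
chatty = chatty_cost_ms(total_requests, wan_rtt_ms=100, server_ms=5)  # 3150
single = single_request_cost_ms(total_requests, wan_rtt_ms=100, server_ms=5)  # 250
print(total_requests, chatty, single)
```

Real clients issue some requests in parallel, so the gap is smaller in practice, but the WAN round trip is still paid many times instead of once.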
Add server-side scripting capability
Enable independent dev and device
optimization
API
Internet /tv/home
API
/tv/home
Client Server
Internet
API
Application
Client Server
Internet /tv/home
API
Client Server
Internet
UI
Teams
Application
/tv/home
API
API
Team
Client Server
Internet
UI
Teams
Application
/tv/home
API
UI
Teams
API
Team
Mid-tier
Services
Client Server
Internet
Application
/tv/home
RxJava Hystrix
API
Team
JavaServiceLayer
Mid-tier
Services
UI
Teams
Client Server
Internet
Application
/tv/home
ELB Zuul
Mid-tier
Services
Scriptable
Backend
Scriptable
Backend
+
API Layer
https://github.com/Netflix/Hystrix
resilience patterns for distributed
systems
https://github.com/Netflix/RxJava
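async, event-based programming
https://github.com/Netflix/zuul
dynamic request routing, realtime monitoring, resiliency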
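One of the resilience patterns Hystrix is known for is the circuit breaker with a fallback. The sketch below illustrates that pattern in plain Python. It is not the Hystrix API; the class name, the thresholds, and the injectable clock are all assumptions made for the example.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker with fallback (not the Hystrix API)."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def _is_open(self):
        if self._opened_at is None:
            return False
        # After the reset window, let one trial call through (half-open).
        return self._clock() - self._opened_at < self.reset_after_s

    def call(self, command, fallback):
        """Run command unless the circuit is open; use fallback on failure."""
        if self._is_open():
            return fallback()
        try:
            result = command()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        # A success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open the failing dependency is not called at all, which is what insulates the API from a struggling mid-tier service.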
Need well-appointed toolkit
Cassandra
upload script.groovy
for /script
Cassandra
Cassandra
activate /script
upload script.groovy
for /script
Cassandra
Cassandra
activate /script
Cassandra
Scriptable
Backend
load /script
upload script.groovy
for /script
Cassandra
Cassandra
activate /script
Cassandra
Scriptable
Backend
load /script
upload script.groovy
for /script
Scriptable
Backend
Device Client
GET /script
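The upload, activate, load, and GET steps in this sequence can be sketched as a small lifecycle. The sketch uses an in-memory dictionary in place of Cassandra and Python source in place of `script.groovy`. The class names, method names, and the convention that a script defines a `handle(request)` function are assumptions for illustration only.

```python
class ScriptStore:
    """In-memory stand-in for the Cassandra-backed script storage."""

    def __init__(self):
        self._sources = {}    # endpoint path -> uploaded script source
        self._active = set()  # endpoint paths that have been activated

    def upload(self, path, source):
        self._sources[path] = source

    def activate(self, path):
        if path not in self._sources:
            raise KeyError(f"no script uploaded for {path}")
        self._active.add(path)

    def load_active(self, path):
        if path not in self._active:
            raise KeyError(f"script for {path} is not active")
        return self._sources[path]


class ScriptableBackend:
    """Loads active scripts and serves requests with them."""

    def __init__(self, store):
        self._store = store
        self._handlers = {}

    def load(self, path):
        namespace = {}
        # Assumed convention: the script defines handle(request).
        exec(self._store.load_active(path), namespace)
        self._handlers[path] = namespace["handle"]

    def get(self, path, request):
        return self._handlers[path](request)
```

Separating upload from activation means a script can be stored without serving traffic until it is explicitly switched on.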
Deployment & Ops
Historical & realtime metrics, sort
realtime by error/request rate
Distributed grep + tail
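2013-05-09.20:38:54 MX 200 us-east-1c i-1824cb73 i-1c61b77f prod NFPS3-001-8G50FJCX...
288404769389848058 90ms api-global.netflix.com GET /tvui/release/470/plus/pathEvaluator -
amazon.ami-id: ami-502eb039
amazon.availability-zone: us-east-1c
amazon.instance-id: i-1824cb73
amazon.instance-type: m2.2xlarge
amazon.local-ipv4: 10.6.213.112
amazon.public-hostname: ec2-54-243-4-69.compute-1.amazonaws.com
amazon.public-ipv4: 54.243.4.69
cookie_esn: NFPS3-001-8G50FJCX...
country: MX
currentTime: 1368131934468
duration-millis: 90
esn: NFPS3-001-8G50FJCX...
geo.city: CIUDADOBREGON
...
$ ./simple_stream.py -f -q 'e["country"]=="MX" && e["esn"] ==~/NFPS3.*/' -r us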
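The query passed to `simple_stream.py` filters events by country and device serial prefix. A minimal Python sketch of the same predicate is shown below. It assumes the query is a Groovy-style expression in which `==~` means a full regular-expression match; the helper names and the second sample event are hypothetical.

```python
import re


def matches_query(event):
    """Python rendering of: e["country"]=="MX" && e["esn"] ==~/NFPS3.*/"""
    esn = event.get("esn", "")
    return event.get("country") == "MX" and re.fullmatch(r"NFPS3.*", esn) is not None


def grep_events(events, predicate):
    """Yield only the events that satisfy the predicate, like a streaming grep."""
    for event in events:
        if predicate(event):
            yield event


sample_events = [
    # Fields copied from the slide; the ESN is truncated there as well.
    {"country": "MX", "esn": "NFPS3-001-8G50FJCX...", "duration-millis": 90},
    # Hypothetical non-matching event.
    {"country": "US", "esn": "HYPOTHETICAL-ESN", "duration-millis": 120},
]
matching = list(grep_events(sample_events, matches_query))
```

Filtering at the source like this is one way to "make smaller haystacks" before anyone reads the log lines.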
Go for haystack handing you
the needle
Or at least be able to make smaller
haystacks
Run 1% of your traffic on the new
code and see how it does
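One simple way to send roughly 1% of traffic to new code is to hash a stable request or device identifier into buckets. The sketch below shows that approach. The deck does not say how the split is actually done, so the hashing scheme and function name are assumptions.

```python
import hashlib


def routes_to_canary(request_id, canary_percent=1.0):
    """Deterministically assign about canary_percent of ids to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the id onto 10,000 buckets so percentages resolve to 0.01%.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < canary_percent * 100
```

Because the assignment depends only on the identifier, the same client keeps hitting the same code version for the duration of the canary.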
API ami-123 API ami-456
2xx
4xx
5xx
latency
busy threads
load
...
Confidence score for each AMI based on
comparison of 1000+ metrics
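A confidence score of this kind can be sketched as the share of metrics on which the canary stays close to the baseline. The scoring rule below (a fixed relative tolerance per metric) is an assumption for illustration and not the scoring method used at Netflix. Metric names follow the comparison slide; the values are made up.

```python
def canary_confidence(baseline, canary, tolerance=0.10):
    """Percentage of shared metrics where canary is within tolerance of baseline."""
    shared = baseline.keys() & canary.keys()
    if not shared:
        raise ValueError("no metrics in common")
    passing = 0
    for name in shared:
        allowed = abs(baseline[name]) * tolerance
        if abs(canary[name] - baseline[name]) <= allowed:
            passing += 1
    return 100.0 * passing / len(shared)


# Hypothetical measurements for the baseline AMI and the canary AMI.
baseline_metrics = {"2xx": 1000, "4xx": 20, "5xx": 2, "latency": 90, "busy threads": 12, "load": 1.5}
canary_metrics = {"2xx": 990, "4xx": 21, "5xx": 6, "latency": 95, "busy threads": 12, "load": 1.6}
score = canary_confidence(baseline_metrics, canary_metrics)
```

In this made-up data only the 5xx count drifts beyond the tolerance, so five of the six metrics pass.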
Scannable visualization of metric space
More
important
Less
important
Your basic red/black push
Doing red/black by hand for multiple
clusters across multiple regions is
not fun
Automate multi-cluster/region
pushes
Automate multi-cluster/region
pushes
Don't forget to
automate
rollbacks, too!
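An automated multi-cluster push with rollback can be sketched as a loop that remembers what each cluster was running before the change. In the sketch below, `Cluster`, `red_black_push`, and `push_everywhere` are hypothetical names, the health check is supplied by the caller, and rolling back every already-pushed cluster on the first failure is an assumed policy.

```python
class Cluster:
    """Hypothetical cluster record tracking which AMI takes traffic."""

    def __init__(self, name, active_ami):
        self.name = name
        self.active_ami = active_ami


def red_black_push(cluster, new_ami, is_healthy):
    """Shift traffic to new_ami; shift back if the health check fails."""
    old_ami = cluster.active_ami
    cluster.active_ami = new_ami
    if is_healthy(cluster):
        return True
    cluster.active_ami = old_ami  # rollback: the old group was kept around
    return False


def push_everywhere(clusters, new_ami, is_healthy):
    """Push to each cluster in turn; undo all earlier pushes on any failure."""
    pushed = []
    for cluster in clusters:
        previous = cluster.active_ami
        if red_black_push(cluster, new_ami, is_healthy):
            pushed.append((cluster, previous))
        else:
            for done, prev in reversed(pushed):
                done.active_ami = prev
            return False
    return True
```

Keeping the previous AMI per cluster is what makes the rollback as automatic as the push.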
$Who, $What, $Where, $When
e.g., "bschmaus, ami-123, Sandbox Canary, 2013-05-06 19:05"
Latest prod change in chat topic
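The change record format on this slide is easy to generate from a deployment tool. The sketch below builds the string shown in the example; the function name is hypothetical, and posting the result to a chat topic is left out.

```python
from datetime import datetime


def format_change(who, what, where, when):
    """Render a change as "$Who, $What, $Where, $When"."""
    return f"{who}, {what}, {where}, {when:%Y-%m-%d %H:%M}"


topic = format_change("bschmaus", "ami-123", "Sandbox Canary", datetime(2013, 5, 6, 19, 5))
```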
Quickly see status of all clusters in a
region
API for APIs
Control network interactions
Device optimization
Development agility
Automation & insights
Continuously experiment to make
hard things easier
bschmaus@netflix.com
@schmaus
Great
Engineers
Wanted!
jobs.netflix.com
API World 2013 - Transforming the Netflix API

The Netflix API has undergone a transformation since its inception in 2008. It has transitioned from being a public API with a generic RESTful interface to a platform for creating highly optimized, device-centric APIs that are critical to delivering the Netflix streaming experience on over 1000 different device types.

This talk covers the design principles that shaped the transformation of the API as well as the technology that powers it, enabling rapid user experience iteration and bringing Netflix streaming to almost 38 million subscribers around the world.

Published in: Technology

API World 2013 - Transforming the Netflix API

  1. Transforming the Netflix API Ben Schmaus, Netflix October 2013, API World bschmaus@netflix.com || @schmaus
  2. Streaming TV Shows & Movies Globally
  3. > 1000 Devices
  4. 1/3 of Internet at peak
  5. Exclusive Content
  6. Almost 38 million subscribers in over 40 countries
  7. Personalization Engine User Info Movie Metadata Ratings Similar Movies Instant Queue A/B Test Engine API
  8. Personalization Engine User Info Movie Metadata Ratings Similar Movies Instant Queue A/B Test Engine API Enable UX Innovation Insulate from Failure
  9. It wasn’t always this way
  10. Foster Innovation from Outside
  11. REST for Easy Interop, Integration RESTful API
  12. REST for Easy Interop, Integration RESTful API Model resources like users, movies, series, ratings, etc
  13. REST for Easy Interop, Integration RESTful API /catalog/titles /catalog/titles/movies /catalog/titles/series /catalog/titles/series/episodes
  14. REST for Easy Interop, Integration RESTful API /users/lists /users/title_states /users/ratings
  15. Enter Streaming Devices
  16. Lots of devices, lots of variety across platforms
  17. External apps
  18. 1 /apps/{app_id}/config 1 /users/{user_id} 1 /users/{user_id}/queues/instant/available 1 /users/{user_id}/lists 26 /catalog/titles?...
  19. Doesn’t include JS, CSS, images, etc. 1 /apps/{app_id}/config 1 /users/{user_id} 1 /users/{user_id}/queues/instant/available 1 /users/{user_id}/lists 26 /catalog/titles?...
  20. Device variance: interaction models
  21. Device variance: interaction models & form factors
  22. A/B Tests Add to the Challenge
  23. “Bolt on” functionality expand=@queue_item@title,@box_art_exists, @short_synopsis,@directors,@cast,@episodes,@formats, @format_availability,@subtitle_languages, @languages_and_audio,@maturity_ratings
  24. Resource-based API wasn’t scaling
  25. Model Resources /catalog, /users Model Experiences /tv/home
  26. Reduce network chattiness Support device optimizations Enable fast dev iterations
  27. GET /users/{user_id}/lists apiGateway.getLists(userId)
  28. Discrete HTTP requests pay WAN tax repeatedly
  29. Single, optimized request; pay WAN tax once
  30. Single, optimized request; pay WAN tax once Client data assembly logic pushed to server
  31. Add server-side scripting capability Enable independent dev and device optimization
  32. API Internet /tv/home
  33. API /tv/home Client Server Internet
  34. API Application Client Server Internet /tv/home
  35. API Client Server Internet UI Teams Application /tv/home
  36. API API Team Client Server Internet UI Teams Application /tv/home
  37. API UI Teams API Team Mid-tier Services Client Server Internet Application /tv/home
  38. RxJava Hystrix API Team JavaServiceLayer Mid-tier Services UI Teams Client Server Internet Application /tv/home
  39. ELB Zuul Mid-tier Services Scriptable Backend Scriptable Backend + API Layer
  40. https://github.com/Netflix/Hystrix resilience patterns for distributed systems https://github.com/Netflix/RxJava async, event-based programming https://github.com/Netflix/zuul dynamic request routing, realtime monitoring, resiliency
  41. Need well-appointed toolkit
  42. Cassandra upload script.groovy for /script
  43. Cassandra Cassandra activate /script upload script.groovy for /script
  44. Cassandra Cassandra activate /script Cassandra Scriptable Backend load /script upload script.groovy for /script
  45. Cassandra Cassandra activate /script Cassandra Scriptable Backend load /script upload script.groovy for /script Scriptable Backend Device Client GET /script
  46. Deployment & Ops
  47. Historical & realtime metrics, sort realtime by error/request rate
  48. Distributed grep + tail 2013-05-09.20:38:54 MX 200 us-east-1c i-1824cb73 i-1c61b77f prod NFPS3-001-8G50FJCX... 288404769389848058 90ms api-global.netflix.com GET /tvui/release/470/plus/pathEvaluator - amazon.ami-id: ami-502eb039 amazon.availability-zone: us-east-1c amazon.instance-id: i-1824cb73 amazon.instance-type: m2.2xlarge amazon.local-ipv4: 10.6.213.112 amazon.public-hostname: ec2-54-243-4-69.compute-1.amazonaws.com amazon.public-ipv4: 54.243.4.69 cookie_esn: NFPS3-001-8G50FJCX... country: MX currentTime: 1368131934468 duration-millis: 90 esn: NFPS3-001-8G50FJCX... geo.city: CIUDADOBREGON ... $ ./simple_stream.py -f -q 'e["country"]=="MX" && e["esn"] ==~/NFPS3.*/' -r us
  49. Go for haystack handing you the needle
  50. Or at least be able to make smaller haystacks
  51. Run 1% of your traffic on the new code and see how it does
  52. API ami-123 API ami-456 2xx 4xx 5xx latency busy threads load ...
  53. Confidence score for each AMI based on comparison of 1000+ metrics
  54. Scannable visualization of metric space More important Less important
  55. Your basic red/black push
  56. Doing red/black by hand for multiple clusters across multiple regions is not fun
  57. Automate multi-cluster/region pushes
  58. Automate multi-cluster/region pushes Don't forget to automate rollbacks, too!
  59. $Who, $What, $Where, $When e.g., "bschmaus, ami-123, Sandbox Canary, 2013-05-06 19:05" Latest prod change in chat topic
  60. Quickly see status of all clusters in a region
  61. API for APIs Control network interactions Device optimization Development agility Automation & insights
  62. Continuously experiment to make hard things easier
  63. bschmaus@netflix.com @schmaus Great Engineers Wanted! jobs.netflix.com