Gluecon 2013: Netflix API Crash Course

Presentation from Gluecon 2013 on building and running the Netflix API.

  1. Netflix API Crash Course: Building & Running the API in 30 minutes. Ben Schmaus, Netflix. May 2013, Gluecon. bschmaus@netflix.com, @schmaus
  2. Streaming TV Shows & Movies Globally
  3. > 1000 Devices
  4. 1/3 of Internet at peak
  5. Programmer not Distributor
  6. More than 36 million subscribers in over 40 countries
  7. How does the API fit into the picture?
  8. Personalization Engine, User Info, Movie Metadata, Ratings, Similar Movies, Instant Queue, A/B Test Engine, API
  9. Personalization Engine, User Info, Movie Metadata, Ratings, Similar Movies, Instant Queue, A/B Test Engine, API. Enable UX Innovation. Insulate from Failure.
  10. > 2 Billion Requests per Day
  11. Growth Over Time
  12. Automation, Visibility, Operational awareness, Balance speed & quality
  13. How's the API put together?
  14. ELB, Routing Cluster, Mid-tier Services, Backend App Cluster, Backend App Cluster, + API Layer
  15. ELB, Routing Cluster, Mid-tier Services, Backend App Cluster, Backend App Cluster, + API Layer
  16. Inside an API App Server: RxJava, Hystrix, Service Client 1, Service Client 2, Service Client N
  17. Hystrix, Rx + Java Service Layer, Service Client (provided JAR), Application Service, /device/endpoint (provided script), Service, UI Teams, Mid-tier Service Teams, API Team
  18. Continually changing UI scripts and mid-tier services. Functionality, resiliency and performance drift over time.
  19. Deployment & Ops
  20. REMOVE MANUAL WORK pushing code to multiple AWS regions/clusters. ENABLE RAPID DEPLOYMENT of code despite limited visibility into how it's changed. KEEP TEAM INFORMED about what's happening in prod. MITIGATE RISK of systemic failure.
  21. Tools
  22. End-to-end Traceability Using Python/Java Glue
  23. Code Flow
  24. Run 1% of your traffic on the new code and see how it does
  25. API ami-123, API ami-456: 2xx, 4xx, 5xx, latency, busy threads, load...
  26. Manually looking at graphs and SSH-ing into servers and grep-ing logs doesn't scale (although we used to do that)
  27. Confidence score for each AMI based on comparison of 1000+ metrics
  28. Scannable visualization of metric space: More important, Less important
  29. Cross-reference Jira, Link to code diffs
  30. Track lib changes
  31. Easy to access report artifacts for each AMI
  32. Your basic red/black push
  33. Doing red/black by hand for multiple clusters across multiple regions is not fun
  34. Automate multi-cluster/region pushes
  35. Automate multi-cluster/region pushes. Don't forget to automate rollbacks, too!
  36. $Who, $What, $Where, $When, e.g., "bschmaus, ami-123, Sandbox Canary, 2013-05-06 19:05". Latest prod change in chat topic.
  37. Quickly see status of all clusters in a region
  38. What the #%*! just happened!?
  39. Historical & realtime metrics, sort realtime by error/request rate
  40. Distributed grep + tail
      2013-05-09.20:38:54 MX 200 us-east-1c i-1824cb73 i-1c61b77f prod NFPS3-001-8G50FJCX... 288404769389848058 90ms api-global.netflix.com GET /tvui/release/470/plus/pathEvaluator -
      amazon.ami-id: ami-502eb039
      amazon.availability-zone: us-east-1c
      amazon.instance-id: i-1824cb73
      amazon.instance-type: m2.2xlarge
      amazon.local-ipv4: 10.6.213.112
      amazon.public-hostname: ec2-54-243-4-69.compute-1.amazonaws.com
      amazon.public-ipv4: 54.243.4.69
      cookie_esn: NFPS3-001-8G50FJCX...
      country: MX
      currentTime: 1368131934468
      duration-millis: 90
      esn: NFPS3-001-8G50FJCX...
      geo.city: CIUDADOBREGON...
      $ ./simple_stream.py -f -q e["country"]=="MX" && e["esn"]==~/NFPS3.*/ -r us
  41. Go for haystack handing you the needle
  42. Or at least be able to make smaller haystacks
  43. Continuously experiment to make hard things easier
  44. Even with the best tools, building software is hard work. Great engineers build great software.
  45. Want to help us build the API? bschmaus@netflix.com, @schmaus
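
Slides 24 to 27 describe sending 1% of traffic to a new AMI and deriving a confidence score by comparing its metrics against the current AMI. The deck does not say how that score is computed, so the following is only a minimal sketch under stated assumptions: every metric is treated as "lower is better", a metric passes when the canary is no more than 10% worse than the baseline, and the score is the weighted share of passing metrics. The `Metric` class, the `tolerance` value, the weights, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metric:
    name: str
    baseline: float      # value observed on the current AMI
    canary: float        # value observed on the candidate AMI
    weight: float = 1.0  # hypothetical importance weight


def metric_passes(m: Metric, tolerance: float = 0.10) -> bool:
    """Assume lower is better: pass if the canary is at most `tolerance` worse."""
    return m.canary <= m.baseline * (1.0 + tolerance)


def confidence_score(metrics: List[Metric], tolerance: float = 0.10) -> float:
    """Weighted percentage (0 to 100) of metrics on which the canary passes."""
    total = sum(m.weight for m in metrics)
    if total == 0:
        raise ValueError("no weighted metrics to compare")
    passed = sum(m.weight for m in metrics if metric_passes(m, tolerance))
    return 100.0 * passed / total


# Made-up sample values, loosely named after the metrics on slide 25.
sample = [
    Metric("5xx_rate", baseline=0.010, canary=0.010, weight=3.0),
    Metric("latency_ms", baseline=90.0, canary=95.0, weight=2.0),
    Metric("busy_threads", baseline=40.0, canary=60.0, weight=1.0),
]
score = confidence_score(sample)
print(f"confidence: {score:.1f}")
```

Weighting the metrics is one way to reflect the "more important / less important" split on slide 28; a real system comparing 1000+ metrics would presumably need statistical tests rather than a fixed threshold.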

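
The query on slide 40 keeps only events whose `country` is `MX` and whose `esn` matches `NFPS3.*`. The internals of `simple_stream.py` are not shown in the deck, so this is just a sketch of the same predicate applied to a stream of event dictionaries. The function names and the sample events (including the ESN values) are hypothetical, and `=~` is assumed to mean an unanchored regex search.

```python
import re
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, str]


def make_query(country: str, esn_pattern: str) -> Callable[[Event], bool]:
    """Build a predicate like: e["country"] == country && e["esn"] =~ /pattern/."""
    esn_re = re.compile(esn_pattern)

    def predicate(e: Event) -> bool:
        # Missing fields never match, so incomplete events are dropped.
        return (
            e.get("country") == country
            and esn_re.search(e.get("esn", "")) is not None
        )

    return predicate


def grep_events(events: Iterable[Event],
                predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Lazily yield matching events, so it also works on an unbounded stream."""
    for e in events:
        if predicate(e):
            yield e


# Hypothetical events; field names follow the log excerpt on slide 40.
events = [
    {"country": "MX", "esn": "NFPS3-001-EXAMPLE", "duration-millis": "90"},
    {"country": "US", "esn": "NFPS3-001-OTHER", "duration-millis": "120"},
    {"country": "MX", "esn": "OTHERDEVICE-EXAMPLE", "duration-millis": "75"},
]
matches = list(grep_events(events, make_query("MX", r"NFPS3.*")))
for m in matches:
    print(m["country"], m["esn"], m["duration-millis"] + "ms")
```

Filtering with a generator keeps memory use flat, which matters when the input is a live tail of logs from many instances rather than a finite list.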