Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
Slides from Tim Bozarth's (@timbozarth) QCon 2017 presentation (https://qconnewyork.com/ny2017/presentation/zero-production-ready-minutes)

Abstract:
The fabric of Netflix's approach to building new highly-available services is evolving. The Runtime Platform Team is focused on improving developer productivity while simultaneously making it simpler to build and maintain the high-availability services that Netflix expects. Starting with application generation, and leveraging a new approach to communication between services (RPC), we're simplifying what's needed to build a fast, reliable, and optimized service capable of delivering a fantastic customer experience.

We'll be sharing how Netflix is enabling engineers to go from "zero" to "production ready" in minutes, incorporating best practices learned through years in the cloud. We will also share the story of transitioning from our home-grown RPC machinery to open-source standards, how we recognized when it was the right time to walk away from our own creations, and how our new approach is improving team velocity across Netflix engineering.

Published in: Engineering

  1. From Zero to Production-Ready in Minutes Tim Bozarth @timbozarth
  2. Dev Experience: Level up your Eng Effectiveness
  3. Agenda 1. It was the best of times... 2. Best practices made easy 3. Goodbye hand-written clients 4. From NIH to OSS
  4. It was the best of times… (ie: The story of the skeletons in our closet) 1
  5. About Netflix.. 100m+ members 1000+ developers 190+ countries 1/3 US download traffic 500+ microservices Over 100,000 VMs
  6. Runtime Platform Enable developers to productively create and integrate software in the Netflix ecosystem.
  7. Major Investments in Platform
  8. High Availability = Winning Moments of Truth
  9. High Availability =
  10. Challenges: Hard to take advantage of evolving best practices Owning client-side logic is complex and stressful Non-Java experience is hard
  11. Challenges:
  12. Productivity++ (availability is table stakes)
  13. Complexity is the mind killer.
  14. Runtime Platform Enable developers to productively create and integrate software in the Netflix ecosystem.
  15. Best-practices made easy (Better living through less complexity) 2
  16. Generators
  17. Generators What: Gives you a deployed app on the “paved road” in minutes.
  18. Generators Why: To make it easy to adopt, understand, and build production-ready apps.
  19. + Best Practices
  20. Historically: “Let’s go!”
  21. With Generators: “Let’s go!”
  22. + + + + + =
  23. But wait! There’s more! (Consistency)
  24. Components != PaaS
  25. Goodbye hand-written client libraries 3
  26. @Netflix every service owner is responsible for a client
  27. Clients defend themselves from failure (and the foundation to much of Netflix’s micro-service success)
  28. Your service Their service Your Client
  29. RPC Internals Platform Integration Serialization & Deserialization Bespoke business logic Your Client Your service Includes integration with Metrics, Caching, Discovery, Fallbacks, etc...
  30. Your Service RPC Internals Dependencies RPC Internals Platform Integration Serialization & Deserialization Bespoke business logic Server Logic Platform Integration Serialization & Deserialization Your Client Their Server
  31. Problems Server-API changes are a nightmare So much hand-written RPC-related code No cross-language client story
  32. These are solvable problems
  33. +
  34. 2 big wins: Code Generation New Abstraction Layer
  35. PROTO
  36. Your Service gRPC generated interfaces Dependencies gRPC Generated Client Bespoke business logic (please no) Server Logic Your Client Their Server Service Proto
  37. Caching, Circuit-breakers, Fallbacks, Failure Injection, Discovery, Request-context-tracing, Metrics, Retries, Hedged requests
  38. Interceptors encapsulate common patterns (outside the user’s typical concern domain)
  39. Client Defense Examples: • Fallbacks • Advanced Caching • Retries • Failure Injection • Hedged Requests • Circuit Breakers (Hystrix) • Common analytics & event-logs • ... and much more
  40. Complex, multi-tier caching took a lot of code.
  41. (In proto) (In client config)
  42. gRPC ❤ languages!
  43. NIH → OSS 4
  44. Value Effort
  45. Value Effort
  46. With every step comes the decision to take another.
  47. Inertia is a powerful force, and a terrible strategy.
  48. Favor commodity when it’s not our core competency (oh right! AWS!)
  49. Wrapping up… Ω
  50. Everything discussed is done gRPC = 10%+ of Netflix RPC 800+ projects made with generators 100+ services currently deployed from generators This stuff = Default for 6-12 months
  51. Code generation is the short & long term solution IDLs = micro-services’ best friend Don’t build stuff you don’t need to
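
The slides say that interceptors encapsulate common client-defense patterns (retries, fallbacks, metrics) so that service owners do not hand-write them in every client. The idea can be illustrated with a minimal sketch in plain Python. This is not Netflix's implementation and it does not use the gRPC library; every name below (`retry`, `fallback`, `metrics`, `build_client`, `FlakyTransport`) is hypothetical and chosen only for illustration. A "call" is modeled as a function from a request dict to a response dict, and an interceptor is a function that wraps one call in another.

```python
from typing import Callable, Dict

# A "call" takes a request dict and returns a response dict.
Call = Callable[[dict], dict]
# An interceptor wraps one call and returns a new call with extra behavior.
Interceptor = Callable[[Call], Call]


def retry(max_attempts: int) -> Interceptor:
    """Retry the wrapped call on ConnectionError, up to max_attempts tries."""
    def wrap(next_call: Call) -> Call:
        def call(request: dict) -> dict:
            for attempt in range(1, max_attempts + 1):
                try:
                    return next_call(request)
                except ConnectionError:
                    if attempt == max_attempts:
                        raise  # out of attempts: let outer interceptors decide
            raise AssertionError("unreachable when max_attempts >= 1")
        return call
    return wrap


def fallback(default: dict) -> Interceptor:
    """Return a canned response if the wrapped call ultimately fails."""
    def wrap(next_call: Call) -> Call:
        def call(request: dict) -> dict:
            try:
                return next_call(request)
            except ConnectionError:
                return default
        return call
    return wrap


def metrics(counters: Dict[str, int]) -> Interceptor:
    """Count every attempt and every failed attempt in a shared dict."""
    def wrap(next_call: Call) -> Call:
        def call(request: dict) -> dict:
            counters["calls"] = counters.get("calls", 0) + 1
            try:
                return next_call(request)
            except Exception:
                counters["errors"] = counters.get("errors", 0) + 1
                raise
        return call
    return wrap


def build_client(transport: Call, *interceptors: Interceptor) -> Call:
    """Compose interceptors so that the first one listed is outermost."""
    call = transport
    for interceptor in reversed(interceptors):
        call = interceptor(call)
    return call


class FlakyTransport:
    """Hypothetical stand-in for a generated stub: fails the first N calls."""
    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.attempts = 0

    def __call__(self, request: dict) -> dict:
        self.attempts += 1
        if self.attempts <= self.failures:
            raise ConnectionError("simulated outage")
        return {"title": "title-for-" + str(request["id"])}


# Two transient failures are absorbed by three retry attempts.
counters: Dict[str, int] = {}
flaky = FlakyTransport(failures=2)
client = build_client(flaky, fallback({"title": "unknown"}), retry(3), metrics(counters))
recovered = client({"id": 7})

# A service that stays down exhausts the retries and hits the fallback.
down = FlakyTransport(failures=10)
degraded = build_client(down, fallback({"title": "unknown"}), retry(3))({"id": 7})
```

The ordering is the main design choice in this sketch: `fallback` is outermost so it only fires after retries are exhausted, and `metrics` is innermost so it counts each individual attempt rather than each logical request. The calling code never mentions any of these concerns, which is the sense in which they sit outside the user's typical concern domain.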
