How Netflix does Microservices ...
Manuel Correa
Microservices
“Small Autonomous Services
that Work Together”
Sam Newman
Microservices
“Conway’s Law”
“Any organization that designs a system (defined broadly) will produce a design
whose structure is a copy of the organization's communication structure.”
Microservices Principles
http://www.slideshare.net/spnewman/principles-of-microservices-ndc-2014
- Modeled around Business Domain
- Culture of Automation
- Hide Implementation Details
- Decentralize All Things
- Design for Failure
- Highly Observable
- Deploy Independently
Culture of Automation
- Immutable infrastructure in AWS
Decentralize All Things
Hide Implementation Details
- Routing
- Contracts
- Resiliency
- Discovery
- How services work together
NodeJS
Ruby
Clojure
Free for all
Agree
Decentralize All Things
Smart Endpoints and Dumb Pipes
- Dynamic Routing
- Gateway for all Netflix services
- Pluggable system that takes care of:
- Authorization and Authentication
- Monitoring and request tracking
- Load shedding
- First level of resilience
- Enables caching at the gateway level
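The pluggable design listed above can be pictured as a chain of filters that every request passes through before it is routed. The following is a minimal Python sketch under that assumption. It is not Zuul's API: `Request`, `Response`, `Gateway`, `auth_filter`, `make_load_shedding_filter`, and the `X-Auth-Token` header are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str


# A filter inspects the request and may short-circuit it by returning a Response.
Filter = Callable[[Request], Optional[Response]]
Handler = Callable[[Request], Response]


def auth_filter(request: Request) -> Optional[Response]:
    """Reject requests that carry no (hypothetical) auth token header."""
    if "X-Auth-Token" not in request.headers:
        return Response(401, "unauthorized")
    return None


def make_load_shedding_filter(max_in_flight: int,
                              in_flight: Callable[[], int]) -> Filter:
    """Build a filter that rejects traffic once too many requests are in flight."""
    def shed(request: Request) -> Optional[Response]:
        if in_flight() >= max_in_flight:
            return Response(503, "overloaded")
        return None
    return shed


class Gateway:
    """Runs every filter in order, then routes by path prefix."""

    def __init__(self, filters: List[Filter], routes: Dict[str, Handler]):
        self.filters = filters
        self.routes = routes  # path prefix -> backend handler

    def handle(self, request: Request) -> Response:
        for apply_filter in self.filters:
            early = apply_filter(request)
            if early is not None:
                return early  # a filter rejected or answered the request
        for prefix, handler in self.routes.items():
            if request.path.startswith(prefix):
                return handler(request)
        return Response(404, "no route")
```

New cross-cutting behavior (monitoring, caching) would be added as another filter in the list rather than as a change to any backend service.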
Decentralize All Things
Service Discovery
- Service Registry
- Middle tier load balancing
- Carries metadata of each service
- Dynamic Service repository
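To make the registry idea concrete, here is a small in-memory Python sketch of a service registry that stores per-instance metadata and hands out instances round-robin for middle-tier load balancing. It is an illustration only, not the API of any Netflix component; the class name `ServiceRegistry`, its methods, and the sample hosts and zones are assumptions.

```python
from typing import Any, Dict, List


class ServiceRegistry:
    """In-memory registry: service name -> list of registered instances."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[Dict[str, Any]]] = {}
        self._cursors: Dict[str, int] = {}

    def register(self, name: str, host: str, port: int, **metadata: Any) -> None:
        """Add an instance together with arbitrary metadata (zone, version, ...)."""
        self._instances.setdefault(name, []).append(
            {"host": host, "port": port, "metadata": metadata}
        )

    def deregister(self, name: str, host: str, port: int) -> None:
        """Remove an instance, for example when it shuts down."""
        self._instances[name] = [
            inst for inst in self._instances.get(name, [])
            if (inst["host"], inst["port"]) != (host, port)
        ]

    def instances(self, name: str) -> List[Dict[str, Any]]:
        return list(self._instances.get(name, []))

    def next_instance(self, name: str) -> Dict[str, Any]:
        """Pick the next instance round-robin."""
        candidates = self._instances.get(name, [])
        if not candidates:
            raise LookupError(f"no instances registered for {name!r}")
        cursor = self._cursors.get(name, 0)
        self._cursors[name] = cursor + 1
        return candidates[cursor % len(candidates)]
```

Because instances register and deregister at runtime, callers never hard-code a host; they ask the registry each time.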
Decentralize All Things
Dynamic Configuration
- Dynamic Typed Properties = Feature Flag System
- Allows you to change properties at runtime
- Polling framework
- Multiple sources (e.g., Cassandra and DynamoDB)
- Callbacks when a property changes
CB’s Zuul uses Archaius to change properties across AWS regions, HttpClient configurations, and logging levels
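The polling-plus-callback behavior described above can be sketched in a few lines of Python. This is not Archaius's API; `DynamicProperty`, `ConfigPoller`, and the `log.level` key are hypothetical, and each configuration source is modeled as a plain function returning a dictionary.

```python
from typing import Any, Callable, Dict, List


class DynamicProperty:
    """A value that can change at runtime and notifies listeners when it does."""

    def __init__(self, name: str, default: Any) -> None:
        self.name = name
        self._value = default
        self._callbacks: List[Callable[[Any, Any], None]] = []

    def get(self) -> Any:
        return self._value

    def add_callback(self, callback: Callable[[Any, Any], None]) -> None:
        self._callbacks.append(callback)

    def _update(self, new_value: Any) -> None:
        if new_value == self._value:
            return  # unchanged, so no callbacks fire
        old_value, self._value = self._value, new_value
        for callback in self._callbacks:
            callback(old_value, new_value)


class ConfigPoller:
    """Polls several sources; later sources override earlier ones."""

    def __init__(self, sources: List[Callable[[], Dict[str, Any]]]) -> None:
        self._sources = sources
        self._properties: Dict[str, DynamicProperty] = {}

    def property(self, name: str, default: Any) -> DynamicProperty:
        if name not in self._properties:
            self._properties[name] = DynamicProperty(name, default)
        return self._properties[name]

    def poll_once(self) -> None:
        """One polling cycle; a real system would run this on a timer."""
        merged: Dict[str, Any] = {}
        for source in self._sources:
            merged.update(source())
        for name, prop in self._properties.items():
            if name in merged:
                prop._update(merged[name])
```

A logging-level change, for instance, would register a callback on the `log.level` property that reconfigures the logger whenever a poll picks up a new value.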
Design for Failure
- HTTP library
- Load balancing on the client side
- Built-in retries
- Caching
- Request batching
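Client-side load balancing with built-in retries might look roughly like the Python sketch below. It is an illustration under simplifying assumptions, not the API of Netflix's HTTP library: `LoadBalancedClient`, `AllServersFailed`, and the injected `send` function are hypothetical, and only `ConnectionError` is treated as retryable.

```python
from typing import Callable, List, Sequence


class AllServersFailed(Exception):
    """Raised when every attempt (initial call plus retries) has failed."""


class LoadBalancedClient:
    """Round-robin over a server list, moving to the next server on failure."""

    def __init__(self, servers: Sequence[str],
                 send: Callable[[str, str], str],
                 max_retries: int = 2) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._send = send              # (server, path) -> response body
        self._max_retries = max_retries
        self._next = 0

    def _choose(self) -> str:
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server

    def get(self, path: str) -> str:
        errors: List[str] = []
        for _ in range(self._max_retries + 1):
            server = self._choose()
            try:
                return self._send(server, path)
            except ConnectionError as exc:
                # Remember the failure and retry on the next server.
                errors.append(f"{server}: {exc}")
        raise AllServersFailed("; ".join(errors))
```

The load-balancing decision lives in the caller, so no central balancer sits between services in the middle tier.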
Design for Failure
- Java Resilience library
- Stop cascading failures
- Fallback and gracefully degrade when possible
- Realtime monitoring
- Circuit breaker pattern
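The circuit breaker and fallback ideas above can be sketched generically in Python. This is not Hystrix's API or implementation; the class name `CircuitBreaker`, its parameters, and the simple consecutive-failure threshold are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after repeated failures and serves the fallback while open."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once the timeout has elapsed, let one trial call through (half-open).
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open:
            return fallback()  # fail fast, sparing the struggling dependency
        try:
            result = primary()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            return fallback()  # degrade gracefully instead of propagating
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the breaker is open the failing dependency receives no traffic at all, which is what stops one slow service from dragging down its callers.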
Design for Failure
● No service has a 100% SLA
● 99.99%^30 = 99.7% uptime (30 dependencies, each with 99.99% uptime)
● 0.3% of 1 billion requests = 3,000,000 failures
● 2+ hours of downtime/month even if all dependencies have excellent uptime.
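The arithmetic behind these figures can be checked with a short Python calculation. It reads the uptime bullet as 99.99% raised to the 30th power, and it assumes that the 30 dependencies fail independently and that a month has 30 days; the function name is hypothetical.

```python
def combined_uptime(per_dependency_uptime: float, dependency_count: int) -> float:
    """Uptime of a request that needs every dependency to be up at once."""
    return per_dependency_uptime ** dependency_count


uptime = combined_uptime(0.9999, 30)            # about 0.997, i.e. 99.7%
failure_rate = 1 - uptime                       # about 0.003, i.e. 0.3%
failed_requests = failure_rate * 1_000_000_000  # roughly 3 million per billion
downtime_hours_per_month = failure_rate * 30 * 24  # a little over 2 hours

print(f"combined uptime: {uptime:.4%}")
print(f"failed requests per billion: {failed_requests:,.0f}")
print(f"downtime per month: {downtime_hours_per_month:.2f} hours")
```

Note that 0.3% of one billion requests is on the order of three million failures, which is why a fallback for each dependency matters even when every individual service looks highly available.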
Service1
Service2
Service3 Fallback
Design for Failure
Circuit Breaker pattern
Design for Failure
Hystrix Dashboard
Decentralized Architecture
Demo
May the demo Gods be with us...
[Diagram: Client, /service/jobs, /service/resumes; ports :9292, :9292]
Demo
[Diagram: Client, Zuul, Hystrix, Ribbon, Service, Backup Service, Cache, Fallback; ports :9090, :9292, :9393]
Design for Failure
- Testing resiliency in Production
- Chaos Monkey => Kills instances randomly
- Latency Monkey => Induces latency in services
- Chaos Gorilla => Simulates the outage of an entire AZ (Chaos Kong does the same for a whole region)
- Conformity Monkey => Makes sure instances follow best practices
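The core idea of Chaos Monkey, killing a random instance and checking that the service survives, can be simulated in a few lines of Python. This is a toy model, not the Simian Army code: `Fleet`, `unleash_chaos_monkey`, the instance IDs, and the `min_healthy` threshold are all hypothetical.

```python
import random
from typing import Iterable, Optional, Set


class Fleet:
    """A group of running instances with a minimum healthy size."""

    def __init__(self, instance_ids: Iterable[str], min_healthy: int) -> None:
        self.running: Set[str] = set(instance_ids)
        self.min_healthy = min_healthy

    def terminate(self, instance_id: str) -> None:
        self.running.discard(instance_id)

    def is_healthy(self) -> bool:
        return len(self.running) >= self.min_healthy


def unleash_chaos_monkey(fleet: Fleet, rng: random.Random) -> Optional[str]:
    """Terminate one randomly chosen instance and return its ID (None if empty)."""
    if not fleet.running:
        return None
    victim = rng.choice(sorted(fleet.running))  # sorted for reproducible picks
    fleet.terminate(victim)
    return victim
```

A fleet sized with spare capacity should still report `is_healthy()` after losing one instance; if it does not, the experiment has exposed a resiliency gap before a real failure does.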
Highly Observable
- Hystrix Stream aggregator
- AWS Change Tracker
- AWS Usage Tracker
Take Aways
- Each service must have a fallback strategy by design
- A routing layer is essential to the architecture
- To make services work together, there needs to be a highly reliable infrastructure around the microservices
References
- http://www.slideshare.net/spnewman/practical-microservices-ndc-2014
- http://www.slideshare.net/spnewman/principles-of-microservices-ndc-2014?related=1
- http://netflix.github.io/
- http://martinfowler.com/articles/microservices.html
Questions?
manuel.correa@careerbuilder.com
Hipchat: Manny Correa
