This document discusses using Netflix's open source microservices stack on AWS. It describes Netflix's architecture, in which hundreds of microservices run across multiple regions and handle billions of requests per day. It outlines the principles behind the stack: stateless services, auto scaling, no single points of failure, and designing for failure. It explains key technologies in the stack, such as Eureka for service discovery, Ribbon for load balancing, Hystrix for latency and fault tolerance, RxJava for reactive programming, and Dynomite for distributed caching, and it covers chaos engineering practices such as fault injection testing.
5. Why Netflix?
Netflix
• Billions of Requests Per Day
• 1/3 of US internet bandwidth
• ~10k EC2 Instances
• Multi-Region
• 100s of Microservices
• Innovation + Solid Service
• SOA, Microservices and DevOps Benchmark
My Problem
• Social Product
• Social Network
• Video
• Docs
• Apps
• Chat
• Scalability
• Distributed Teams
• Could reach some Web Scale
8. Principles
• Stateless Services
• Ephemeral Instances
• Everything fails all the time
• Auto Scaling / Down Scaling
• Multi AZ and multi Region
• No SPOF
• Design for Failure (expected)
• SOA
• Microservices
• No Central Database
• NoSQL
• Lightweight Serializable Objects
• Latency tolerant protocols
• DevOps Enabler
  ◦ Immutable Infrastructure
  ◦ Anti-Fragility
32. RxJava
• Reactive Extensions for the JVM
• Async/Event based programming
• Observer Pattern
• Less than 1 MB
• Heavy usage by Netflix OSS Stack
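The observer pattern and async, event-based style listed on this slide can be sketched without the RxJava dependency by using the JDK's own java.util.concurrent.Flow machinery (Java 9+). This is an analogy, not RxJava's API: ObserverSketch and publishAndObserve are hypothetical names, and RxJava itself offers a much richer operator set.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical illustration of publish/observe; NOT the RxJava API.
final class ObserverSketch {

    // Publishes each event asynchronously and returns what the observer saw.
    static List<String> publishAndObserve(List<String> events) {
        List<String> observed = new CopyOnWriteArrayList<>();
        SubmissionPublisher<String> publisher = new SubmissionPublisher<>();

        // The observer: reacts to each event on a worker thread.
        CompletableFuture<Void> done =
                publisher.consume(event -> observed.add(event.toUpperCase()));

        events.forEach(publisher::submit); // emit events
        publisher.close();                 // signal completion to the observer
        done.join();                       // wait until every event was handled
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(publishAndObserve(List.of("play", "pause")));
    }
}
```

The caller only emits events and the observer reacts to them, which is the decoupling the slide attributes to the reactive style.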
33. Archaius
• Configuration Management Solution
• Dynamic and Typed Properties
• High Throughput and Thread Safety
• Callbacks: Notifications of config changes
• JMX Beans
• Dynamic Config Sources: File, DB, DynamoDB, ZooKeeper
• Based on Apache Commons Configuration
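The ideas of a dynamic, typed, thread-safe property with change callbacks can be illustrated with a small standard-library sketch. This is deliberately not the Archaius API: DynamicIntProperty, its methods, and the property name request.timeout.ms are hypothetical, and the polling of a real config source is assumed to happen elsewhere.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical illustration only; NOT the Archaius API.
final class DynamicIntProperty {
    private final String name;
    private final AtomicInteger value; // thread-safe current value
    private final List<IntConsumer> callbacks = new CopyOnWriteArrayList<>();

    DynamicIntProperty(String name, int defaultValue) {
        this.name = name;
        this.value = new AtomicInteger(defaultValue);
    }

    String name() { return name; }

    int get() { return value.get(); }

    void addCallback(IntConsumer callback) { callbacks.add(callback); }

    // Assumed to be called by whatever polls the config source (file, DB, ...).
    // Parsing keeps the property typed; callbacks fire only on a real change.
    void update(String rawValue) {
        int parsed = Integer.parseInt(rawValue.trim());
        int previous = value.getAndSet(parsed);
        if (previous != parsed) {
            for (IntConsumer callback : callbacks) {
                callback.accept(parsed);
            }
        }
    }

    public static void main(String[] args) {
        DynamicIntProperty timeout = new DynamicIntProperty("request.timeout.ms", 1000);
        timeout.addCallback(v -> System.out.println(timeout.name() + " changed to " + v));
        timeout.update("2500"); // prints: request.timeout.ms changed to 2500
        timeout.update("2500"); // same value, so no callback fires
    }
}
```

Reading the value through get() on every use, instead of caching it in a field, is what lets a running service pick up a new setting without a restart.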
34. Archaius + Git
[Diagram: a Central Internal Git repository holding Property Files, and three instances, each labeled Microservice, Slave Side Car, and File System]
39. Dynomite
• Implements Amazon Dynamo
• Similar to Cassandra, Riak and DynamoDB
• Strong Consistency, Quorum-like, No Data Loss
• Pluggable
• Scalable
• Redis / Memcached
• Multi-Clients with Dyno
• Can use most Redis commands
• Integrated with Eureka via Prana
49. Chaos Results and Learnings
• Retry configuration and timeouts in Ribbon
• Right class in Zuul 1.x (default retries only on SocketException)
  ◦ RequestSpecificRetryHandler (HttpClient exceptions)
  ◦ zuul.client.ribbon.MaxAutoRetries=1
  ◦ zuul.client.ribbon.MaxAutoRetriesNextServer=1
  ◦ zuul.client.ribbon.OkToRetryOnAllOperations=true
• Eureka timeouts
• It works
• Everything needs to have redundancy
• ASG is your friend :-)
• Stateless Service FTW
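To make the retry settings on this slide concrete, the sketch below models the retry budget under the usual reading of the Ribbon properties: MaxAutoRetries counts extra tries on the same server, and MaxAutoRetriesNextServer counts extra servers to move on to. RetryBudgetSketch and the server names are hypothetical. It is a simplified model and not Ribbon's implementation, which picks the next server through its load balancer and retries only exceptions the retry handler accepts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified, hypothetical model of the retry budget; NOT Ribbon's code.
final class RetryBudgetSketch {

    // Returns the servers tried, in order, until `call` succeeds
    // or the retry budget is exhausted.
    static List<String> attempt(List<String> servers,
                                int maxAutoRetries,
                                int maxAutoRetriesNextServer,
                                Predicate<String> call) {
        List<String> tried = new ArrayList<>();
        // First server plus up to maxAutoRetriesNextServer further servers.
        int serversToTry = Math.min(servers.size(), maxAutoRetriesNextServer + 1);
        for (int s = 0; s < serversToTry; s++) {
            String server = servers.get(s);
            // First try plus up to maxAutoRetries retries on the same server.
            for (int a = 0; a <= maxAutoRetries; a++) {
                tried.add(server);
                if (call.test(server)) {
                    return tried;
                }
            }
        }
        return tried;
    }

    public static void main(String[] args) {
        List<String> servers = List.of("server-a", "server-b", "server-c");
        // Every call fails, so the whole budget is used.
        System.out.println(attempt(servers, 1, 1, server -> false));
        // prints [server-a, server-a, server-b, server-b]
    }
}
```

With both values set to 1, one request can turn into up to (1 + 1) x (1 + 1) = 4 attempts, so the caller's overall timeout has to leave room for all of them.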
51. Chaos Results and Learnings
• Before:
  ◦ Data was not in Elasticsearch
  ◦ Producers were losing data
• After:
  ◦ No data loss
  ◦ It works
• Changes:
  ◦ No logging on the microservice :( (logging was added)
  ◦ Code that publishes events is now inside a try-catch
  ◦ Retry config in the Kafka producer changed from 0 to 5
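The three changes on this slide (logging, a try-catch around publishing, and producer retries raised to 5) might look roughly like the sketch below. Only the retries key is a real Kafka producer setting; SafePublisherSketch, EventSender, the broker address, and the topic name are hypothetical stand-ins so the example runs without the Kafka client. Logging the failure and returning false is one assumed way to handle it, since the slide does not say what the catch block did.

```java
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the real code would use the Kafka producer client.
final class SafePublisherSketch {
    private static final Logger LOG =
            Logger.getLogger(SafePublisherSketch.class.getName());

    // Stand-in for the Kafka producer's send call.
    interface EventSender {
        void send(String topic, String payload) throws Exception;
    }

    // Producer settings; "retries" is the value raised from 0 to 5.
    static Properties producerConfig() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("retries", "5");
        return props;
    }

    // Publishes an event; a failure is logged instead of being lost silently.
    // Returns true on success, false if sending threw.
    static boolean publish(EventSender sender, String topic, String payload) {
        try {
            sender.send(topic, payload);
            return true;
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "Failed to publish event to topic " + topic, e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("retries=" + producerConfig().getProperty("retries"));
        boolean ok = publish((topic, payload) -> {
            throw new IllegalStateException("broker down");
        }, "events", "{}");
        System.out.println("published=" + ok);
    }
}
```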