A NETFLIX ORIGINAL SERVICE
Peter Bakas | @peter_bakas
@ Netflix : Cloud Platform Engineering - Real Time Data Infrastructure
@ Ooyala : Analytics, Discovery, Platform Engineering & Infrastructure
@ Yahoo : Display Advertising, Behavioral Targeting, Payments
@ PayPal : Site Engineering and Architecture
@ Play : Advisor to Startups (Data, Security, Containers)
Who is this guy?
common data pipeline to collect, transport, aggregate, process and visualize events
Why are we here?
● Architectural design and principles for Keystone
● Technologies that Keystone is leveraging
● Best practices
What should I expect?
Let’s get down to business
Netflix is a logging company
that occasionally streams movies
600+ billion events ingested per day
11 million events per second (24 GB per second) at peak
Hundreds of event types
Over 1.3 petabytes per day
Numbers Galore!
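The figures above can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and treats "600+ billion" as exactly 600 billion, so the results are rough lower bounds rather than numbers stated in the talk.

```python
# Back-of-the-envelope arithmetic on the quoted figures.
# Assumptions: decimal units, and "600+ billion" taken as exactly 600e9.
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 600e9        # "600+ billion events ingested per day"
peak_events_per_sec = 11e6    # "11 million events ... peak"
peak_bytes_per_sec = 24e9     # "24 GB per second"
bytes_per_day = 1.3e15        # "Over 1.3 Petabyte / day"

avg_events_per_sec = events_per_day / SECONDS_PER_DAY
avg_event_bytes_at_peak = peak_bytes_per_sec / peak_events_per_sec
avg_event_bytes_daily = bytes_per_day / events_per_day
peak_to_avg_ratio = peak_events_per_sec / avg_events_per_sec

print(f"average ingest rate: {avg_events_per_sec / 1e6:.2f} M events/s")
print(f"average event size at peak: {avg_event_bytes_at_peak:.0f} bytes")
print(f"average event size over a day: {avg_event_bytes_daily:.0f} bytes")
print(f"peak / average rate: {peak_to_avg_ratio:.2f}x")
```

Under these assumptions the two event-size estimates land close to each other (a little over 2 KB), which suggests the quoted numbers are mutually consistent.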
But wait, there’s more
1+ trillion events processed every day
1 trillion events ingested per day during holiday season
Numbers Galore - Part Deux
How did we get here?
Chukwa
Chukwa/Suro + Real-Time Branch
Keystone
[Architecture diagram: Event Producer, HTTP Proxy, Fronting Kafka, Samza Router, Consumer Kafka, Stream Consumers, EMR, Control Plane]
Kafka Primer
Kafka is a distributed, partitioned, replicated commit log service.
Kafka Terminology
● Producer
● Consumer
● Topic
● Partition
● Broker
Netflix Kafka Producer
● Best effort delivery
● Prefer dropping messages over disrupting the producer app
● Wraps Apache Kafka Producer
● Integration with Netflix ecosystem: Eureka, Atlas, etc.
Producer Impact
● A Kafka outage should not prevent existing instances from serving their purpose
● A Kafka outage should never prevent new instances from starting up
● After the Kafka cluster is restored, event production should resume automatically
Prefer Drop over Block
● Drop when buffer is full
● Handle potential blocking of the first metadata request
● acks=1 (vs. 2)
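The "drop when buffer is full" behaviour can be illustrated with a bounded in-memory buffer. This is a minimal sketch, not the Netflix producer: the class name `DroppingBuffer` and its methods are hypothetical, and a real producer would drain the buffer from a background sender thread.

```python
import queue


class DroppingBuffer:
    """Hypothetical bounded buffer that drops new events instead of blocking."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # counter a metrics system could report

    def offer(self, event):
        """Enqueue without blocking; return False if the event was dropped."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, max_items):
        """Remove and return up to max_items buffered events."""
        batch = []
        while len(batch) < max_items:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


buffer = DroppingBuffer(capacity=2)
results = [buffer.offer(f"event-{i}") for i in range(3)]
print(results, "dropped:", buffer.dropped)
```

The caller never waits on the buffer, so a slow or unavailable cluster costs dropped events rather than application latency.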
Sticky Partitioner
● Batching is important to reduce CPU and network I/O on brokers
● Stick to one partition for a while when producing for non-keyed messages
● “linger.ms” works well with sticky partitioner
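One way to picture the sticky partitioner is a partitioner that keeps returning the same partition for a run of non-keyed messages and then moves on. The sketch below is an illustration only: it sticks for a fixed message count (`stick_for`, a hypothetical parameter) to stay deterministic, whereas the slide only says "for a while", which could equally be time-based.

```python
import random


class StickyPartitioner:
    """Hypothetical partitioner for non-keyed messages."""

    def __init__(self, num_partitions, stick_for, rng=None):
        self.num_partitions = num_partitions
        self.stick_for = stick_for  # messages sent before switching partition
        self._rng = rng or random.Random()
        self._current = self._rng.randrange(num_partitions)
        self._sent = 0

    def _pick_other(self):
        """Pick a random partition different from the current one."""
        if self.num_partitions == 1:
            return self._current
        choice = self._rng.randrange(self.num_partitions - 1)
        return choice if choice < self._current else choice + 1

    def partition(self):
        """Return the partition for the next non-keyed message."""
        if self._sent >= self.stick_for:
            self._current = self._pick_other()
            self._sent = 0
        self._sent += 1
        return self._current


partitioner = StickyPartitioner(num_partitions=8, stick_for=3, rng=random.Random(0))
assignments = [partitioner.partition() for _ in range(9)]
print(assignments)
```

Because consecutive messages share a partition, they can be packed into the same batch, which is the effect the slide attributes to combining stickiness with "linger.ms".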
Producing events to Keystone
● Using Netflix Platform logging API
○ LogManager.logEvent(Annotatable): majority of the cases
○ KeyValueSerializer with ILog#log(String)
● REST endpoint that proxies Platform logging
○ ksproxy
○ Prana sidecar
Injected Event Metadata
● GUID
● Timestamp
● Host
● App
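As a rough illustration of metadata injection, the sketch below wraps a payload with the four fields listed above. The function name, the field names, and the envelope layout are assumptions made for this example; the actual Keystone wire format is not shown in the slides.

```python
import socket
import time
import uuid


def with_metadata(payload, app_name):
    """Wrap a payload with GUID, timestamp, host and app (hypothetical layout)."""
    return {
        "guid": str(uuid.uuid4()),                # unique per event, usable for dedup
        "timestamp_ms": int(time.time() * 1000),  # wall-clock time at injection
        "host": socket.gethostname(),
        "app": app_name,
        "payload": payload,                       # left untouched by the wrapper
    }


event = with_metadata({"type": "play_start"}, app_name="example-app")
print(sorted(event))
```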
Keystone Extensible Wire Protocol
● Invisible to source & sinks
● Backwards and forwards compatibility
● Supports JSON; Avro on the horizon
● Efficient - 10 bytes overhead per message
○ message size - hundreds of bytes to 10MB
Keystone Extensible Wire Protocol
● Packaged as a jar
● Why? Evolve Independently
○ event metadata & traceability metadata
○ event payload serialization
Max message size 10MB
● Keystone drops if > 10MB
○ Immutable event payload
Fronting Kafka Clusters
● Normal-priority (majority)
● High-priority (streaming activities etc.)
Fronting Kafka Instances
● 3 ASGs per cluster, 1 ASG per zone
● 3,000 d2.xlarge AWS instances across 3 regions for regular & failover traffic
Partition Assignment
● All replica assignments zone aware
○ Improved availability
○ Reduce cost of maintenance
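Zone-aware assignment can be sketched as placing each replica of a partition in a different zone. The function below is a simplified illustration with hypothetical broker IDs and zone names; it is not Kafka's built-in assignment algorithm and it ignores leader balancing.

```python
def assign_replicas(brokers_by_zone, num_partitions, replication_factor):
    """Return {partition: [broker, ...]} with every replica in a distinct zone."""
    zones = sorted(brokers_by_zone)
    if replication_factor > len(zones):
        raise ValueError("need at least one zone per replica")
    assignment = {}
    for partition in range(num_partitions):
        replicas = []
        for r in range(replication_factor):
            # Consecutive replicas go to consecutive zones, so zones never repeat.
            zone = zones[(partition + r) % len(zones)]
            brokers = brokers_by_zone[zone]
            # Rotate through the brokers of a zone as partitions wrap around.
            replicas.append(brokers[(partition // len(zones)) % len(brokers)])
        assignment[partition] = replicas
    return assignment


brokers_by_zone = {"zone-a": [1, 2], "zone-b": [3, 4], "zone-c": [5, 6]}
assignment = assign_replicas(brokers_by_zone, num_partitions=6, replication_factor=2)
for partition, replicas in assignment.items():
    print(partition, replicas)
```

With this layout, losing every broker in one zone still leaves at least one replica of each partition, and a whole zone can be taken down for maintenance at once.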
Kafka Fault Tolerance
● Instance failure
○ With replication factor of N, guarantee no data loss with N-1 failures
○ With zone aware replica assignment, guarantee no data loss with multiple instance failures in the same zone
● Sink failure
○ No data loss during retention period
● Replication is the key
○ Data loss can happen if leader dies while follower AND consumer cannot catch up
○ Usually indicated by UncleanLeaderElection metric
Kafka Auditor as a Service
● Broker monitoring
● Consumer monitoring
● Heart-beat & Continuous message latency
● On-demand Broker performance testing
● Built as a service deployable on single or multiple instances
Current Issues
● Using d2.xlarge instances involves a trade-off between cost and performance
● Performance deteriorates as the number of partitions increases
● Replication lag during peak traffic
Routing Service
Routing Infrastructure
[Diagram labels: Broker; Checkpointing Cluster; 0.9.1; Router Job Manager (Control Plane); Zookeeper (Instance Id assignment); EC2 Instances; ASG; ksnode; Job (x3); Reconcile every min.]
Routing Layer
● Total of 13,000 containers on 1,300 AWS c3.4xlarge instances
○ S3 sink: ~7000 Containers
○ Consumer Kafka sink: ~ 4500 Containers
○ Elasticsearch sink: ~1500 Containers
Routing Layer
● Total of ~1400 streams across all regions
○ ~1000 S3 streams
○ ~250 Consumer Kafka streams
○ ~150 Elasticsearch streams
Router Job Details
● One Job per sink and Kafka source topic
○ A separate Job each for the S3, Elasticsearch & Kafka sinks
○ Provides better isolation & better QoS
● Messages are batched into requests to sinks
● Offset checkpointed after batch request succeeds
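The batch-then-checkpoint ordering above is what yields at-least-once delivery. The sketch below is a toy model, not the Samza router: `route`, `flaky_sink`, and the in-memory message list are hypothetical stand-ins for the job loop, a sink client, and a Kafka partition.

```python
def route(messages, start_offset, batch_size, send_batch, checkpoint):
    """Send batches to a sink; checkpoint an offset only after its batch succeeds."""
    offset = start_offset
    while offset < len(messages):
        batch = messages[offset:offset + batch_size]
        if not send_batch(batch):
            break  # checkpoint stays put, so this batch is retried on restart
        offset += len(batch)
        checkpoint(offset)
    return offset


delivered = []
checkpoints = []
calls = {"count": 0}


def flaky_sink(batch):
    """Hypothetical sink whose second request fails after the write has landed."""
    calls["count"] += 1
    delivered.extend(batch)
    return calls["count"] != 2


messages = [f"m{i}" for i in range(6)]
resume_at = route(messages, 0, 2, flaky_sink, checkpoints.append)
# Simulated restart: resume from the last checkpointed offset.
resume_at = route(messages, resume_at, 2, flaky_sink, checkpoints.append)
print(delivered)
print(checkpoints)
```

Nothing is lost across the simulated failure, but the batch whose acknowledgement was lost is delivered twice, which is why duplicates have to be handled at the sink.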
Processing Semantics
Data Loss & Duplicates
Backpressure
Producer ⇐ Kafka Cluster ⇐ Samza job router ⇐ Sink
● Keystone - at least once
Data Loss - Producer
● buffer full
● network error
● partition leader change
● partition migration
Data Loss - Kafka
● Lose all Kafka replicas of data
○ Safeguards:
■ AZ isolation / broker replacement automation
■ Alerts and monitoring
● Unclean partition leader election
○ acks=1 could cause loss
Data Loss - Router
● Checkpointed offset is lost and the router is down for the duration of the retention period
● Messages are not processed before the retention period (8h / 24h) expires
● Unclean leader election causes the offset to go backward
● Safeguard
○ alerts for lag > 0.1% of traffic for 10 minutes
● Concerned only if unable to launch router instances
Duplicates Router - Sink
● Duplicates possible
○ messages reprocessed - retry after batch S3 upload failure
○ Loss of checkpointed offset (message processed marker)
○ Event GUID helps dedup
Measure Duplicates
● Difference between producer sent count and Kafka messages received
● Router checkpointed offset monitored over time
Note: GUID can be used to dedup at the sink
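A sink-side dedup keyed on the event GUID might look like the sketch below. The `guid` field name is an assumption, and the unbounded in-memory set is only suitable for illustration; a real sink would need to bound the window of GUIDs it remembers.

```python
def dedup_by_guid(events):
    """Keep the first occurrence of each GUID, preserving arrival order."""
    seen = set()
    unique = []
    for event in events:
        guid = event["guid"]  # hypothetical field name for the injected GUID
        if guid in seen:
            continue
        seen.add(guid)
        unique.append(event)
    return unique


events = [
    {"guid": "a", "payload": 1},
    {"guid": "b", "payload": 2},
    {"guid": "a", "payload": 1},  # duplicate from a retried batch
]
unique_events = dedup_by_guid(events)
print(unique_events)
```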
End to End metrics
● Producer to Router to Sink Average Latencies
○ Batch processing [S3 sink]: ~3 sec
○ Stream processing [Consumer Kafka sink]: ~1 sec
○ Log analysis [Elasticsearch]: ~400 seconds (with back pressure)
End to End metrics
● End to End latencies
○ S3:
■ 50th percentile under 1 second
■ 80th percentile under 8 seconds
○ Consumer Kafka:
■ 50th percentile under 800 ms
■ 80th percentile under 4 seconds
○ Elasticsearch:
■ 50th percentile under 13 seconds
■ 80th percentile under 53 seconds
Alerts
● Producer drop rate over 1%
● Consumer lag > 0.1%
● Next offset after Checkpointed offset not found
● Consumer stuck on partition level
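The first two alert rules can be expressed as simple threshold checks. The sketch below is an interpretation rather than the actual alert definitions: it assumes drop rate means dropped / (sent + dropped), that lag is measured relative to traffic in the same one-minute window, and that the lag alert must hold for 10 consecutive minutes as described on the router data loss slide.

```python
def producer_drop_alert(sent, dropped, threshold=0.01):
    """True when the share of dropped events exceeds the threshold (1%)."""
    attempted = sent + dropped
    return attempted > 0 and dropped / attempted > threshold


def lag_alert(samples, threshold=0.001, sustained_minutes=10):
    """True when lag exceeds 0.1% of traffic in each of the last N minutes.

    samples: one (lag_messages, traffic_messages) pair per minute, oldest first.
    """
    if len(samples) < sustained_minutes:
        return False
    recent = samples[-sustained_minutes:]
    return all(traffic > 0 and lag / traffic > threshold for lag, traffic in recent)


print(producer_drop_alert(sent=980, dropped=20))
print(lag_alert([(2, 1000)] * 10))
```

Requiring the condition to hold for the whole window keeps a short spike in lag from paging anyone.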
Keystone Dashboard
And Then???
There’s more in the pipeline...
● Self service tools
● Better management of scaling Kafka
● More capable control plane
● JSON Support exists, support for Avro on the horizon
● Multi-tenant Messaging as a Service - MaaS
● Multi-tenant Stream Processing as a Service - SPaaS
???s

The role of big data in decision making.
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
 

Netflix Keystone - How Netflix Handles Data Streams up to 11M Events/Sec

  • 2. Peter Bakas | @peter_bakas @ Netflix : Cloud Platform Engineering - Real Time Data Infrastructure @ Ooyala : Analytics, Discovery, Platform Engineering & Infrastructure @ Yahoo : Display Advertising, Behavioral Targeting, Payments @ PayPal : Site Engineering and Architecture @ Play : Advisor to Startups (Data, Security, Containers) Who is this guy?
  • 3. common data pipeline to collect, transport, aggregate, process and visualize events Why are we here?
  • 4. ● Architectural design and principles for Keystone ● Technologies that Keystone is leveraging ● Best practices What should I expect?
  • 5. Let’s get down to business
  • 6. Netflix is a logging company
  • 8. 600+ billion events ingested per day 11 million events (24 GB per second) peak Hundreds of event types Over 1.3 Petabyte / day Numbers Galore!
  • 10. 1+ trillion events processed every day 1 trillion events ingested per day during holiday season Numbers Galore - Part Deux
  • 11. How did we get here?
  • 15. Kafka Primer Kafka is a distributed, partitioned, replicated commit log service.
  • 16. Kafka Terminology ● Producer ● Consumer ● Topic ● Partition ● Broker
  • 18. Netflix Kafka Producer ● Best effort delivery ● Prefer dropping messages to disrupting the producer app ● Wraps Apache Kafka Producer ● Integration with Netflix ecosystem: Eureka, Atlas, etc.
  • 19. Producer Impact ● A Kafka outage does not disrupt existing instances from serving their purpose ● A Kafka outage should never prevent new instances from starting up ● After the Kafka cluster is restored, event producing should resume automatically
  • 20. Prefer Drop than Block ● Drop when buffer is full ● Handle potential blocking of first metadata request ● ack=1 (vs 2)
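The "drop when buffer is full" rule can be illustrated with a small bounded buffer. This is only a sketch of the idea, not the Netflix producer wrapper: the class name `DroppingBuffer` and its methods are hypothetical, and a real producer would drain the buffer from a background sender thread.

```python
import queue


class DroppingBuffer:
    """Bounded in-memory buffer that drops events instead of blocking the caller.

    Hypothetical illustration; names are not taken from the Keystone code base.
    """

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # exposed so a drop-rate metric can be derived from it

    def offer(self, event):
        """Enqueue without blocking; return False and count a drop when full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, max_items):
        """Remove up to max_items events, e.g. for a sender thread to ship."""
        batch = []
        while len(batch) < max_items:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch
```

Because `offer` never waits, an unreachable Kafka cluster costs the application some dropped events rather than stalled request threads.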
  • 21. Sticky Partitioner ● Batching is important to reduce CPU and network I/O on brokers ● Stick to one partition for a while when producing for non-keyed messages ● “linger.ms” works well with sticky partitioner
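A minimal sketch of the sticky idea, assuming a time-based switch: non-keyed messages keep going to one randomly chosen partition until a deadline passes, so batches for that partition fill up. The class name, the `stick_seconds` parameter, and the CRC32 hashing of keyed messages are assumptions for illustration, not the partitioner described in the talk.

```python
import random
import time
import zlib


class StickyPartitioner:
    """Pick one partition for non-keyed messages and keep it for a while.

    Hypothetical illustration; the switching policy here is purely time based.
    """

    def __init__(self, num_partitions, stick_seconds, clock=time.monotonic):
        self._num_partitions = num_partitions
        self._stick_seconds = stick_seconds
        self._clock = clock
        self._current = random.randrange(num_partitions)
        self._switch_at = clock() + stick_seconds

    def partition(self, key=None):
        if key is not None:
            # Keyed messages (bytes) always map to the same partition.
            return zlib.crc32(key) % self._num_partitions
        now = self._clock()
        if now >= self._switch_at:
            # Deadline passed: pick a new partition (possibly the same one).
            self._current = random.randrange(self._num_partitions)
            self._switch_at = now + self._stick_seconds
        return self._current
```

Sticking to one partition means fewer, larger batches per broker request, which is where the CPU and network savings on the brokers come from.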
  • 22. Producing events to Keystone ● Using Netflix Platform logging API ○ LogManager.logEvent(Annotatable): majority of the cases ○ KeyValueSerializer with ILog#log(String) ● REST endpoint that proxies Platform logging ○ ksproxy ○ Prana sidecar
  • 23. Injected Event Metadata ● GUID ● Timestamp ● Host ● App
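As a rough illustration of metadata injection, the sketch below wraps a payload with the four fields listed on the slide. The function name and the dictionary field names (`guid`, `timestamp_ms`, `host`, `app`, `payload`) are assumptions; the actual envelope used by Keystone is not shown in the slides.

```python
import socket
import time
import uuid


def inject_metadata(payload, app):
    """Wrap an event payload with GUID, timestamp, host and app metadata.

    Field names are hypothetical and chosen only for this example.
    """
    return {
        "guid": str(uuid.uuid4()),              # unique id, usable for dedup later
        "timestamp_ms": int(time.time() * 1000),
        "host": socket.gethostname(),
        "app": app,
        "payload": payload,                     # the payload itself is left untouched
    }
```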
  • 24. Keystone Extensible Wire Protocol ● Invisible to source & sinks ● Backwards and forwards compatibility ● Supports JSON; Avro on the horizon ● Efficient - 10 bytes overhead per message ○ message size - hundreds of bytes to 10MB
  • 25. Keystone Extensible Wire Protocol ● Packaged as a jar ● Why? Evolve Independently ○ event metadata & traceability metadata ○ event payload serialization
  • 26. Max message size 10MB ● Keystone drops if > 10MB ○ Immutable event payload
  • 29. Fronting Kafka Clusters ● Normal-priority (majority) ● High-priority (streaming activities etc.)
  • 30. Fronting Kafka Instances ● 3 ASGs per cluster, 1 ASG per zone ● 3000 d2.xl AWS instances across 3 regions for regular & failover traffic
  • 31. Partition Assignment ● All replica assignments zone aware ○ Improved availability ○ Reduce cost of maintenance
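One way to picture zone-aware assignment is a round-robin over zones so that no two replicas of a partition share a zone. This is a simplified sketch under that assumption; the function name and input shape are hypothetical, and it is not the assignment logic used by Kafka or by Keystone.

```python
def assign_replicas(brokers_by_zone, num_partitions, replication_factor):
    """Assign each partition's replicas to brokers in distinct zones.

    brokers_by_zone maps a zone name to a list of broker ids.
    Hypothetical illustration only.
    """
    zones = sorted(brokers_by_zone)
    if replication_factor > len(zones):
        raise ValueError("need at least one zone per replica")
    assignment = {}
    for partition in range(num_partitions):
        replicas = []
        for r in range(replication_factor):
            # Rotate the starting zone per partition so leaders spread out.
            zone = zones[(partition + r) % len(zones)]
            brokers = brokers_by_zone[zone]
            # Step through the brokers inside a zone as partitions wrap around.
            replicas.append(brokers[(partition // len(zones)) % len(brokers)])
        assignment[partition] = replicas
    return assignment
```

With three zones and a replication factor of three, losing every broker in one zone still leaves two replicas of each partition.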
  • 32. Kafka Fault Tolerance ● Instance failure ○ With replication factor of N, guarantee no data loss with N-1 failures ○ With zone aware replica assignment, guarantee no data loss with multiple instance failures in the same zone ● Sink failure ○ No data loss during retention period ● Replication is the key ○ Data loss can happen if leader dies while follower AND consumer cannot catch up ○ Usually indicated by UncleanLeaderElection metric
  • 33. Kafka Auditor as a Service ● Broker monitoring ● Consumer monitoring ● Heart-beat & Continuous message latency ● On-demand Broker performance testing ● Built as a service deployable on single or multiple instances
  • 34. Current Issues ● Using d2.xl instances involves a trade-off between cost and performance ● Performance deteriorates as the number of partitions increases ● Replication lag during peak traffic
  • 40. Router Job Manager (Control Plane) EC2 Instances Zookeeper (Instance Id assignment) Job Job Job ksnode Checkpointing Cluster ASG Reconcile every min.
  • 41. Routing Layer ● Total of 13,000 containers on 1,300 AWS C3-4XL instances ○ S3 sink: ~7000 Containers ○ Consumer Kafka sink: ~ 4500 Containers ○ Elasticsearch sink: ~1500 Containers
  • 42. Routing Layer ● Total of ~1400 streams across all regions ○ ~1000 S3 streams ○ ~250 Consumer Kafka streams ○ ~150 Elasticsearch streams
  • 43. Router Job Details ● One Job per sink and Kafka source topic ○ Separate Job each for S3, Elasticsearch & Kafka sink ○ Provides better isolation & better QoS ● Messages are sent to sinks in batched requests ● Offset checkpointed after batch request succeeds
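The checkpoint-after-success ordering is what makes the router at-least-once. The loop below is a sketch of that ordering over an in-memory list standing in for a Kafka partition; `run_router`, `write_batch` and `save_checkpoint` are hypothetical names, and the real router is a Samza job.

```python
def run_router(log, start_offset, batch_size, write_batch, save_checkpoint):
    """Send batches to a sink and checkpoint only after each batch succeeds.

    log is a list standing in for a partition; write_batch may raise.
    Hypothetical illustration only.
    """
    offset = start_offset
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        # If this raises, the checkpoint is not advanced, so a restart
        # re-reads the same batch: no loss, but duplicates are possible.
        write_batch(batch)
        offset += len(batch)
        save_checkpoint(offset)
    return offset
```

If the sink accepts a batch but the call still fails (for example, a timeout after the write), restarting from the last checkpoint delivers that batch a second time.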
  • 45. Backpressure Producer ⇐ Kafka Cluster ⇐ Samza job router ⇐ Sink ● Keystone - at least once
  • 46. Data Loss - Producer ● buffer full ● network error ● partition leader change ● partition migration
  • 47. Data Loss - Kafka ● Lose all Kafka replicas of data ○ Safeguards: ■ AZ isolation / Alerts / Broker replacement automation ■ Alerts and monitoring ● Unclean partition leader election ○ ack = 1 could cause loss
  • 48. Data Loss - Router ● Checkpointed offset is lost & the router stays down for the whole retention period ● Messages are not processed before the retention period (8h / 24h) expires ● Unclean leader election causes the offset to go back ● Safeguard ○ Alerts for lag > 0.1% of traffic for 10 minutes ● Concerned only if unable to launch router instances
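The lag safeguard can be read as a sustained-threshold check. The sketch below assumes one `(lag, traffic)` sample per minute and fires only when the ratio stays above 0.1% for 10 consecutive samples; the sampling interval, the function name and the exact definition of "traffic" are assumptions, not details given in the slides.

```python
def lag_alert(samples, ratio_threshold=0.001, sustained_samples=10):
    """Return True if lag/traffic exceeds the threshold for N samples in a row.

    samples is an iterable of (lag_messages, traffic_messages) pairs,
    assumed here to be taken once per minute. Hypothetical illustration.
    """
    consecutive = 0
    for lag, traffic in samples:
        if traffic > 0 and lag / traffic > ratio_threshold:
            consecutive += 1
            if consecutive >= sustained_samples:
                return True
        else:
            # Any healthy sample resets the window.
            consecutive = 0
    return False
```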
  • 49. Duplicates Router - Sink ● Duplicates possible ○ messages reprocessed - retry after batch S3 upload failure ○ Loss of checkpointed offset (message processed marker) ○ Event GUID helps dedup
  • 50. Measure Duplicates ● Difference between producer sent count and Kafka messages received ● Router checkpointed offset monitored over time Note: GUID can be used to dedup at the sink
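A sketch of GUID-based deduplication at a sink, assuming events carry a `guid` field and that remembering a bounded window of recent GUIDs is acceptable. The class name, the field name and the eviction policy are assumptions; the slides only state that the GUID can be used for this purpose.

```python
from collections import OrderedDict


class GuidDeduplicator:
    """Remember recently seen GUIDs and report whether an event is new.

    Hypothetical illustration; duplicates older than the window slip through.
    """

    def __init__(self, max_tracked):
        self._seen = OrderedDict()
        self._max_tracked = max_tracked

    def is_new(self, guid):
        if guid in self._seen:
            self._seen.move_to_end(guid)  # keep recently repeated GUIDs longer
            return False
        self._seen[guid] = None
        if len(self._seen) > self._max_tracked:
            self._seen.popitem(last=False)  # evict the oldest GUID
        return True


def dedup(events, deduplicator):
    """Keep only the first occurrence of each GUID, preserving order."""
    return [e for e in events if deduplicator.is_new(e["guid"])]
```

The window size trades memory for coverage: it needs to span at least the largest replay the router can produce after a lost checkpoint.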
  • 52. End to End metrics ● Producer to Router to Sink Average Latencies ○ Batch processing [S3 sink]: ~3 sec ○ Stream processing [Consumer Kafka sink]: ~1 sec ○ Log analysis [Elasticsearch]: ~400 seconds (with back pressure)
  • 53. End to End metrics ● End to End latencies ○ S3: ■ 50 percentile under 1 sec ■ 80 percentile under 8 seconds ○ Consumer Kafka: ■ 50 percentile under 800 ms ■ 80 percentile under 4 seconds ○ Elasticsearch: ■ 50 percentile under 13 seconds ■ 80 percentile under 53 seconds
  • 54. Alerts ● Producer drop rate over 1% ● Consumer lag > 0.1% ● Next offset after Checkpointed offset not found ● Consumer stuck on partition level
  • 60. There’s more in the pipeline... ● Self service tools ● Better management of scaling Kafka ● More capable control plane ● JSON Support exists, support for Avro on the horizon ● Multi-tenant Messaging as a Service - MaaS ● Multi-tenant Stream Processing as a Service - SPaaS
  • 61. ???s