©2016 LinkedIn Corporation. All Rights Reserved.
Kafka 0.9, Things you should know
Ratish Ravindran
Site Reliability Engineer
LinkedIn, Data Infrastructure Streaming

Agenda
• Security
• Kafka Connect
• User defined quota
• New consumer
• Notable improvements and fixes
• Upgrading from Kafka 0.8
• Kafka 0.10 - highlights

Security
• Why?
  • Multitenant cluster
  • Multiple clusters
  • Network ACLs

Security
• Authentication
  • Kerberos
  • TLS
• Authorization: Unix-like permissions
  • "Principal P is [Allowed/Denied] Operation O From Host H On Resource R"
  bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add \
      --allow-principal User:Bob --allow-principal User:Alice \
      --allow-host 198.51.100.0 --allow-host 198.51.100.1 \
      --operation Read --operation Write --topic Test-topic
• Encryption
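The "Principal P is [Allowed/Denied] Operation O From Host H On Resource R" rule can be read as a lookup over a list of ACL entries. The following is a minimal sketch of that evaluation, not Kafka's authorizer code. The `Acl` class and both functions are hypothetical names made up for illustration, and two behaviours are assumptions about the default authorizer: a matching Deny overrides any Allow, and a request with no matching ACL is refused. Wildcard principals and super users are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    """One ACL entry (hypothetical model, not a Kafka class)."""
    principal: str   # e.g. "User:Bob"
    permission: str  # "Allow" or "Deny"
    operation: str   # e.g. "Read", "Write"
    host: str        # client IP, or "*" for any host
    resource: str    # e.g. "Topic:Test-topic"


def matches(acl, principal, operation, host, resource):
    """True if the ACL entry applies to this request."""
    return (acl.resource == resource
            and acl.principal == principal
            and acl.operation == operation
            and acl.host in ("*", host))


def is_authorized(acls, principal, operation, host, resource):
    """Assumed semantics: Deny wins over Allow; no match means refused."""
    hits = [a for a in acls if matches(a, principal, operation, host, resource)]
    if any(a.permission == "Deny" for a in hits):
        return False
    return any(a.permission == "Allow" for a in hits)


# The entries the kafka-acls.sh command above asks for:
# 2 principals x 2 hosts x 2 operations on one topic.
ACLS = [Acl(p, "Allow", op, h, "Topic:Test-topic")
        for p in ("User:Bob", "User:Alice")
        for h in ("198.51.100.0", "198.51.100.1")
        for op in ("Read", "Write")]
```

With these entries, `is_authorized(ACLS, "User:Bob", "Read", "198.51.100.0", "Topic:Test-topic")` is true, while the same request from any other host or principal is refused.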

Kafka Connect
• Why?
  • Multiple tools for importing and exporting
  • High engineering and operational overhead
  • Some tools are a poor fit for the job

Kafka Connect
[Diagram: Kafka in the centre, surrounded by data sources A, B and C and data sinks 1, 2 and 3, with six connections labelled T1 to T6.]

Kafka Connect
[Diagram: Kafka in the centre, surrounded by data sources A, B and C and data sinks 1, 2 and 3, attached through two blocks labelled KafkaConnect.]

Kafka Connect
Key properties:
• Broad copying by default
• Streaming and batch
• Scales to application
• Focus on copying data only
• Parallel
• Connector API

Kafka Connect
Advantages:
• Fault tolerance
• Partitioning
• Offset management
• Delivery semantics
• Operations
• Monitoring

User defined quota
• Why?
  • High reads
  • High writes
  • SLAs

User defined quota
• Single large cluster
• Producer side (quota.producer.default)
• Consumer side (quota.consumer.default)
• Per client, per broker
• Quota override
  ./bin/kafka-configs.sh --alter \
      --add-config 'producer_byte_rate=1048576,consumer_byte_rate=1048576' \
      --entity-type clients \
      --entity-name TestTopic \
      --zookeeper localhost:2181
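A rough sketch of what a byte-rate quota means for a client that exceeds it. It assumes the broker slows a violating client by delaying its responses instead of rejecting requests, and that the delay X is chosen so that the observed rate O falls back to the quota T over a measurement window W, i.e. O * W / (W + X) = T, which gives X = (O - T) / T * W. The function name and the 1000 ms window are illustrative choices, not Kafka settings.

```python
def throttle_delay_ms(observed_rate, quota, window_ms):
    """Delay needed to bring observed_rate (bytes/s) back down to quota.

    Solves observed_rate * window / (window + delay) = quota for delay.
    Returns 0 when the client is within its quota.
    """
    if observed_rate <= quota:
        return 0.0
    return (observed_rate - quota) / quota * window_ms


QUOTA = 1048576  # bytes/s, the value used in the override above

# A client sending at twice its quota over a 1000 ms window
delay = throttle_delay_ms(2 * QUOTA, QUOTA, 1000)
```

Under these assumptions a client running at twice its quota is held back for one full window, and a client at 1.5 times its quota for half a window.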

New consumer
Motivation:
• Thin consumer client
• Central co-ordination
• Allow manual partition assignment
• Allow manual offset management
• Invocation of user-specified callback on rebalance
• Non-blocking consumer APIs

New consumer
Features:
• Group management protocol
• Consumer

New consumer
[State diagram: consumer]

New consumer
Features:
• Group management protocol
• Consumer
• Co-ordinator

New consumer
[State diagram: co-ordinator]

New consumer
Features:
• Group management protocol
• Consumer
• Co-ordinator
• Failure detection protocol
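The slides only name the failure detection protocol. As a toy illustration of the general idea, the sketch below assumes that consumers send periodic heartbeats to the co-ordinator, that a member silent for longer than a session timeout is dropped from the group, and that every membership change starts a new group generation (a rebalance). The class, its methods, and the timings are hypothetical and do not mirror the broker's implementation.

```python
class GroupCoordinator:
    """Toy co-ordinator: tracks heartbeats and evicts silent members."""

    def __init__(self, session_timeout_ms):
        self.session_timeout_ms = session_timeout_ms
        self.last_heartbeat = {}  # member id -> time of last heartbeat (ms)
        self.generation = 0       # bumped on every membership change

    def join(self, member_id, now_ms):
        """Add a member to the group; this starts a new generation."""
        self.last_heartbeat[member_id] = now_ms
        self.generation += 1

    def heartbeat(self, member_id, now_ms):
        """Record a heartbeat; False tells an evicted member to rejoin."""
        if member_id not in self.last_heartbeat:
            return False
        self.last_heartbeat[member_id] = now_ms
        return True

    def expire(self, now_ms):
        """Evict members whose last heartbeat is older than the timeout."""
        dead = sorted(m for m, t in self.last_heartbeat.items()
                      if now_ms - t > self.session_timeout_ms)
        for m in dead:
            del self.last_heartbeat[m]
        if dead:
            self.generation += 1  # surviving members must rebalance
        return dead


coordinator = GroupCoordinator(session_timeout_ms=30000)
coordinator.join("c1", now_ms=0)
coordinator.join("c2", now_ms=0)
coordinator.heartbeat("c1", now_ms=20000)   # c2 stays silent
evicted = coordinator.expire(now_ms=35000)
```

Here `c2` is evicted because it has been silent for longer than the 30-second timeout, and the generation moves on so that `c1` takes part in a rebalance.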

New consumer
Interesting scenarios:
• Co-ordinator failover/connection loss
• Partition changes for subscribed topics
• Offset commit during rebalance
• Heartbeats during rebalance
• Slow consumers

Notable improvements and fixes
• Automated replica lag tuning (replica.lag.time.max.ms)
• New purgatory design with low memory overhead
• Auto-assigned node IDs
• No data loss in MirrorMaker on unclean shutdown
• Log compaction for compressed topics
• Handling of corrupt index files

Upgrading from Kafka 0.8
1. Set inter.broker.protocol.version=0.8.2.x in server.properties on every broker
2. Update the code and restart the brokers one at a time
3. Once the whole cluster runs the 0.9 code, set inter.broker.protocol.version=0.9.0.0
4. Restart the brokers again, one at a time

Upgrading from Kafka 0.8
Potential breaking changes:
• Java 1.6 and Scala 2.9 are no longer supported
• Broker IDs above 1000 are reserved for auto-assigned IDs by default (see reserved.broker.max.id and broker.id.generation.enable)
• replica.lag.max.messages has been removed
  • Replica lag is now governed by replica.lag.time.max.ms alone
• Compacted topics no longer accept messages without a key

Upgrading from Kafka 0.8
Potential breaking changes (contd.):
• Changes in default JVM options

Why not Kafka 0.10?

Kafka 0.10 - Highlights
• Kafka Streams
• Rack awareness
• Timestamps in messages
• SASL improvements
• Cap on records returned per consumer poll (max.poll.records)
• Protocol version improvements

References
• http://kafka.apache.org/090/documentation.html
• http://www.confluent.io/blog/apache-kafka-0.9-is-released
• http://www.confluent.io/blog/announcing-apache-kafka-0.10-and-confluent-platform-3.0
• https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design