Kafka
Streaming Data Platform
Traditional Messaging System
• Queue
• Topic
• Messages removed once consumed
• Out of order messaging
What is Kafka
• Messaging system
• Polyglot Consumers / Producers
• Topics and Partitions
• Scalable
• Configurable Message Retention
• Guaranteed order within a partition
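To make the contrast with a traditional queue concrete, here is a minimal in-memory sketch of a partitioned topic. It is a toy model, not Kafka's implementation: the class name TinyTopic is hypothetical, and crc32 stands in for the hash a real client would use. It only illustrates that messages with the same key land in one partition in order, and that reading by offset does not remove anything.

```python
import zlib


class TinyTopic:
    """Toy in-memory model of a partitioned topic (illustration only)."""

    def __init__(self, num_partitions):
        # Each partition is an append-only list; the list index is the offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key, value):
        # The same key always hashes to the same partition,
        # so order is preserved per key.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append(value)
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def read(self, partition, offset):
        # Reading never deletes data; a consumer just remembers its offset.
        return self.partitions[partition][offset:]


if __name__ == "__main__":
    topic = TinyTopic(num_partitions=3)
    partition, _ = topic.send("user-1", "login")
    topic.send("user-1", "click")
    print(topic.read(partition, 0))  # both messages, in send order
    print(topic.read(partition, 0))  # still there on a second read
```

Because consumers only track an offset, several independent consumers can read the same partition at their own pace.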
Use Cases
• Ordered Messaging
• Log Aggregation
• Metrics
• Web Activity Tracking
• Stream Processing
Kafka Brokers – Clusters and Replication
• Topics can be replicated
• Data stored across various nodes
• Each broker in a cluster needs a unique broker.id (for example, broker.id=0)
• Zookeeper: offsets, topic names, partitions
Demo – Local Kafka
• Start Zookeeper
• bin/zookeeper-server-start.sh config/zookeeper.properties
• Start Kafka
• bin/kafka-server-start.sh config/server.properties
Demo Command line tools
• bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
• bin/kafka-topics.sh --list --zookeeper localhost:2181
• bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
• bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Example Producer
• <CODE>
Example Consumer
• <CODE>
Deployment Options
• Standalone deployment
• Confluent.io
• Hortonworks
• AWS
Hortonworks Data Platform on AWS
Big data in a one-stop shop
Determine Cluster Sizing
• Implement a producer and consumer
• Use your data structures
• 3 Zookeeper nodes and 3 Kafka nodes
• Java Heap = 2GB
• Network Saturation (1 gigabit / 10 gigabit)
• Avro Data Serialization
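The network saturation point can be estimated with simple arithmetic before any load test. The sketch below is a rough upper bound under stated assumptions: the function name is hypothetical, the link is assumed to carry only message payloads, and protocol overhead, replication traffic, and compression are ignored.

```python
def max_messages_per_second(link_gigabits, message_bytes):
    """Upper bound on messages per second a link can carry.

    Assumes the whole link is available for payload bytes only.
    """
    bits_per_second = link_gigabits * 1_000_000_000
    bytes_per_second = bits_per_second // 8
    return bytes_per_second // message_bytes


if __name__ == "__main__":
    for gigabits in (1, 10):
        rate = max_messages_per_second(gigabits, message_bytes=1000)
        print(f"{gigabits} gigabit link, 1000-byte messages: {rate} msg/s")
```

Real throughput will be lower than this bound, which is why measuring with your own data structures is still needed.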
Producer for testing throughput
• <CODE>
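The original code for this slide is not available. As a stand-in, here is a minimal timing harness, not the author's producer: measure_throughput is a hypothetical helper that takes any send callable, so a real Kafka client's send method could be plugged in where the example uses a plain list.

```python
import time


def measure_throughput(send, payload, count):
    """Call send(payload) count times; return (messages/s, megabytes/s)."""
    start = time.perf_counter()
    for _ in range(count):
        send(payload)
    # Guard against a zero reading on very fast runs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    messages_per_second = count / elapsed
    megabytes_per_second = count * len(payload) / elapsed / 1_000_000
    return messages_per_second, megabytes_per_second


if __name__ == "__main__":
    sink = []  # stand-in for a real producer's send call
    payload = b"x" * 1024  # replace with a serialized record of your own
    rate, mb = measure_throughput(sink.append, payload, count=10_000)
    print(f"{rate:.0f} msg/s, {mb:.1f} MB/s")
```

With an asynchronous client, the timing should also include flushing outstanding sends, otherwise the result overstates throughput.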
Architectural Possibilities
• Streaming data platform
• Common interface
• High throughput
WARNING
• Kafka 0.8.x has a major bug…deletes data
• Make sure to use 0.9.0.x
Question & Answer
bryancjacobs@gmail.com

kafka-steaming-data