www.edureka.co/apache-Kafka
How Apache Kafka is transforming
Hadoop, Spark & Storm
What will you learn today?
 Million-dollar question: why do we need Kafka?
 What is Kafka?
 Kafka Architecture
 Kafka with Hadoop
 Kafka with Spark
 Kafka with Storm
 Companies using Kafka
 Demo on Kafka Messaging Service
Million-Dollar Question: Why Do We Need Kafka?
Why Kafka Cluster?
Why Kafka is preferred over more traditional messaging technologies such as JMS and AMQP
Kafka Producer Performance with Other Systems
Kafka Consumer Performance with Other Systems
Salient Features of Kafka
Feature | Description
High Throughput | Support for millions of messages with modest hardware
Scalability | Highly scalable distributed system that can be expanded with no downtime
Replication | Messages can be replicated across the cluster, which supports multiple subscribers and rebalances consumers in case of failure
Durability | Messages are persisted to disk, where they can also be used for batch consumption
Stream Processing | Kafka can be used with real-time streaming applications such as Spark and Storm
Data Loss | With the proper configuration, Kafka can ensure zero data loss
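As a rough illustration of what "proper configuration" can mean for data loss, the sketch below pairs the producer setting `acks=all` with the topic setting `min.insync.replicas`. The dictionary layout, the `replication_factor` key and the `write_is_acknowledged` helper are assumptions made up for this example; it is a toy model, not Kafka client code.

```python
# Settings often combined when durability matters more than latency.
# "acks" and "min.insync.replicas" are Kafka setting names; the
# "replication_factor" key and the helper below are made up for this sketch.
producer_settings = {"acks": "all"}
topic_settings = {"replication_factor": 3, "min.insync.replicas": 2}


def write_is_acknowledged(in_sync_replicas: int) -> bool:
    """With acks=all, a write succeeds only if enough replicas are in sync."""
    return in_sync_replicas >= topic_settings["min.insync.replicas"]


print(write_is_acknowledged(3))  # True
print(write_is_acknowledged(1))  # False
```

With these values, a write is rejected rather than silently accepted when only one replica is in sync, which is the trade-off behind the "zero data loss" claim.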
Kafka Advantages
 Kafka can easily handle hundreds of thousands of messages per second
 The cluster can be expanded with no downtime, making Kafka highly scalable
 Messages are replicated, which provides reliability and durability
 Fault tolerant
 Scalable
What is Kafka?
 A distributed publish-subscribe messaging system
 Developed at LinkedIn Corporation
 Provides a solution for handling all activity stream data
 Fully supported on the Hadoop platform
 Partitions real-time consumption across a cluster of machines
 Provides a mechanism for parallel load into Hadoop
Apache Kafka – Overview
[Diagram: Kafka at the center, connected to an external tracking proxy, three frontends, background services (producers), background services (consumers), Hadoop and the DWH]
Kafka Architecture
[Diagram: producers (front end, services, proxies, adapters, other) publish to a cluster of Kafka brokers coordinated by Zookeeper; consumers (real time, NoSQL, Hadoop, warehouses, other) read from the brokers]
Kafka Core Components
 The table below lists the core concepts of Kafka
Component | Description
Topic | A category or feed to which messages are published
Producer | Publishes messages to a Kafka topic
Consumer | Subscribes to and consumes messages from a Kafka topic
Broker | Handles hundreds of megabytes of reads and writes
Kafka Topic
 A user-defined category to which messages are published
 For each topic, a partitioned log is maintained
 Each partition contains an ordered, immutable sequence of messages, where each message is assigned a sequential ID number called the offset
 Writes to a partition are sequential appends, which reduces the number of hard disk seeks
 Consumers can read from any offset in a partition, so reads may be random
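The partition and offset ideas above can be sketched with a small in-memory model. The `PartitionLog` class and its method names are hypothetical and exist only for this illustration; a real partition is a set of files on a broker's disk, not a Python list.

```python
class PartitionLog:
    """Toy in-memory model of one partition: an append-only list of messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message at the end and return the offset assigned to it."""
        offset = len(self._messages)
        self._messages.append(message)
        return offset

    def read(self, offset, max_messages=1):
        """Return up to max_messages starting at any chosen offset."""
        return self._messages[offset:offset + max_messages]


log = PartitionLog()
offsets = [log.append(m) for m in ["login", "click", "logout"]]
print(offsets)         # [0, 1, 2]
print(log.read(1, 2))  # ['click', 'logout']
```

Writes only ever go to the end of the list, while a reader may start from whichever offset it chooses.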
Kafka Producers
 Applications that publish messages to a topic in the Kafka cluster
 Can be of any kind, such as front ends, streaming applications, etc.
 While writing messages, it is also possible to attach a key to the message
 Messages with the same key arrive in the same partition
 Can publish without waiting for an acknowledgement from the Kafka cluster, depending on configuration
 Publishes messages as fast as the brokers in the cluster can handle
[Diagram: three producers publishing to the Kafka cluster]
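One way to see why messages with the same key land in the same partition is to hash the key and take the result modulo the number of partitions. The sketch below uses `zlib.crc32` purely as a stand-in hash, and `choose_partition` is a hypothetical helper; Kafka's own partitioner uses a different hash function.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; the same key always maps to the same one."""
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

keyed_messages = [(b"user-1", "login"), (b"user-2", "login"), (b"user-1", "click")]
for key, value in keyed_messages:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))

# Both "user-1" messages sit in one partition, in the order they were sent.
user1_partition = choose_partition(b"user-1", NUM_PARTITIONS)
print([v for k, v in partitions[user1_partition] if k == b"user-1"])
```

Because the mapping depends only on the key and the partition count, per-key ordering is preserved as long as the number of partitions does not change.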
Kafka Consumers
 Applications that subscribe to and consume messages from the brokers in the Kafka cluster
 Can be of any kind, such as real-time consumers, NoSQL consumers, etc.
 During consumption of messages from a topic, a consumer group can be configured with multiple consumers
 Each consumer in a consumer group reads messages from a unique subset of partitions in each topic it subscribes to
 Messages with the same key arrive at the same consumer
 Supports both queuing and publish-subscribe
 Consumers have to keep track of how many messages they have consumed (their offset)
[Diagram: the Kafka cluster delivering messages to three consumers]
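The consumer-group behaviour above can be sketched as a simple round-robin assignment in which every partition has exactly one owner within the group. `assign_partitions` and the consumer names are hypothetical; Kafka's actual assignment strategies are configurable and more involved.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin.

    Each partition gets exactly one owner, so consumers in the same
    group never read the same partition.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


assignment = assign_partitions([0, 1, 2, 3], ["c1", "c2"])
print(assignment)  # {'c1': [0, 2], 'c2': [1, 3]}

# Each consumer tracks how far it has read in every partition it owns.
positions = {partition: 0 for partition in assignment["c1"]}
positions[0] += 3  # c1 has consumed three messages from partition 0
print(positions)   # {0: 3, 2: 0}
```

Running a second group over the same partitions would give that group its own full assignment, which is how queuing (within a group) and publish-subscribe (across groups) coexist.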
Kafka Brokers
 Each server in the cluster is called a broker
 Handles hundreds of MBs of writes from producers and reads from consumers
 Retains all published messages, whether or not they have been consumed
 Retention is configured for ‘n’ days
 Published messages are available for consumption for the configured ‘n’ days and are discarded thereafter
 Works like a queue if consumer instances belong to the same consumer group; otherwise works like publish-subscribe
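Time-based retention can be sketched as a filter on message age that ignores whether a message was ever consumed. `apply_retention` and the `(timestamp, message)` tuple layout are assumptions for this example; a real broker discards whole log segments rather than individual messages.

```python
DAY = 24 * 60 * 60  # seconds in one day


def apply_retention(messages, now, retention_seconds):
    """Keep messages still inside the retention window, consumed or not."""
    return [(ts, msg) for ts, msg in messages if now - ts <= retention_seconds]


# (timestamp in seconds, message) pairs
messages = [(0, "old"), (5 * DAY, "recent")]

kept = apply_retention(messages, now=8 * DAY, retention_seconds=7 * DAY)
print(kept)  # [(432000, 'recent')]
```

With a 7-day window evaluated on day 8, the message written on day 0 is discarded and the one written on day 5 remains available to any consumer.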
Kafka Producer-Broker-Consumer
How Kafka can be used with Hadoop
Kafka with Hadoop using Camus
 Camus is LinkedIn's Kafka -> HDFS pipeline
 It is a MapReduce job
 Distributes data loads out of Kafka
 At LinkedIn, it processes tens of billions of messages per day
 All work is done in a single Hadoop job
Courtesy: Confluent
How Kafka can be used with Spark
Kafka With Spark Streaming
 In Kafka, messages are generally stored in multiple partitions
 If messages are stored in ‘n’ partitions, reading them in parallel is faster
 Spark Streaming can perform these parallel reads effectively
 Read parallelism is achieved by integrating Spark's KafkaInputDStream with the Kafka high-level consumer API
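The benefit of reading ‘n’ partitions in parallel can be illustrated without Spark at all. The sketch below reads each partition of an in-memory stand-in on its own thread; the `partitions` dictionary and `read_partition` function are hypothetical, and this is not the Spark Streaming or Kafka consumer API.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for a topic with three partitions.
partitions = {0: ["a", "b"], 1: ["c"], 2: ["d", "e", "f"]}


def read_partition(partition_id):
    """Read one partition independently of the others."""
    return partition_id, list(partitions[partition_id])


# One worker per partition, so all partitions are read concurrently.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    results = dict(pool.map(read_partition, partitions))

print(results)  # {0: ['a', 'b'], 1: ['c'], 2: ['d', 'e', 'f']}
```

Order is preserved within each partition, while the partitions themselves are processed side by side.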
[Diagram: events flow from apps through Kafka to the streaming engine]
How Kafka can be used with Storm
Kafka With Storm
Companies Using Kafka
Get Certified in Apache Kafka from Edureka
Edureka's Real-Time Analytics with Apache Kafka course:
• Carefully designed to provide the knowledge and skills needed to become a successful Kafka Big Data Developer
• Helps you master the concepts of Kafka Cluster, Producers and Consumers, Kafka API, and Kafka Integration with Hadoop, Storm and Spark
• Covers fundamental concepts such as the Kafka cluster and Kafka API, as well as advanced topics such as Kafka integration with Hadoop, Storm, Spark, Maven, etc.
• Online Live Courses: 15 hours
• Assignments: 25 hours
• Project: 20 hours
• Lifetime Access + 24 X 7 Support
Go to www.edureka.co/apache-kafka
Batch starts from 10th October (Weekend Batch)
Thank You
Questions/Queries/Feedback/Survey
Recording and presentation will be made available to you within 24 hours

91mobiles
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 

Recently uploaded (20)

20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 

How Apache Kafka is transforming Hadoop, Spark and Storm

  • 2. What will you learn today?
      - Million Dollar Question! Why do we need Kafka?
      - What is Kafka?
      - Kafka Architecture
      - Kafka with Hadoop
      - Kafka with Spark
      - Kafka with Storm
      - Companies using Kafka
      - Demo on Kafka Messaging Service
  • 4. Why Kafka Cluster? Why Kafka is preferred over more traditional brokers such as JMS and AMQP
  • 5. Kafka Producer Performance with Other Systems
  • 6. Kafka Consumer Performance with Other Systems
  • 7. Salient Features of Kafka (Feature: Description)
      - High Throughput: Supports millions of messages on modest hardware
      - Scalability: Highly scalable distributed system with no downtime
      - Replication: Messages can be replicated across the cluster, which supports multiple subscribers and rebalances the consumers in case of failure
      - Durability: Persists messages to disk, where they can also be used for batch consumption
      - Stream Processing: Can be used with real-time streaming applications such as Spark and Storm
      - Data Loss: With the proper configuration, Kafka can ensure zero data loss
  • 8. Kafka Advantages
      - Kafka can easily handle hundreds of thousands of messages per second
      - The cluster can be expanded with no downtime, making Kafka highly scalable
      - Messages are replicated, which provides reliability and durability
      - Fault tolerant
      - Scalable
  • 10. What is Kafka?
      - A distributed publish-subscribe messaging system
      - Developed at LinkedIn Corporation
      - Provides a solution for handling all activity stream data
      - Fully supported on the Hadoop platform
      - Partitions real-time consumption across a cluster of machines
      - Provides a mechanism for parallel load into Hadoop
  • 11. Apache Kafka – Overview [Diagram labels: Kafka, External Tracking Proxy, Frontend (x3), Background Service (Producer) (x2), Background Service (Consumer) (x2), Hadoop, DWH]
  • 13. Kafka Architecture [Diagram labels: Producer (Front End), Producer (Services), Producer (Proxies), Producer (Adapters), Other Producer; Kafka Broker (x4); Zookeeper; Consumers (Real Time), Consumers (NoSQL), Consumers (Hadoop), Consumers (Warehouses), Other Producer]
  • 14. Kafka Core Components. The table below lists the core concepts of Kafka (Feature: Description)
      - Topic: A category or feed to which messages are published
      - Producer: Publishes messages to a Kafka topic
      - Consumer: Subscribes to and consumes messages from a Kafka topic
      - Broker: Handles hundreds of megabytes of reads and writes
  • 15. Kafka Topic
      - A user-defined category to which messages are published
      - A partitioned log is maintained for each topic
      - Each partition contains an ordered, immutable sequence of messages, and each message is assigned a sequential ID number called an offset
      - Writes to a partition are generally sequential, which reduces the number of hard disk seeks
      - Reads from a partition can be random
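The partition and offset idea on this slide can be sketched in a few lines of Python. This is a toy in-memory model for illustration only, not Kafka's storage engine or client API; the class name `PartitionLog` and its methods are hypothetical.

```python
class PartitionLog:
    """Toy append-only log for one partition (illustrative, not Kafka code)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message at the end and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=1):
        """Return up to max_messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


log = PartitionLog()
# Writes are sequential: each message gets the next offset.
offsets = [log.append(m) for m in ("a", "b", "c")]
# Reads can start at any offset.
middle = log.read(1, max_messages=2)
```

Because existing entries are never modified, a reader only needs to remember an offset to resume where it left off.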
  • 16. Kafka Producers
      - Applications publish messages to a topic in the Kafka cluster
      - Producers can be of any kind, such as front end or streaming applications
      - A key can be attached to a message when it is written; messages with the same key arrive in the same partition
      - Can publish without waiting for an acknowledgement from the Kafka cluster
      - Publishes messages as fast as the brokers in the cluster can handle them
      - [Diagram labels: Producer (x3), Kafka Clusters]
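A minimal sketch of why messages with the same key land in the same partition: the partition is derived from a hash of the key. The function `choose_partition` is hypothetical, and CRC32 is used here only because it is deterministic and in the standard library; Kafka's Java client uses a different hash (murmur2), so the partition numbers below will not match a real cluster.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index with a deterministic hash (illustrative)."""
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition.
first = choose_partition(b"user-42", 4)
second = choose_partition(b"user-42", 4)
```

Note that the mapping depends on the partition count, so changing the number of partitions can move a key to a different partition.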
  • 17. Kafka Consumers
      - Applications subscribe to and consume messages from the brokers in the Kafka cluster
      - Consumers can be of any kind, such as real-time consumers or NoSQL consumers
      - A consumer group with multiple consumers can be configured to consume messages from a topic
      - Each consumer in a consumer group reads messages from a unique subset of the partitions in each topic it subscribes to
      - Messages with the same key arrive at the same consumer
      - Supports both queuing and publish-subscribe
      - Consumers have to keep track of how many messages they have consumed
      - [Diagram labels: Kafka Clusters, Consumer (x3)]
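The "unique subset of partitions" rule can be illustrated with a simple round-robin assignment. This is a simplified sketch, assuming a fixed list of consumers; the function `assign_partitions` is hypothetical, and real Kafka clients negotiate assignments through the group protocol with configurable strategies.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Five partitions shared by a group of two consumers.
assignment = assign_partitions([0, 1, 2, 3, 4], ["c1", "c2"])
```

Since every partition has exactly one owner within the group, and a key always maps to one partition, all messages with that key reach the same consumer.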
  • 18. Kafka Brokers
      - Each server in the cluster is called a broker
      - Handles hundreds of MBs of writes from producers and reads from consumers
      - Retains all published messages, whether or not they have been consumed
      - Retention is configured for n days
      - Published messages are available for consumption for the configured n days and are discarded after that
      - Works like a queue if the consumer instances belong to the same consumer group; otherwise it works like publish-subscribe
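Time-based retention, as described on this slide, can be sketched as a filter on message age. This is a per-message simplification for illustration; the function `apply_retention` is hypothetical, and a real broker discards data in larger log segments rather than one message at a time.

```python
def apply_retention(messages, now, retention_seconds):
    """Keep only (timestamp, payload) pairs that are within the retention window."""
    cutoff = now - retention_seconds
    return [(ts, payload) for ts, payload in messages if ts >= cutoff]


messages = [(0, "old"), (60, "recent"), (90, "new")]
# With a 50-second window at time 100, only messages from time 50 onward survive.
kept = apply_retention(messages, now=100, retention_seconds=50)
```

Whether a message has been consumed plays no part in the decision, which matches the slide: retention depends only on age.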
  • 19. Kafka Producer-Broker-Consumer
  • 20. How Kafka can be used with Hadoop
  • 21. Kafka with Hadoop using Camus
      - Camus is LinkedIn's Kafka -> HDFS pipeline
      - It is a MapReduce job
      - Distributes data loads out of Kafka
      - At LinkedIn, it processes tens of billions of messages/day
      - All work is done with a single Hadoop job
      - Courtesy: Confluent
  • 22. How Kafka can be used with Spark
  • 23. Kafka With Spark Streaming
      - In Kafka, messages are generally stored in multiple partitions
      - If messages are stored in n partitions, reading them in parallel makes consumption faster
      - Spark Streaming can achieve parallel reads effectively
      - Read parallelism is achieved by integrating Spark's KafkaInputDStream with the Kafka High Level Consumer API
  • 24. Kafka With Spark Streaming [Diagram labels: Apps, Events, Kafka, Streaming Engine]. In Kafka, messages are generally stored in multiple partitions
  • 25. How Kafka can be used with Storm
  • 26. Kafka With Spark Streaming
  • 27. Companies Using Kafka
  • 28. Get Certified in Apache Kafka from Edureka. Edureka's Real-Time Analytics with Apache Kafka course:
      - Carefully designed to provide the knowledge and skills to become a successful Kafka Big Data Developer
      - Helps you master the concepts of Kafka Cluster, Producers and Consumers, Kafka API, and Kafka integration with Hadoop, Storm and Spark
      - Covers everything from fundamental concepts such as Kafka cluster and Kafka API to advanced topics such as Kafka integration with Hadoop, Storm, Spark, Maven, etc.
      - Online Live Courses: 15 hours
      - Assignments: 25 hours
      - Project: 20 hours
      - Lifetime Access + 24 x 7 Support
      - Go to www.edureka.co/apache-kafka
      - Batch starts from 10th October (Weekend Batch)
  • 29. Thank You. Questions/Queries/Feedback/Survey. The recording and presentation will be made available to you within 24 hours