1. Apache Kafka
Apache Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe
messaging system. It is often used in place of traditional message brokers, such as
those based on JMS and AMQP, because of its higher throughput, reliability, and
replication. Kafka works in combination with Apache HBase and Apache Storm for
real-time analysis and rendering of streaming data. For example, it can carry
geospatial data from a fleet of long-haul trucks or sensor data from heating and
cooling equipment in office buildings. Whatever the scenario, Kafka brokers massive
message streams for low-latency analysis in Enterprise Apache Hadoop.
What Kafka Does
Apache Kafka supports a wide range of use cases as a general-purpose messaging
system where high throughput, reliable delivery, and horizontal scalability are
important. Apache HBase and Apache Storm both work well in combination with Kafka.
Common use cases include:
• Website Activity Tracking
• Stream Processing
• Log Aggregation
• Metrics Collection and Monitoring
Some of the characteristics that make Kafka an attractive option for these use
cases are the following:
1) Scalability: The distributed system scales easily with no downtime.
2) Durability: Persists messages on disk and provides intra-cluster replication.
3) Reliability: Replicates data, supports multiple subscribers, and automatically
balances consumers in case of failure.
4) Performance: Delivers high throughput for both publishing and subscribing, with
disk structures that offer constant performance even with many terabytes of
stored messages.
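
The automatic consumer balancing mentioned under Reliability can be pictured with a
toy model. The sketch below is not Kafka's actual group protocol. It assumes a plain
round-robin assignment, and the function name `assign_partitions` and the consumer
names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the live consumers in round-robin order."""
    if not consumers:
        raise ValueError("at least one live consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3, 4, 5]

# Three healthy consumers share the six partitions.
before = assign_partitions(partitions, ["c1", "c2", "c3"])

# Simulate "c2" failing: the remaining consumers take over its partitions.
after = assign_partitions(partitions, ["c1", "c3"])

print(before)
print(after)
```

In this model every partition still has exactly one owner after the failure, so no
messages are left unread.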
2. How Kafka Works
Kafka's design can be thought of as a distributed commit log, where incoming data is
written sequentially to disk. There are four main components involved in moving data
in and out of Kafka:
• Producers
• Topics
• Consumers
• Brokers
In Kafka, a Topic is a user-defined category to which messages are published. Kafka
Producers publish messages to one or more topics, and Consumers subscribe to topics
and process the published messages. A Kafka cluster consists of one or more servers,
called Brokers, that manage the persistence and replication of message data.
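
The relationship between producers, topics, consumers, and brokers can be sketched
with a small in-memory model. This is only an illustration and does not use the Kafka
client API. The class `ToyBroker`, its methods, and the topic names are assumptions
made up for this example.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: stores messages per topic."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        # Producers call this to append a message to a topic.
        self.topics[topic].append(message)

    def read(self, topic, offset):
        # Return every message in the topic at or after `offset`.
        return self.topics[topic][offset:]


broker = ToyBroker()

# A producer publishes to two different topics.
broker.publish("page-views", "user=1 path=/home")
broker.publish("page-views", "user=2 path=/cart")
broker.publish("metrics", "cpu=0.42")

# Two consumers subscribed to the same topic each see all of its messages.
analytics_view = broker.read("page-views", 0)
audit_view = broker.read("page-views", 0)
```

Reading does not remove anything from the topic, which is why several subscribers can
process the same messages independently.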
One key to Kafka’s high performance is the simplicity of the brokers’
responsibilities. Kafka topics consist of one or more partitions, each of which is an
ordered, immutable sequence of messages. Because writes to a partition are
sequential, this design greatly reduces the number of hard disk seeks.
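
A partition as an ordered, append-only sequence can be modelled in a few lines. The
sketch below keeps everything in memory and is not how Kafka stores data on disk. The
classes `Partition` and `Topic` are hypothetical, and the CRC32-based choice of
partition is an assumption for illustration, not Kafka's own partitioner.

```python
import zlib


class Partition:
    """Append-only, ordered sequence of messages addressed by offset."""

    def __init__(self):
        self._log = []

    def append(self, message):
        # New messages always go to the end; existing entries are never changed.
        self._log.append(message)
        return len(self._log) - 1  # offset of the message just written

    def read(self, offset):
        return self._log[offset]


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def send(self, key, message):
        # Hypothetical key-to-partition mapping: the same key always maps
        # to the same partition, so its messages stay in order.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return index, self.partitions[index].append(message)


topic = Topic(num_partitions=3)
first = topic.send("truck-17", "lat=40.1 lon=-88.2")
second = topic.send("truck-17", "lat=40.2 lon=-88.3")
```

Both calls return the same partition index, with offsets 0 and 1, which reflects the
ordering guarantee within a single partition.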
Another factor contributing to Kafka’s performance and scalability is that Kafka
brokers do not keep track of which messages have been consumed; that responsibility
falls on the consumer. In traditional messaging systems such as JMS, the broker bore
this responsibility, which severely limited the system's ability to scale as the
number of consumers increased.
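
The idea of consumer-side position tracking can be shown with a toy model. This is a
sketch under simplifying assumptions, not the Kafka consumer API. The classes `Log`
and `Consumer` and the method names are hypothetical.

```python
class Log:
    """Broker-side log: stores messages but keeps no per-consumer state."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)

    def fetch(self, offset, max_messages):
        return self.messages[offset:offset + max_messages]


class Consumer:
    """Each consumer remembers its own position in the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_messages=2):
        batch = self.log.fetch(self.offset, max_messages)
        # The consumer, not the broker, advances the read position.
        self.offset += len(batch)
        return batch


log = Log()
for i in range(5):
    log.append(f"event-{i}")

fast = Consumer(log)
slow = Consumer(log)

fast.poll()
fast.poll()
slow_batch = slow.poll(max_messages=1)
```

Adding another consumer here costs the log nothing extra, since the only new state is
one integer held by that consumer.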