Offices in Madrid: C/ Francisco Silvela, 54 Duplicado 1ºD 28028
Tel: 91 080 82 44
Offices in Barcelona: C/ Madrazo 27-29 4ª 08006
Tel: 933 68 52 46
KAFKA INFRASTRUCTURE: MONITORING
$intro --help
1. Important metrics
2. Open source Kafka tools
3. The Landoop Stack
4. The Confluent Stack
KAFKA MONITORING: Brokers
Kafka metrics:
- UnderReplicatedPartitions: In a healthy cluster, the number of in-sync replicas
(ISRs) should be exactly equal to the total number of replicas. If partition replicas
fall too far behind their leaders, the follower partition is removed from the ISR
pool, and you should see a corresponding increase in IsrShrinksPerSec.
- IsrShrinksPerSec/IsrExpandsPerSec: The number of in-sync replicas (ISRs) for a
particular partition should remain fairly static; the only exceptions are when you
are expanding your broker cluster or removing partitions. An increase in
IsrShrinksPerSec without a corresponding increase in IsrExpandsPerSec shortly
thereafter is cause for concern and requires user intervention.
- ActiveControllerCount: The first node to boot in a Kafka cluster automatically
becomes the controller, and there can be only one. The controller in a Kafka
cluster is responsible for maintaining the list of partition leaders and coordinating
leadership transitions. Summed across all brokers, this metric should therefore
always equal 1.
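As a minimal sketch of how the three metrics above might be turned into checks, the Python below evaluates one polling round. The BrokerSample type, its field names and the isr_findings helper are hypothetical; in a real setup the values would be read from each broker's JMX metrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BrokerSample:
    """One polling sample per broker (hypothetical structure)."""
    broker_id: int
    under_replicated_partitions: int
    isr_shrinks_per_sec: float
    isr_expands_per_sec: float
    active_controller_count: int


def isr_findings(samples: List[BrokerSample]) -> List[str]:
    """Return human-readable findings for one polling round."""
    findings = []
    for s in samples:
        if s.under_replicated_partitions > 0:
            findings.append(
                f"broker {s.broker_id}: "
                f"{s.under_replicated_partitions} under-replicated partitions"
            )
        # Shrinks with no matching expands suggest replicas are not catching up.
        if s.isr_shrinks_per_sec > 0 and s.isr_expands_per_sec == 0:
            findings.append(f"broker {s.broker_id}: ISR shrinking without expanding")
    # Exactly one broker in the cluster should report itself as controller.
    controllers = sum(s.active_controller_count for s in samples)
    if controllers != 1:
        findings.append(f"cluster: {controllers} active controllers, expected 1")
    return findings
```

An empty result means none of the three conditions fired. Comparing shrinks and expands within a single sample is a simplification; a production check would look at a time window.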
KAFKA MONITORING: Brokers
- OfflinePartitionsCount (controller only): This metric reports the number of
partitions without an active leader. Because all read and write operations are only
performed on partition leaders, a non-zero value for this metric should be alerted
on to prevent service interruptions.
- LeaderElectionRateAndTimeMs: Reports the rate of leader elections (per second)
and the total time the cluster went without a leader (in milliseconds).
- UncleanLeaderElectionsPerSec: An unclean leader election is a special case in
which no available replicas are in sync. Because each partition must have a leader, an
election is held among the out-of-sync replicas and a leader is chosen, meaning
any messages that were not synced prior to the loss of the former leader are lost
forever.
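The alerting advice above can be sketched as a small severity mapping. This is an illustration only: the dictionary keys reuse the metric names from the slide, while the leadership_alerts function and the severity labels are assumptions, not part of Kafka.

```python
from typing import Dict, List, Tuple


def leadership_alerts(metrics: Dict[str, float]) -> List[Tuple[str, str]]:
    """Map controller-level metrics to (severity, message) pairs."""
    alerts = []
    offline = metrics.get("OfflinePartitionsCount", 0)
    if offline > 0:
        # No leader means no reads or writes for those partitions.
        alerts.append(("critical", f"{int(offline)} partitions have no active leader"))
    unclean = metrics.get("UncleanLeaderElectionsPerSec", 0.0)
    if unclean > 0:
        alerts.append(("critical", "unclean leader election observed, data loss is possible"))
    elections = metrics.get("LeaderElectionRateAndTimeMs", 0.0)
    if elections > 0 and not alerts:
        # Elections alone are worth a look but are not necessarily an outage.
        alerts.append(("warning", "leader elections in progress"))
    return alerts
```

Treating a plain leader election as a warning rather than a critical alert is a design choice for this sketch; elections also happen during routine broker restarts.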
KAFKA MONITORING: Brokers
- TotalTimeMs: The TotalTimeMs metric family measures the total time taken to
service a request (be it a produce, fetch-consumer, or fetch-follower request).
- BytesInPerSec/BytesOutPerSec: Tracking network throughput on your brokers
gives you more information as to where potential bottlenecks may lie, and can
inform decisions like whether or not you should enable end-to-end compression
of your messages.
- Disk usage: Kafka will fail should its disk become full, so keeping track of disk
growth over time is recommended.
- Network bytes sent/received: If you are monitoring Kafka’s bytes in/out metric,
you are getting Kafka’s side of the story. To get a full picture of network usage on
your host, you would need to monitor host-level network throughput.
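To illustrate tracking disk growth over time, here is a rough sketch that extrapolates when a volume would fill up. The hours_until_full function and the sample format are hypothetical, and the linear extrapolation between the first and last sample is a deliberate simplification.

```python
from typing import List, Optional, Tuple


def hours_until_full(
    samples: List[Tuple[float, float]], capacity_bytes: float
) -> Optional[float]:
    """Estimate hours until the disk is full from (hour, used_bytes) samples.

    Uses the average growth rate between the first and last sample.
    Returns None when there is too little data or usage is not growing.
    """
    if len(samples) < 2:
        return None
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    rate = (used1 - used0) / (t1 - t0)  # bytes per hour
    if rate <= 0:
        return None
    return (capacity_bytes - used1) / rate
```

The used-bytes figures could be collected with the standard library call shutil.disk_usage(path).used on the Kafka log directory; retention settings will flatten the curve once old segments start being deleted, so the estimate is most useful for young topics.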
KAFKA MONITORING: Producers
- Response rate: For producers, the response rate represents the rate of responses
received from brokers. Brokers respond to producers when the data has been
received.
- Request rate: The request rate is the rate at which producers send data to
brokers. Keeping an eye on peaks and drops is essential to ensure continuous
service availability.
- Request latency average: The average request latency is a measure of the
amount of time between the call to KafkaProducer.send() and the moment the
producer receives a response from the broker.
- Outgoing byte rate: As with Kafka brokers, you will want to monitor your
producer network throughput. Observing traffic volume over time is essential to
determine if changes to your network infrastructure are needed.
KAFKA MONITORING: Consumers
- ConsumerLag: ConsumerLag is the calculated difference between a consumer’s
current log offset and a producer’s current log offset.
- MaxLag: Goes hand-in-hand with ConsumerLag, and is the maximum observed
value of ConsumerLag.
- BytesPerSec: As with producers and brokers, you will want to monitor your
consumer network throughput.
- MessagesPerSec: The rate of messages consumed per second may not strongly
correlate with the rate of bytes consumed because messages can be of variable
size.
- MinFetchRate: The fetch rate of a consumer can be a good indicator of overall
consumer health.
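The lag definition above amounts to a subtraction per partition. The sketch below assumes the two offset maps have already been fetched; the function names and the dictionary layout are hypothetical, and max_lag here takes the maximum across partitions in a single snapshot rather than over time.

```python
from typing import Dict


def consumer_lag(
    log_end_offsets: Dict[int, int], committed_offsets: Dict[int, int]
) -> Dict[int, int]:
    """Per-partition lag: newest offset written minus the consumer's offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with nothing committed yet counts from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


def max_lag(lag: Dict[int, int]) -> int:
    """Largest per-partition lag in this snapshot, 0 if there are no partitions."""
    return max(lag.values(), default=0)
```

For example, with log end offsets {0: 100, 1: 50} and committed offsets {0: 90, 1: 50}, partition 0 lags by 10 messages and partition 1 is fully caught up.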
KAFKA MONITORING: Zookeeper
- zk_outstanding_requests: Clients can end up submitting requests faster than
ZooKeeper can process them. If you have a large number of clients, it’s almost a
given that this will happen occasionally. To prevent using up all available memory
due to queued requests, ZooKeeper will throttle clients if its queue limit is
reached.
- zk_avg_latency: The average request latency is the average time it takes (in
milliseconds) for ZooKeeper to respond to a request. ZooKeeper will not respond
to a request until it has written the transaction to its transaction log.
- zk_num_alive_connections: ZooKeeper reports the number of clients connected
to it via the zk_num_alive_connections metric. This represents all connections,
including connections to non-ZooKeeper nodes.
- zk_followers (leader only): The number of followers should equal the total size of
your ZooKeeper ensemble - 1 (the leader is not included in the follower count).
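A sketch of checking these values, assuming they arrive as the tab-separated key/value text that ZooKeeper's mntr four-letter command prints. The sample text, the helper names and the max_outstanding threshold of 10 are illustrative assumptions, not recommended values.

```python
from typing import Dict, List

# Illustrative sample only, not captured from a real server.
SAMPLE_MNTR = (
    "zk_avg_latency\t2\n"
    "zk_outstanding_requests\t0\n"
    "zk_num_alive_connections\t14\n"
    "zk_server_state\tleader\n"
    "zk_followers\t2\n"
)


def parse_mntr(text: str) -> Dict[str, str]:
    """Parse 'key<TAB>value' lines into a dict, skipping malformed lines."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key] = value.strip()
    return stats


def zk_findings(
    stats: Dict[str, str], ensemble_size: int, max_outstanding: int = 10
) -> List[str]:
    """Flag a backed-up request queue and a leader that is missing followers."""
    findings = []
    if int(stats.get("zk_outstanding_requests", 0)) > max_outstanding:
        findings.append("request queue is backing up")
    # zk_followers is only reported by the leader.
    if stats.get("zk_server_state") == "leader":
        followers = int(stats.get("zk_followers", 0))
        if followers != ensemble_size - 1:
            findings.append(f"{followers} followers, expected {ensemble_size - 1}")
    return findings
```

With the sample above and a three-node ensemble, zk_findings returns an empty list; with a five-node ensemble it reports the missing followers.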
KAFKA MONITORING: Zookeeper
- zk_pending_syncs (leader only): The transaction log is the most performance-critical
part of ZooKeeper. ZooKeeper must sync transactions to disk before
returning a response, thus a large number of pending syncs will result in latency
increases across the board.
- Bytes sent/received (v0.8.x only): Brokers and consumers communicate with
ZooKeeper. In large-scale deployments with many consumers and partitions, this
constant communication means ZooKeeper could become a bottleneck.
- Usable memory: ZooKeeper should reside entirely in RAM and will suffer
considerably if it must page to disk. Therefore, keeping track of the amount of
usable memory is necessary to ensure ZooKeeper performs optimally.
- Disk latency: Although ZooKeeper should reside in RAM, it still makes use of the
filesystem for both periodically snapshotting its current state and for maintaining
logs of all transactions.
MONITORING TOOLS: Open source
Yahoo Kafka Manager (https://github.com/yahoo/kafka-manager):
- Manage multiple clusters
- Easy inspection of cluster state (topics, consumers, offsets, brokers, replica
distribution, partition distribution)
- Run preferred replica election
- Generate partition assignments with option to select brokers to use
- Run reassignment of partition (based on generated assignments)
- Create a topic with optional topic configs
- Delete topic
- Topic list now indicates topics marked for deletion (only supported on 0.8.2+)
- Batch generate partition assignments for multiple topics with option to select
brokers to use
- Batch run reassignment of partition for multiple topics
- Add partitions to existing topic
- Update config for existing topic
MONITORING TOOLS: Open source
LinkedIn Burrow (https://github.com/linkedin/Burrow):
Burrow is a monitoring companion for Apache Kafka that provides consumer lag checking as a service without
the need for specifying thresholds. It monitors committed offsets for all consumers and calculates the status of
those consumers on demand. An HTTP endpoint is provided to request status on demand, as well as provide
other Kafka cluster information. There are also configurable notifiers that can send status out via email or HTTP
calls to another service.
- Multiple Kafka Cluster support
- Automatically monitors all consumers using Kafka-committed offsets
- Configurable support for Zookeeper-committed offsets
- Configurable support for Storm-committed offsets
- HTTP endpoint for consumer group status, as well as broker and consumer
information
- Configurable emailer for sending alerts for specific groups
- Configurable HTTP client for sending alerts to another system for all groups
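A sketch of how a client might use the HTTP endpoint for consumer group status. The URL path and the JSON shape reflect my understanding of Burrow's v3 HTTP API and should be verified against the deployed version; the host name, cluster name and group name are placeholders.

```python
import json
from urllib.parse import quote


def burrow_status_url(base_url: str, cluster: str, group: str) -> str:
    """Build the consumer group status URL (path assumed from Burrow's v3 API)."""
    return (
        f"{base_url.rstrip('/')}/v3/kafka/{quote(cluster, safe='')}"
        f"/consumer/{quote(group, safe='')}/status"
    )


def needs_attention(response_body: str) -> bool:
    """True when a Burrow-style JSON reply reports an error or a non-OK group."""
    payload = json.loads(response_body)
    if payload.get("error"):
        return True
    return payload.get("status", {}).get("status") != "OK"


# Hand-written example reply, not real Burrow output.
EXAMPLE_REPLY = json.dumps(
    {"error": False, "status": {"cluster": "local", "group": "billing", "status": "WARN"}}
)
```

The body could be fetched with urllib.request.urlopen(burrow_status_url(...)) and passed to needs_attention; because Burrow evaluates the status itself, the client does not need its own lag thresholds.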
MONITORING TOOLS: Open source
KafDrop (https://github.com/HomeAdvisor/Kafdrop):
Kafdrop is a UI for monitoring Apache Kafka clusters. The tool displays information such as brokers, topics,
partitions, and even lets you view messages. It is a lightweight application that runs on Spring Boot and
requires very little configuration.
MONITORING TOOLS: Open source
LinkedIn’s Kafka Monitor (https://github.com/linkedin/kafka-monitor):
Kafka Monitor is a framework to implement and execute long-running Kafka system tests in a real cluster. It
complements Kafka’s existing system tests by capturing potential bugs or regressions that are only likely to
occur after a prolonged period of time or with low probability. Moreover, it allows you to monitor a Kafka cluster
using end-to-end pipelines to obtain a number of derived vital stats such as end-to-end latency, service
availability and message loss rate. You can easily deploy Kafka Monitor to test and monitor your Kafka cluster
without requiring any change to your application.
Kafka Monitor can automatically create the monitor topic with the specified config and increase the partition count
of the monitor topic to ensure partition# >= broker#. It can also reassign partitions and trigger preferred leader
election to ensure that each broker acts as leader of at least one partition of the monitor topic. This allows
Kafka Monitor to detect performance issues on every broker without requiring users to manually manage the
partition assignment of the monitor topic.
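The two conditions described above (partition# >= broker#, and every broker leading at least one monitor-topic partition) can be expressed as simple checks. This is an illustration of the invariant, not Kafka Monitor's actual implementation; the function names and the leaders mapping are hypothetical.

```python
from typing import Dict, List, Set


def partitions_to_add(partition_count: int, broker_count: int) -> int:
    """How many partitions must be added so that partition# >= broker#."""
    return max(broker_count - partition_count, 0)


def brokers_without_leadership(
    leaders: Dict[int, int], broker_ids: List[int]
) -> Set[int]:
    """Brokers that lead no partition of the monitor topic.

    leaders maps partition number -> broker id of that partition's leader.
    """
    return set(broker_ids) - set(leaders.values())
```

A non-empty result from brokers_without_leadership marks brokers whose performance the monitor topic cannot observe until partitions are reassigned or a preferred leader election runs.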
MONITORING TOOLS: Open source
Grafana + Prometheus (https://github.com/grafana/grafana, https://github.com/prometheus/prometheus):
- Grafana: Open source, feature rich metrics dashboard and graph editor for
Graphite, Elasticsearch, OpenTSDB, Prometheus and InfluxDB.
- Prometheus: Prometheus, a Cloud Native Computing Foundation project, is a
systems and service monitoring system. It collects metrics from configured targets
at given intervals, evaluates rule expressions, displays the results, and can trigger
alerts if some condition is observed to be true.
Demo Kafka repository (with Slack integration):
https://github.com/lucrussell/slack-chatops
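Prometheus collects metrics by scraping targets that expose them in its plain-text exposition format. As a rough sketch of what such a target serves, the function below renders one gauge per broker. The metric name kafka_under_replicated_partitions, the broker label and the render_gauge helper are made up for illustration; real exporters choose their own names.

```python
from typing import Dict


def render_gauge(
    name: str, help_text: str, values: Dict[str, float], label: str = "broker"
) -> str:
    """Render one gauge, with one labelled sample per key in values."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for label_value, value in sorted(values.items()):
        # Doubled braces produce the literal { } around the label pair.
        lines.append(f'{name}{{{label}="{label_value}"}} {value}')
    return "\n".join(lines) + "\n"


EXAMPLE = render_gauge(
    "kafka_under_replicated_partitions",
    "Under-replicated partitions per broker",
    {"1": 0, "2": 3},
)
```

Serving this text over HTTP and listing the endpoint as a scrape target is enough for Prometheus to store the series, which Grafana can then graph and alert rules can evaluate.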
MONITORING TOOLS: Open source
ELK (https://github.com/elastic/elasticsearch, https://github.com/elastic/logstash, https://github.com/elastic/kibana):
- Elasticsearch: distributed RESTful search engine built for the cloud.
- Logstash: Logstash is part of the Elastic Stack along with Beats, Elasticsearch and
Kibana. Logstash is a server-side data processing pipeline that ingests data from a
multitude of sources simultaneously, transforms it, and then sends it to your
favorite "stash."
- Kibana: Window into the Elastic Stack. Specifically, it's a browser-based analytics
and search dashboard for Elasticsearch.
LANDOOP STACK
Landoop Lenses:
Lenses is an enterprise grade product that provides faster streaming application
deliveries and data flow management that natively integrates over Apache Kafka.
Lenses supports the core elements of Kafka with a rich user interface, endpoints and
vital enterprise capabilities that enable engineering and data teams to query real time
data, create and monitor Kafka topologies with rich integrations with other systems.
Fast Data-dev:
Running a demo development environment:
$ docker run --rm --net=host landoop/fast-data-dev
CONFLUENT STACK
Confluent Platform: Streaming platform that enables you to organize and manage
data from many different sources with one reliable, high performance system.
Bundle:
- Kafka Connectors
- Kafka Clients
- Schema Registry
- REST Proxy
Enterprise:
- Automatic Data Balancing
- Multi Datacenter Replication
- Confluent Control Center
- JMS Client
CONFLUENT STACK
Confluent Control Center:
Confluent Control Center is a GUI-based system for managing and monitoring Apache
Kafka. It allows you to easily manage Kafka Connect, to create, edit, and manage
connections to other systems. It also allows you to monitor data streams from
producer to consumer, assuring that every message is delivered, and measuring how
long it takes to deliver messages. Using Control Center, you can build a production
data pipeline based on Apache Kafka without writing a line of code. Control Center
also has the capability to define alerts on the latency and completeness statistics of
data streams, which can be delivered by email or queried from a centralized alerting
system.
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
 

Kafka infrastructure monitoring

  • 1. KAFKA INFRASTRUCTURE: MONITORING
    Madrid office: C/ Francisco Silvela, 54 Duplicado 1ºD 28028. Tel.: 91 080 82 44
    Barcelona office: C/ Madrazo 27-29 4ª 08006. Tel.: 933 68 52 46
  • 2. $intro --help
    1. Important metrics
    2. Open source Kafka tools
    3. The Landoop Stack
    4. The Confluent Stack
  • 3. KAFKA MONITORING: Brokers
    Kafka metrics:
    - UnderReplicatedPartitions: In a healthy cluster, the number of in-sync replicas (ISRs) should equal the total number of replicas. If a follower replica falls too far behind its leader, it is removed from the ISR pool, and you should see a corresponding increase in IsrShrinksPerSec.
    - IsrShrinksPerSec/IsrExpandsPerSec: The number of ISRs for a particular partition should remain fairly static; the only exceptions are when you are expanding your broker cluster or removing partitions. An increase in IsrShrinksPerSec without a corresponding increase in IsrExpandsPerSec shortly thereafter is cause for concern and requires user intervention.
    - ActiveControllerCount: The first node to boot in a Kafka cluster automatically becomes the controller, and there can be only one. The controller is responsible for maintaining the list of partition leaders and coordinating leadership transitions.
  • 4. KAFKA MONITORING: Brokers
    - OfflinePartitionsCount (controller only): This metric reports the number of partitions without an active leader. Because all read and write operations are performed only on partition leaders, alert on any non-zero value of this metric to prevent service interruptions.
    - LeaderElectionRateAndTimeMs: Reports the rate of leader elections (per second) and the total time the cluster went without a leader (in milliseconds).
    - UncleanLeaderElectionsPerSec: An unclean leader election is a special case in which no available replicas are in sync. Because each partition must have a leader, an election is held among the out-of-sync replicas and a leader is chosen, meaning any messages that were not synced prior to the loss of the former leader are lost forever.
  • 5. KAFKA MONITORING: Brokers
    - TotalTimeMs: The TotalTimeMs metric family measures the total time taken to service a request (be it a produce, fetch-consumer, or fetch-follower request).
    - BytesInPerSec/BytesOutPerSec: Tracking network throughput on your brokers gives you more information as to where potential bottlenecks may lie, and can inform decisions like whether or not you should enable end-to-end compression of your messages.
    - Disk usage: Kafka will fail should its disk become full, so keeping track of disk growth over time is recommended.
    - Network bytes sent/received: If you are monitoring Kafka’s bytes in/out metric, you are getting Kafka’s side of the story. To get a full picture of network usage on your host, you would need to monitor host-level network throughput.
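The broker metrics above translate naturally into a handful of alert conditions. The following is a minimal sketch, assuming the metric values have already been collected (for example over JMX) into a simple record per broker. The `BrokerSnapshot` class, its field names, and `check_cluster` are hypothetical names introduced for illustration; they are not part of Kafka.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BrokerSnapshot:
    """Hypothetical container for metric values read from one broker."""
    broker_id: int
    under_replicated_partitions: int
    offline_partitions_count: int
    active_controller_count: int
    unclean_leader_elections_per_sec: float


def check_cluster(snapshots: List[BrokerSnapshot]) -> List[str]:
    """Return human-readable alerts for the broker metrics described above."""
    alerts = []
    for s in snapshots:
        if s.under_replicated_partitions > 0:
            alerts.append(
                f"broker {s.broker_id}: under-replicated partitions = "
                f"{s.under_replicated_partitions}"
            )
        if s.offline_partitions_count > 0:
            alerts.append(
                f"broker {s.broker_id}: offline partitions = "
                f"{s.offline_partitions_count}"
            )
        if s.unclean_leader_elections_per_sec > 0:
            alerts.append(f"broker {s.broker_id}: unclean leader elections observed")
    # Exactly one broker in the cluster should report itself as controller.
    controllers = sum(s.active_controller_count for s in snapshots)
    if controllers != 1:
        alerts.append(f"cluster: expected 1 active controller, found {controllers}")
    return alerts


# Illustrative values only, not measurements from a real cluster.
healthy = [BrokerSnapshot(1, 0, 0, 1, 0.0), BrokerSnapshot(2, 0, 0, 0, 0.0)]
degraded = [BrokerSnapshot(1, 3, 0, 0, 0.0), BrokerSnapshot(2, 0, 0, 0, 0.0)]
print(check_cluster(healthy))
print(check_cluster(degraded))
```

Summing ActiveControllerCount across brokers, rather than checking each broker alone, reflects the "there can be only one" rule: a sum of 0 or 2 is a problem even though each individual value looks plausible.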
  • 6. KAFKA MONITORING: Producers
    - Response rate: For producers, the response rate represents the rate of responses received from brokers. Brokers respond to producers when the data has been received.
    - Request rate: The request rate is the rate at which producers send data to brokers. Keeping an eye on peaks and drops is essential to ensure continuous service availability.
    - Request latency average: The average request latency measures the time between the call to KafkaProducer.send() and the moment the producer receives a response from the broker.
    - Outgoing byte rate: As with Kafka brokers, you will want to monitor your producer network throughput. Observing traffic volume over time is essential to determine if changes to your network infrastructure are needed.
  • 7. KAFKA MONITORING: Consumers
    - ConsumerLag: ConsumerLag is the calculated difference between a consumer’s current log offset and a producer’s current log offset.
    - MaxLag: Goes hand-in-hand with ConsumerLag, and is the maximum observed value of ConsumerLag.
    - BytesPerSec: As with producers and brokers, you will want to monitor your consumer network throughput.
    - MessagesPerSec: The rate of messages consumed per second may not strongly correlate with the rate of bytes consumed because messages can be of variable size.
    - MinFetchRate: The fetch rate of a consumer can be a good indicator of overall consumer health.
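The ConsumerLag definition above is a per-partition subtraction. A minimal sketch, assuming the log-end offsets (the producer side) and the consumer's committed offsets are already available as dictionaries keyed by partition number. The function names are hypothetical, and treating a partition with no committed offset as offset 0 is an assumption made here for simplicity. MaxLag is taken across partitions in a single snapshot, which is also a simplification.

```python
from typing import Dict


def consumer_lag(
    log_end_offsets: Dict[int, int], committed_offsets: Dict[int, int]
) -> Dict[int, int]:
    """Lag per partition: log-end offset minus the consumer's committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition without a committed offset counts from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


def max_lag(lag_by_partition: Dict[int, int]) -> int:
    """Largest lag among the partitions in this snapshot (0 if there are none)."""
    return max(lag_by_partition.values(), default=0)


# Illustrative offsets only.
log_end = {0: 120, 1: 95, 2: 300}
committed = {0: 120, 1: 90, 2: 180}
lag = consumer_lag(log_end, committed)
print(lag, max_lag(lag))
```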
  • 8. KAFKA MONITORING: ZooKeeper
    - zk_outstanding_requests: Clients can end up submitting requests faster than ZooKeeper can process them. If you have a large number of clients, it’s almost a given that this will happen occasionally. To prevent using up all available memory due to queued requests, ZooKeeper will throttle clients if its queue limit is reached.
    - zk_avg_latency: The average request latency is the average time it takes (in milliseconds) for ZooKeeper to respond to a request. ZooKeeper will not respond to a request until it has written the transaction to its transaction log.
    - zk_num_alive_connections: ZooKeeper reports the number of clients connected to it via the zk_num_alive_connections metric. This represents all connections, including connections to non-ZooKeeper nodes.
    - zk_followers (leader only): The number of followers should equal the total size of your ZooKeeper ensemble - 1 (the leader is not included in the follower count).
  • 9. KAFKA MONITORING: ZooKeeper
    - zk_pending_syncs (leader only): The transaction log is the most performance-critical part of ZooKeeper. ZooKeeper must sync transactions to disk before returning a response, so a large number of pending syncs will increase latency across the board.
    - Bytes sent/received (v0.8.x only): Brokers and consumers communicate with ZooKeeper. In large-scale deployments with many consumers and partitions, this constant communication means ZooKeeper could become a bottleneck.
    - Usable memory: ZooKeeper should reside entirely in RAM and will suffer considerably if it must page to disk. Therefore, keeping track of the amount of usable memory is necessary to ensure ZooKeeper performs optimally.
    - Disk latency: Although ZooKeeper should reside in RAM, it still makes use of the filesystem for both periodically snapshotting its current state and for maintaining logs of all transactions.
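The zk_* metrics above are the names ZooKeeper prints in response to its `mntr` four-letter command, one name and value per line. A minimal sketch of parsing that output and applying the follower-count rule, assuming the text has already been fetched from the server. The function names and the default thresholds of 10 are hypothetical values chosen for illustration, and the sample text is made up.

```python
from typing import Dict, List


def parse_mntr(text: str) -> Dict[str, str]:
    """Parse `mntr` output (one 'name<whitespace>value' pair per line)."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1].strip()
    return metrics


def check_zookeeper(
    metrics: Dict[str, str],
    ensemble_size: int,
    max_outstanding: int = 10,    # hypothetical threshold
    max_pending_syncs: int = 10,  # hypothetical threshold
) -> List[str]:
    """Return a list of problems found in one server's metrics."""
    problems = []
    if int(float(metrics.get("zk_outstanding_requests", "0"))) > max_outstanding:
        problems.append("outstanding requests above limit")
    # zk_followers and zk_pending_syncs are only reported by the leader.
    if metrics.get("zk_server_state") == "leader":
        followers = int(float(metrics.get("zk_followers", "0")))
        if followers != ensemble_size - 1:
            problems.append(
                f"expected {ensemble_size - 1} followers, found {followers}"
            )
        if int(float(metrics.get("zk_pending_syncs", "0"))) > max_pending_syncs:
            problems.append("pending syncs above limit")
    return problems


# Made-up sample resembling `mntr` output from the leader of a 3-node ensemble.
SAMPLE = (
    "zk_avg_latency\t1\n"
    "zk_outstanding_requests\t0\n"
    "zk_num_alive_connections\t12\n"
    "zk_server_state\tleader\n"
    "zk_followers\t1\n"
    "zk_pending_syncs\t0\n"
)
metrics = parse_mntr(SAMPLE)
print(check_zookeeper(metrics, ensemble_size=3))
```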
  • 10. MONITORING TOOLS: Open source
    Yahoo Kafka Manager (https://github.com/yahoo/kafka-manager):
    - Manage multiple clusters
    - Easy inspection of cluster state (topics, consumers, offsets, brokers, replica distribution, partition distribution)
    - Run preferred replica election
    - Generate partition assignments with option to select brokers to use
    - Run reassignment of partition (based on generated assignments)
    - Create a topic with optional topic configs
    - Delete topic
    - Topic list now indicates topics marked for deletion (only supported on 0.8.2+)
    - Batch generate partition assignments for multiple topics with option to select brokers to use
    - Batch run reassignment of partition for multiple topics
    - Add partitions to existing topic
    - Update config for existing topic
  • 11. MONITORING TOOLS: Open source
    Yahoo Kafka Manager (https://github.com/yahoo/kafka-manager):
  • 12. MONITORING TOOLS: Open source
    LinkedIn Burrow (https://github.com/linkedin/Burrow):
    Burrow is a monitoring companion for Apache Kafka that provides consumer lag checking as a service without the need for specifying thresholds. It monitors committed offsets for all consumers and calculates the status of those consumers on demand. An HTTP endpoint is provided to request status on demand, as well as provide other Kafka cluster information. There are also configurable notifiers that can send status out via email or HTTP calls to another service.
    - Multiple Kafka Cluster support
    - Automatically monitors all consumers using Kafka-committed offsets
    - Configurable support for Zookeeper-committed offsets
    - Configurable support for Storm-committed offsets
    - HTTP endpoint for consumer group status, as well as broker and consumer information
    - Configurable emailer for sending alerts for specific groups
    - Configurable HTTP client for sending alerts to another system for all groups
  • 13. MONITORING TOOLS: Open source
    LinkedIn Burrow (https://github.com/linkedin/Burrow):
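Checking consumer lag "without the need for specifying thresholds" means judging a consumer by how its offsets and lag move over a window of recent commits, not by comparing lag to a fixed number. The sketch below is a simplified illustration of that idea, loosely modeled on the lag evaluation rules described in Burrow's documentation. It is not Burrow's code, it covers only three of its states, and the function name and the `(committed_offset, lag)` sample format are assumptions made here.

```python
from typing import List, Tuple

# One sample per offset commit, oldest first: (committed_offset, lag).
Sample = Tuple[int, int]


def evaluate_partition(window: List[Sample]) -> str:
    """Classify one partition as OK, WARN or STALL from a window of commits."""
    if len(window) < 2:
        raise ValueError("need at least two samples to evaluate a trend")
    offsets = [offset for offset, _ in window]
    lags = [lag for _, lag in window]

    # The consumer caught up at some point in the window: healthy.
    if any(lag == 0 for lag in lags):
        return "OK"

    lag_not_decreasing = all(b >= a for a, b in zip(lags, lags[1:]))

    # Offsets never moved while lag stayed flat or grew: consumer is stuck.
    if len(set(offsets)) == 1 and lag_not_decreasing:
        return "STALL"

    # Offsets advance, but the consumer keeps falling behind.
    if offsets[-1] > offsets[0] and lag_not_decreasing:
        return "WARN"

    return "OK"


# Illustrative windows only.
print(evaluate_partition([(10, 5), (20, 0), (30, 4)]))
print(evaluate_partition([(10, 5), (10, 5), (10, 8)]))
print(evaluate_partition([(10, 5), (20, 6), (30, 9)]))
```

No absolute lag value appears anywhere in the rules: a consumer with a lag of 10,000 that is steadily shrinking is treated as healthy, while one with a lag of 5 that never commits is flagged.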
  • 14. MONITORING TOOLS: Open source
    KafDrop (https://github.com/HomeAdvisor/Kafdrop):
    Kafdrop is a UI for monitoring Apache Kafka clusters. The tool displays information such as brokers, topics, and partitions, and even lets you view messages. It is a lightweight application that runs on Spring Boot and requires very little configuration.
  • 15. MONITORING TOOLS: Open source
    LinkedIn’s Kafka Monitor (https://github.com/linkedin/kafka-monitor):
    Kafka Monitor is a framework to implement and execute long-running Kafka system tests in a real cluster. It complements Kafka’s existing system tests by capturing potential bugs or regressions that are only likely to occur after a prolonged period of time or with low probability. Moreover, it allows you to monitor a Kafka cluster using end-to-end pipelines to obtain a number of derived vital stats such as end-to-end latency, service availability and message loss rate. You can easily deploy Kafka Monitor to test and monitor your Kafka cluster without requiring any change to your application.
    Kafka Monitor can automatically create the monitor topic with the specified config and increase the partition count of the monitor topic to ensure partition# >= broker#. It can also reassign partitions and trigger preferred leader election to ensure that each broker acts as leader of at least one partition of the monitor topic. This allows Kafka Monitor to detect performance issues on every broker without requiring users to manually manage the partition assignment of the monitor topic.
  • 16. MONITORING TOOLS: Open source
    LinkedIn’s Kafka Monitor (https://github.com/linkedin/kafka-monitor):
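The derived stats mentioned for Kafka Monitor (end-to-end latency and message loss rate) can be computed by comparing what a test producer sent with what a test consumer received. The following is a minimal sketch of that arithmetic only, not of Kafka Monitor itself. It assumes each monitoring record carries a sequence number and that send and receive timestamps are recorded in milliseconds; the function name and data layout are hypothetical.

```python
from typing import Dict, Tuple


def end_to_end_stats(
    produced: Dict[int, int], consumed: Dict[int, int]
) -> Tuple[float, float]:
    """Return (loss_rate, avg_latency_ms).

    produced maps sequence number -> send timestamp in ms.
    consumed maps sequence number -> receive timestamp in ms.
    """
    if not produced:
        raise ValueError("no records were produced")
    delivered = [seq for seq in produced if seq in consumed]
    loss_rate = 1 - len(delivered) / len(produced)
    if delivered:
        total_latency = sum(consumed[seq] - produced[seq] for seq in delivered)
        avg_latency_ms = total_latency / len(delivered)
    else:
        avg_latency_ms = float("nan")  # nothing arrived, latency is undefined
    return loss_rate, avg_latency_ms


# Illustrative timestamps only: record 2 never arrives.
sent = {0: 1000, 1: 2000, 2: 3000, 3: 4000}
received = {0: 1010, 1: 2030, 3: 4020}
loss_rate, avg_latency_ms = end_to_end_stats(sent, received)
print(loss_rate, avg_latency_ms)
```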
  • 17. MONITORING TOOLS: Open source
    Grafana + Prometheus (https://github.com/grafana/grafana, https://github.com/prometheus/prometheus):
    - Grafana: Open source, feature-rich metrics dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus and InfluxDB.
    - Prometheus: Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
    Demo Kafka repository (with Slack integration): https://github.com/lucrussell/slack-chatops
  • 18. MONITORING TOOLS: Open source
    Grafana + Prometheus:
  • 19. MONITORING TOOLS: Open source
    ELK (https://github.com/elastic/elasticsearch, https://github.com/elastic/logstash, https://github.com/elastic/kibana):
    - Elasticsearch: Distributed RESTful search engine built for the cloud.
    - Logstash: Logstash is part of the Elastic Stack along with Beats, Elasticsearch and Kibana. Logstash is a server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash."
    - Kibana: Window into the Elastic Stack. Specifically, it's a browser-based analytics and search dashboard for Elasticsearch.
  • 20. MONITORING TOOLS: Open source
    ELK:
  • 21. LANDOOP STACK
    Landoop Lenses: Lenses is an enterprise-grade product that provides faster streaming application deliveries and data flow management that natively integrates over Apache Kafka. Lenses supports the core elements of Kafka with a rich user interface, endpoints and vital enterprise capabilities that enable engineering and data teams to query real-time data, and to create and monitor Kafka topologies with rich integrations with other systems.
    Fast Data-dev: Running a demo development environment:
    $ docker run --rm --net=host landoop/fast-data-dev
  • 22. LANDOOP STACK
    Landoop Lenses:
  • 23. CONFLUENT STACK
    Confluent Platform: Streaming platform that enables you to organize and manage data from many different sources with one reliable, high-performance system.
    Bundle:
    - Kafka Connectors
    - Kafka Clients
    - Schema Registry
    - REST Proxy
    Enterprise:
    - Automatic Data Balancing
    - Multi Datacenter Replication
    - Confluent Control Center
    - JMS Client
  • 24. CONFLUENT STACK
    Confluent Control Center: Confluent Control Center is a GUI-based system for managing and monitoring Apache Kafka. It allows you to easily manage Kafka Connect, and to create, edit, and manage connections to other systems. It also allows you to monitor data streams from producer to consumer, assuring that every message is delivered, and measuring how long it takes to deliver messages. Using Control Center, you can build a production data pipeline based on Apache Kafka without writing a line of code. Control Center can also define alerts on the latency and completeness statistics of data streams, which can be delivered by email or queried from a centralized alerting system.
  • 25. CONFLUENT STACK
    Confluent Control Center: