3. We want to know
Runtime of certain parts of the system
Data throughput
Performance bottlenecks
4. Why we want to do that
Detect suddenly dropping throughput
Detect suddenly longer-running jobs/requests
Explore performance trends
See the performance impact of new implementations
5.
6. To achieve that
Collect Performance Metrics, Aggregate and Visualise them
Easy in Monolithic Applications
More difficult in Distributed Applications
7. Distributed Applications
Metrics have to be collected from many hosts
Distributed contexts have to be handled
Data has to be aggregated (in the right order) and visualised
—> Distributed Tracing Systems,
first described in Google's Dapper paper
Popular implementations are OpenZipkin and Jaeger (Uber)
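The core idea behind Dapper-style tracing can be sketched in a few lines: every unit of work becomes a span carrying the trace id of its root request and the span id of its parent, so spans collected from many hosts can later be stitched back into one tree. The class below is a hypothetical illustration, not the Zipkin or Jaeger API.

```java
import java.util.UUID;

// Minimal sketch of a Dapper-style trace context (illustrative names,
// not the OpenZipkin/Jaeger API).
public class TraceContext {
    public final String traceId;  // shared by all spans of one request
    public final String spanId;   // unique per unit of work
    public final String parentId; // span id of the caller, null for the root

    private TraceContext(String traceId, String spanId, String parentId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
    }

    // Root span: starts a new trace.
    public static TraceContext newTrace() {
        return new TraceContext(newId(), newId(), null);
    }

    // Child span: same trace id, fresh span id, parent link to the caller.
    public TraceContext childSpan() {
        return new TraceContext(traceId, newId(), spanId);
    }

    private static String newId() {
        return UUID.randomUUID().toString();
    }
}
```

In a real system the context is serialised into message headers when crossing JVM boundaries; that serialisation is what the tracing libraries handle for you.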
8. Let's collect some metrics
Business-, Application- and System-Metrics
Application- and System-Metrics via JMX
Business-Metrics via Code Instrumentation (DropWizard, kamon.io)
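The two collection paths can be sketched side by side: system metrics come almost for free via JMX MXBeans, while business metrics need explicit instrumentation in the code. The class below is a self-contained stdlib sketch; the hand-rolled counter stands in for what Dropwizard's Meter or kamon.io provide out of the box.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the two collection paths (class names are illustrative).
public class MetricsSketch {

    // System metric via JMX: current heap usage in bytes.
    public static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Business metric via instrumentation: a tiny event counter,
    // the kind of thing a metrics library gives you as a Meter.
    public static class Counter {
        private final AtomicLong count = new AtomicLong();

        public void mark() { count.incrementAndGet(); }

        public long getCount() { return count.get(); }
    }
}
```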
9. and persist the metrics
A good idea is to use a
time-series database (InfluxDB, Graphite)
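InfluxDB, for example, accepts writes in its plain-text line protocol: `measurement,tag=value field=value timestamp`. A sketch of building such a line (the measurement and tag names here are made up):

```java
// Builds a single InfluxDB line-protocol entry. Illustrative only:
// a real writer would escape special characters and batch lines.
public class LineProtocol {
    public static String line(String measurement,
                              String tagKey, String tagValue,
                              String fieldKey, double fieldValue,
                              long timestampNanos) {
        return measurement + "," + tagKey + "=" + tagValue + " "
             + fieldKey + "=" + fieldValue + " " + timestampNanos;
    }
}
```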
10. Visualisation is key
Make insights accessible by
visualising data and configuring alerts
(e.g. Grafana, Graphite, Chronograf)
11.
12. Our System
Java Application
Consists of several independent batch jobs
Most batches process data
Some batches use external asynchronous services to enrich data
(response times range from seconds to weeks)
Runs in a distributed environment
13. Our Requirements
Runtime of single methods
Batch runtime
Business-process duration (spanning multiple JVMs)
Attach runtime parameters to the metrics
Measure data throughput
And
Low Code Impact
Metric collection should be decoupled and not harm the system
Visualisation should be awesome
14. Implementation
Own metrics library with two kinds of metrics:
Simple Metric, which measures the runtime of single methods
Distributed Metric, which spans multiple JVMs
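The Simple Metric idea can be sketched with try-with-resources: the timing wrapper reports the elapsed runtime on close, keeping the measurement code out of the business logic (the low-code-impact requirement above). This is a hypothetical illustration, not the actual library.

```java
import java.util.function.LongConsumer;

// Illustrative sketch of a "Simple Metric": times a block of code and
// hands the elapsed nanoseconds to a reporter callback on close.
public class SimpleMetric implements AutoCloseable {
    private final long start = System.nanoTime();
    private final LongConsumer reporter;

    public SimpleMetric(LongConsumer reporter) {
        this.reporter = reporter;
    }

    @Override
    public void close() {
        reporter.accept(System.nanoTime() - start);
    }
}

// Usage:
// try (SimpleMetric m = new SimpleMetric(nanos -> registry.record("job", nanos))) {
//     runBatch(); // the code being measured
// }
```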
18. Learnings
There is no free lunch
Start with your Dashboard
Find the right audience
Choose the right level of measurement
You will produce lots of data
Measure as much as you can; you don't know what you will need (Coda Hale)