SlideShare a Scribd company logo
1 of 27
Micro-metrics to forecast performance
tsunamis
Ram Lakshmanan – Architect – GCeasy.io, fastThread.io, HeapHero.io
2019
About me
• Troubleshooter
Even unpredictable weather is forecasted
Can we forecast our application’s availability & performance? Even for
next 20 minutes can we forecast?
Macro-metrics
3. Memory
2. CPU Utilization
1. Response Time
Can’t forecast problems
Primary Macro-metrics that are monitored today
1: GC Throughput
2: GC Latency
3: Object Creation Rate
Garbage collection related micrometrics
Garbage collection log analysis
Real world problem 1: OutOfMemoryError
Happened in major a financial institution
Repeated Full GCs happens here
OutOfMemoryError happens here
99.75%
1. Garbage Collection Throughput
Productive work vs non-productive work
Percentage of time spent in processing customer transactions vs time
spent in GC activity.
Previous release
2. Garbage Collection Latency
Amount of time application is paused for doing Garbage Collection
100 mb/sec 300 mb/sec
3. Object Creation rate
Amount of objects created in a unit of time
Enable Garbage Collection Logs
Till Java 8:
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file-path>
From Java 9:
-Xlog:gc*:file=<file-path>
How to source memory related micrometrics?
Tools
GCeasy.io
IBM Pattern Modeling and Analysis Tool for Java Garbage Collector
HP Jmeter
Google Garbage Cat
API
https://blog.gceasy.io/2016/06/18/garbage-collection-log-analysis-api/
1. GC Throughput
2. GC Latency
3. Object Creation rate
1: Thread States & Count
2: Thread Groups
3: Thread Execution Patterns
Thread related micrometrics
Real world problem 2: Application became unresponsive
Happened in major financial institution
Real world problem 3: Response time started to degrade
Happened in major travel company
1. Thread Count & Thread States
Number of Threads and their states
2. Thread Groups
Thread groups and their size
3. Thread with same execution patterns
Concentration of threads with same stack trace may be indication of brewing problem
How to capture thread dumps?
jstack
jmc
jcmd
https://blog.fastthread.io/2016/06/06/how-to-take-thread-dumps-7-options/
How to source thread related micrometrics?
Tools
fastThread.io
Custom Scripting
API
https://blog.fastthread.io/2016/10/27/thread-dump-analysis-api/
4. Thread count &
States
2. GC Latency
3. GC Throughput
1. Thread states
2. Thread Groups
3. Thread execution
patterns
1: TCP/IP Connections
2: TCP/IP state
3: Open File Descriptors
Network related micrometrics
1. TCP/IP connections
Number of established connection count by host
$ netstat -an | grep ESTABLISHED | grep ‘{HOST}' | wc –l
$ netstat -an | grep ESTABLISHED | grep '162.187.223.11' | wc -l
500
TCP/IP connections
2000
TCP/IP connections
1: LISTEN
2: SYN-SENT
3: SYN-RECEIVED
2. TCP/IP states
Number of connection by each state
4: ESTABLISHED
5: FIN-WAIT-1
6: FIN-WAIT-2
7: CLOSE-WAIT
8: CLOSING
9: LAST-ACK8: TIME_WAIT
10: CLOSED
$ netstat -an | grep 'TIME_WAIT' | wc -l
3. File Descriptors
Open File Handles, Network Connections, Pipes
lsof -p {PROCESS-ID} | wc -l
lsof -p 5666 | wc -l
500
File Descriptors
2000
File Descriptors
1: IOPS
2: Storage Throughput
3: Storage Latency
Storage related micrometrics
Each IO request will take some time to complete, this is called the average latency.
This latency is measured in milliseconds and should be as low as possible.
Storage throughput is basically common value that shows how much a storage can
deliver. This number is usually expressed in Megabytes / Second (MB/s), example
140 MB/s.
IO operations per second, which means the amount of read or write operations that
could be done in one second time.
1: DB – Locks
2: Long running queries
3: Evictions of in-memory tables
DB related micrometrics
4: Hits/miss ratio
CI/CD
Performance Tests
Production
1. CI/CD
Fail the build if any of
the micrometrics
thresholds are breached
2. Performance Tests
Examine micrometrics in
each release
3. Production
You can use this
micrometrics for
production monitoring
Where?
Where to use these metrics?
How Tsunamis are detected?
Deep-ocean Assessment and Reporting of Tsunamis DART II
6000m = 20 x Eiffel tower 5 cm = height of credit card
6000m
Even if 5 cm change, alerts triggered
Thank You
White Paper:
https://blog.gceasy.io/2019/03/13/micrometrics-to-forecast-application-performance/
ram@tier1app.com
+1.415.948.5431
@tier1app

More Related Content

What's hot

How & why-memory-efficient?
How & why-memory-efficient?How & why-memory-efficient?
How & why-memory-efficient?Tier1 app
 
GC Tuning & Troubleshooting Crash Course
GC Tuning & Troubleshooting Crash CourseGC Tuning & Troubleshooting Crash Course
GC Tuning & Troubleshooting Crash CourseTier1 app
 
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018Tier1app
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applicationsTier1 app
 
Become a Garbage Collection Hero
Become a Garbage Collection HeroBecome a Garbage Collection Hero
Become a Garbage Collection HeroTier1app
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC HeroTier1app
 
Don't dump thread dumps
Don't dump thread dumpsDon't dump thread dumps
Don't dump thread dumpsTier1app
 
Gc crash course (1)
Gc crash course (1)Gc crash course (1)
Gc crash course (1)Tier1 app
 
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupДоклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupBadoo Development
 
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumNine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumAlexey Lesovsky
 
Troubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterTroubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterAlexey Lesovsky
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedAlexey Lesovsky
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).Alexey Lesovsky
 
Refactoring for testability c++
Refactoring for testability c++Refactoring for testability c++
Refactoring for testability c++Dimitrios Platis
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterAlexey Lesovsky
 
Jdk Tools For Performance Diagnostics
Jdk Tools For Performance DiagnosticsJdk Tools For Performance Diagnostics
Jdk Tools For Performance DiagnosticsDror Bereznitsky
 
JDK not so hidden treasures
JDK not so hidden treasuresJDK not so hidden treasures
JDK not so hidden treasuresAndrzej Grzesik
 
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...Ontico
 

What's hot (20)

How & why-memory-efficient?
How & why-memory-efficient?How & why-memory-efficient?
How & why-memory-efficient?
 
GC Tuning & Troubleshooting Crash Course
GC Tuning & Troubleshooting Crash CourseGC Tuning & Troubleshooting Crash Course
GC Tuning & Troubleshooting Crash Course
 
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018
Modern Engineer’s Troubleshooting Tools, Techniques & Tricks at Confoo 2018
 
Lets crash-applications
Lets crash-applicationsLets crash-applications
Lets crash-applications
 
Become a Garbage Collection Hero
Become a Garbage Collection HeroBecome a Garbage Collection Hero
Become a Garbage Collection Hero
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC Hero
 
Don't dump thread dumps
Don't dump thread dumpsDon't dump thread dumps
Don't dump thread dumps
 
Gc crash course (1)
Gc crash course (1)Gc crash course (1)
Gc crash course (1)
 
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupДоклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
 
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL VacuumNine Circles of Inferno or Explaining the PostgreSQL Vacuum
Nine Circles of Inferno or Explaining the PostgreSQL Vacuum
 
Principios básicos de Garbage Collector en Java
Principios básicos de Garbage Collector en JavaPrincipios básicos de Garbage Collector en Java
Principios básicos de Garbage Collector en Java
 
Troubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenterTroubleshooting PostgreSQL with pgCenter
Troubleshooting PostgreSQL with pgCenter
 
GitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons LearnedGitLab PostgresMortem: Lessons Learned
GitLab PostgresMortem: Lessons Learned
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
 
Refactoring for testability c++
Refactoring for testability c++Refactoring for testability c++
Refactoring for testability c++
 
JEEConf. Vanilla java
JEEConf. Vanilla javaJEEConf. Vanilla java
JEEConf. Vanilla java
 
Managing PostgreSQL with PgCenter
Managing PostgreSQL with PgCenterManaging PostgreSQL with PgCenter
Managing PostgreSQL with PgCenter
 
Jdk Tools For Performance Diagnostics
Jdk Tools For Performance DiagnosticsJdk Tools For Performance Diagnostics
Jdk Tools For Performance Diagnostics
 
JDK not so hidden treasures
JDK not so hidden treasuresJDK not so hidden treasures
JDK not so hidden treasures
 
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
 

Similar to Micro-metrics to forecast performance tsunamis

predicting-m3-devopsconMunich-2023-v2.pptx
predicting-m3-devopsconMunich-2023-v2.pptxpredicting-m3-devopsconMunich-2023-v2.pptx
predicting-m3-devopsconMunich-2023-v2.pptxTier1 app
 
Micrometrics to forecast performance tsunamis
Micrometrics to forecast performance tsunamisMicrometrics to forecast performance tsunamis
Micrometrics to forecast performance tsunamisTier1app
 
5th KuVS Meeting
5th KuVS Meeting5th KuVS Meeting
5th KuVS Meetingsteccami
 
Loophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in ChromeLoophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in Chromecgvwzq
 
Samza at LinkedIn
Samza at LinkedInSamza at LinkedIn
Samza at LinkedInVenu Ryali
 
PyConUK 2018 - Journey from HTTP to gRPC
PyConUK 2018 - Journey from HTTP to gRPCPyConUK 2018 - Journey from HTTP to gRPC
PyConUK 2018 - Journey from HTTP to gRPCTatiana Al-Chueyr
 
Primer to Browser Netwroking
Primer to Browser NetwrokingPrimer to Browser Netwroking
Primer to Browser NetwrokingShuya Osaki
 
Tutorial mikrotik step by step anung muhandanu
Tutorial mikrotik step by step  anung muhandanu Tutorial mikrotik step by step  anung muhandanu
Tutorial mikrotik step by step anung muhandanu Alessandro De Suoodh
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving ForwardON.Lab
 
stream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzastream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzaAbhishek Shivanna
 
1404 app dev series - session 8 - monitoring & performance tuning
1404   app dev series - session 8 - monitoring & performance tuning1404   app dev series - session 8 - monitoring & performance tuning
1404 app dev series - session 8 - monitoring & performance tuningMongoDB
 
Leveraging open source tools to gain insight into OpenStack Swift
Leveraging open source tools to gain insight into OpenStack SwiftLeveraging open source tools to gain insight into OpenStack Swift
Leveraging open source tools to gain insight into OpenStack SwiftDmitry Sotnikov
 
Timing attacks against web applications: Are they still practical?
Timing attacks against web applications: Are they still practical?Timing attacks against web applications: Are they still practical?
Timing attacks against web applications: Are they still practical?DefCamp
 
Chapter2 l2 modified_um
Chapter2 l2 modified_umChapter2 l2 modified_um
Chapter2 l2 modified_umSajid Baloch
 
JDO 2019: Service mesh with Istio - Mariusz Gil
JDO 2019: Service mesh with Istio - Mariusz GilJDO 2019: Service mesh with Istio - Mariusz Gil
JDO 2019: Service mesh with Istio - Mariusz GilPROIDEA
 
Leveraging the power of SolrCloud and Spark with OpenShift
Leveraging the power of SolrCloud and Spark with OpenShiftLeveraging the power of SolrCloud and Spark with OpenShift
Leveraging the power of SolrCloud and Spark with OpenShiftQAware GmbH
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
 
Chat app case study - xmpp vs SIP
Chat app case study - xmpp vs SIPChat app case study - xmpp vs SIP
Chat app case study - xmpp vs SIPGenora Infotech
 
Guidelhhghghine document final
Guidelhhghghine document finalGuidelhhghghine document final
Guidelhhghghine document finalnanirao686
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analyticsamesar0
 

Similar to Micro-metrics to forecast performance tsunamis (20)

predicting-m3-devopsconMunich-2023-v2.pptx
predicting-m3-devopsconMunich-2023-v2.pptxpredicting-m3-devopsconMunich-2023-v2.pptx
predicting-m3-devopsconMunich-2023-v2.pptx
 
Micrometrics to forecast performance tsunamis
Micrometrics to forecast performance tsunamisMicrometrics to forecast performance tsunamis
Micrometrics to forecast performance tsunamis
 
5th KuVS Meeting
5th KuVS Meeting5th KuVS Meeting
5th KuVS Meeting
 
Loophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in ChromeLoophole: Timing Attacks on Shared Event Loops in Chrome
Loophole: Timing Attacks on Shared Event Loops in Chrome
 
Samza at LinkedIn
Samza at LinkedInSamza at LinkedIn
Samza at LinkedIn
 
PyConUK 2018 - Journey from HTTP to gRPC
PyConUK 2018 - Journey from HTTP to gRPCPyConUK 2018 - Journey from HTTP to gRPC
PyConUK 2018 - Journey from HTTP to gRPC
 
Primer to Browser Netwroking
Primer to Browser NetwrokingPrimer to Browser Netwroking
Primer to Browser Netwroking
 
Tutorial mikrotik step by step anung muhandanu
Tutorial mikrotik step by step  anung muhandanu Tutorial mikrotik step by step  anung muhandanu
Tutorial mikrotik step by step anung muhandanu
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving Forward
 
stream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzastream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samza
 
1404 app dev series - session 8 - monitoring & performance tuning
1404   app dev series - session 8 - monitoring & performance tuning1404   app dev series - session 8 - monitoring & performance tuning
1404 app dev series - session 8 - monitoring & performance tuning
 
Leveraging open source tools to gain insight into OpenStack Swift
Leveraging open source tools to gain insight into OpenStack SwiftLeveraging open source tools to gain insight into OpenStack Swift
Leveraging open source tools to gain insight into OpenStack Swift
 
Timing attacks against web applications: Are they still practical?
Timing attacks against web applications: Are they still practical?Timing attacks against web applications: Are they still practical?
Timing attacks against web applications: Are they still practical?
 
Chapter2 l2 modified_um
Chapter2 l2 modified_umChapter2 l2 modified_um
Chapter2 l2 modified_um
 
JDO 2019: Service mesh with Istio - Mariusz Gil
JDO 2019: Service mesh with Istio - Mariusz GilJDO 2019: Service mesh with Istio - Mariusz Gil
JDO 2019: Service mesh with Istio - Mariusz Gil
 
Leveraging the power of SolrCloud and Spark with OpenShift
Leveraging the power of SolrCloud and Spark with OpenShiftLeveraging the power of SolrCloud and Spark with OpenShift
Leveraging the power of SolrCloud and Spark with OpenShift
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Chat app case study - xmpp vs SIP
Chat app case study - xmpp vs SIPChat app case study - xmpp vs SIP
Chat app case study - xmpp vs SIP
 
Guidelhhghghine document final
Guidelhhghghine document finalGuidelhhghghine document final
Guidelhhghghine document final
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analytics
 

More from Tier1 app

Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Top-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTier1 app
 
Top-5-production-devconMunich-2023.pptx
Top-5-production-devconMunich-2023.pptxTop-5-production-devconMunich-2023.pptx
Top-5-production-devconMunich-2023.pptxTier1 app
 
predicting-m3-devopsconMunich-2023.pptx
predicting-m3-devopsconMunich-2023.pptxpredicting-m3-devopsconMunich-2023.pptx
predicting-m3-devopsconMunich-2023.pptxTier1 app
 
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...Tier1 app
 
7-JVM-arguments-JaxLondon-2023.pptx
7-JVM-arguments-JaxLondon-2023.pptx7-JVM-arguments-JaxLondon-2023.pptx
7-JVM-arguments-JaxLondon-2023.pptxTier1 app
 
Top-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptxTop-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptxTier1 app
 
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLETier1 app
 
MAJOR OUTAGES IN MAJOR ENTERPRISES
MAJOR OUTAGES IN MAJOR ENTERPRISESMAJOR OUTAGES IN MAJOR ENTERPRISES
MAJOR OUTAGES IN MAJOR ENTERPRISESTier1 app
 
KnowAPIs-UnknownPerf-confoo-2023 (1).pptx
KnowAPIs-UnknownPerf-confoo-2023 (1).pptxKnowAPIs-UnknownPerf-confoo-2023 (1).pptx
KnowAPIs-UnknownPerf-confoo-2023 (1).pptxTier1 app
 
memory-patterns-confoo-2023.pptx
memory-patterns-confoo-2023.pptxmemory-patterns-confoo-2023.pptx
memory-patterns-confoo-2023.pptxTier1 app
 
this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxTier1 app
 
millions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxmillions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxTier1 app
 
lets-crash-apps-jax-2022.pptx
lets-crash-apps-jax-2022.pptxlets-crash-apps-jax-2022.pptx
lets-crash-apps-jax-2022.pptxTier1 app
 
‘16 artifacts’ to capture when there is a production problem
‘16 artifacts’ to capture when there is a production problem‘16 artifacts’ to capture when there is a production problem
‘16 artifacts’ to capture when there is a production problemTier1 app
 
Jvm internals-1-slide
Jvm internals-1-slideJvm internals-1-slide
Jvm internals-1-slideTier1 app
 
Accelerating Incident Response To Production Outages
Accelerating Incident Response To Production OutagesAccelerating Incident Response To Production Outages
Accelerating Incident Response To Production OutagesTier1 app
 
Top feedbacks
Top feedbacksTop feedbacks
Top feedbacksTier1 app
 
Shooting the troubles: Crashes, Slowdowns, CPU Spikes
Shooting the troubles: Crashes, Slowdowns, CPU SpikesShooting the troubles: Crashes, Slowdowns, CPU Spikes
Shooting the troubles: Crashes, Slowdowns, CPU SpikesTier1 app
 

More from Tier1 app (19)

Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Top-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptx
 
Top-5-production-devconMunich-2023.pptx
Top-5-production-devconMunich-2023.pptxTop-5-production-devconMunich-2023.pptx
Top-5-production-devconMunich-2023.pptx
 
predicting-m3-devopsconMunich-2023.pptx
predicting-m3-devopsconMunich-2023.pptxpredicting-m3-devopsconMunich-2023.pptx
predicting-m3-devopsconMunich-2023.pptx
 
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...
Predicting Production Outages: Unleashing the Power of Micro-Metrics – ADDO C...
 
7-JVM-arguments-JaxLondon-2023.pptx
7-JVM-arguments-JaxLondon-2023.pptx7-JVM-arguments-JaxLondon-2023.pptx
7-JVM-arguments-JaxLondon-2023.pptx
 
Top-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptxTop-5-Performance-JaxLondon-2023.pptx
Top-5-Performance-JaxLondon-2023.pptx
 
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE
16 ARTIFACTS TO CAPTURE WHEN YOUR CONTAINER APPLICATION IS IN TROUBLE
 
MAJOR OUTAGES IN MAJOR ENTERPRISES
MAJOR OUTAGES IN MAJOR ENTERPRISESMAJOR OUTAGES IN MAJOR ENTERPRISES
MAJOR OUTAGES IN MAJOR ENTERPRISES
 
KnowAPIs-UnknownPerf-confoo-2023 (1).pptx
KnowAPIs-UnknownPerf-confoo-2023 (1).pptxKnowAPIs-UnknownPerf-confoo-2023 (1).pptx
KnowAPIs-UnknownPerf-confoo-2023 (1).pptx
 
memory-patterns-confoo-2023.pptx
memory-patterns-confoo-2023.pptxmemory-patterns-confoo-2023.pptx
memory-patterns-confoo-2023.pptx
 
this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptx
 
millions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxmillions-gc-jax-2022.pptx
millions-gc-jax-2022.pptx
 
lets-crash-apps-jax-2022.pptx
lets-crash-apps-jax-2022.pptxlets-crash-apps-jax-2022.pptx
lets-crash-apps-jax-2022.pptx
 
‘16 artifacts’ to capture when there is a production problem
‘16 artifacts’ to capture when there is a production problem‘16 artifacts’ to capture when there is a production problem
‘16 artifacts’ to capture when there is a production problem
 
Jvm internals-1-slide
Jvm internals-1-slideJvm internals-1-slide
Jvm internals-1-slide
 
Accelerating Incident Response To Production Outages
Accelerating Incident Response To Production OutagesAccelerating Incident Response To Production Outages
Accelerating Incident Response To Production Outages
 
Top feedbacks
Top feedbacksTop feedbacks
Top feedbacks
 
Shooting the troubles: Crashes, Slowdowns, CPU Spikes
Shooting the troubles: Crashes, Slowdowns, CPU SpikesShooting the troubles: Crashes, Slowdowns, CPU Spikes
Shooting the troubles: Crashes, Slowdowns, CPU Spikes
 

Recently uploaded

Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptxJonalynLegaspi2
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxMichelleTuguinay1
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Production of Monoclonal Antibodies by Hybridoma Technology.pptx
Production of Monoclonal Antibodies by Hybridoma Technology.pptxProduction of Monoclonal Antibodies by Hybridoma Technology.pptx
Production of Monoclonal Antibodies by Hybridoma Technology.pptxAnupkumar Sharma
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 

Recently uploaded (20)

YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
week 1 cookery 8 fourth - quarter .pptx
week 1 cookery 8  fourth  -  quarter .pptxweek 1 cookery 8  fourth  -  quarter .pptx
week 1 cookery 8 fourth - quarter .pptx
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Production of Monoclonal Antibodies by Hybridoma Technology.pptx
Production of Monoclonal Antibodies by Hybridoma Technology.pptxProduction of Monoclonal Antibodies by Hybridoma Technology.pptx
Production of Monoclonal Antibodies by Hybridoma Technology.pptx
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 

Micro-metrics to forecast performance tsunamis

  • 1. Micro-metrics to forecast performance tsunamis Ram Lakshmanan – Architect – GCeasy.io, fastThread.io, HeapHero.io 2019
  • 3. Even unpredictable weather is forecasted Can we forecast our application’s availability & performance? Even for next 20 minutes can we forecast?
  • 4. Macro-metrics 3. Memory 2. CPU Utilization 1. Response Time Can’t forecast problems Primary Macro-metrics that are monitored today
  • 5. 1: GC Throughput 2: GC Latency 3: Object Creation Rate Garbage collection related micrometrics
  • 7. Real world problem 1: OutOfMemoryError Happened in major a financial institution Repeated Full GCs happens here OutOfMemoryError happens here
  • 8. 99.75% 1. Garbage Collection Throughput Productive work vs non-productive work Percentage of time spent in processing customer transactions vs time spent in GC activity.
  • 9. Previous release 2. Garbage Collection Latency Amount of time application is paused for doing Garbage Collection
  • 10. 100 mb/sec 300 mb/sec 3. Object Creation rate Amount of objects created in a unit of time
  • 11. Enable Garbage Collection Logs Till Java 8: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file-path> From Java 9: -Xlog:gc*:file=<file-path> How to source memory related micrometrics? Tools GCeasy.io IBM Pattern Modeling and Analysis Tool for Java Garbage Collector HP Jmeter Google Garbage Cat API https://blog.gceasy.io/2016/06/18/garbage-collection-log-analysis-api/ 1. GC Throughput 2. GC Latency 3. Object Creation rate
  • 12. 1: Thread States & Count 2: Thread Groups 3: Thread Execution Patterns Thread related micrometrics
  • 13. Real world problem 2: Application became unresponsive Happened in major financial institution
  • 14. Real world problem 3: Response time started to degrade Happened in major travel company
  • 15. 1. Thread Count & Thread States Number of Threads and their states
  • 16. 2. Thread Groups Thread groups and their size
  • 17. 3. Thread with same execution patterns Concentration of threads with same stack trace may be indication of brewing problem
  • 18. How to capture thread dumps? jstack jmc jcmd https://blog.fastthread.io/2016/06/06/how-to-take-thread-dumps-7-options/ How to source thread related micrometrics? Tools fastThread.io Custom Scripting API https://blog.fastthread.io/2016/10/27/thread-dump-analysis-api/ 4. Thread count & States 2. GC Latency 3. GC Throughput 1. Thread states 2. Thread Groups 3. Thread execution patterns
  • 19. 1: TCP/IP Connections 2: TCP/IP state 3: Open File Descriptors Network related micrometrics
  • 20. 1. TCP/IP connections Number of established connection count by host $ netstat -an | grep ESTABLISHED | grep ‘{HOST}' | wc –l $ netstat -an | grep ESTABLISHED | grep '162.187.223.11' | wc -l 500 TCP/IP connections 2000 TCP/IP connections
  • 21. 1: LISTEN 2: SYN-SENT 3: SYN-RECEIVED 2. TCP/IP states Number of connection by each state 4: ESTABLISHED 5: FIN-WAIT-1 6: FIN-WAIT-2 7: CLOSE-WAIT 8: CLOSING 9: LAST-ACK8: TIME_WAIT 10: CLOSED $ netstat -an | grep 'TIME_WAIT' | wc -l
  • 22. 3. File Descriptors Open File Handles, Network Connections, Pipes lsof -p {PROCESS-ID} | wc -l lsof -p 5666 | wc -l 500 File Descriptors 2000 File Descriptors
  • 23. 1: IOPS 2: Storage Throughput 3: Storage Latency Storage related micrometrics Each IO request will take some time to complete, this is called the average latency. This latency is measured in milliseconds and should be as low as possible. Storage throughput is basically common value that shows how much a storage can deliver. This number is usually expressed in Megabytes / Second (MB/s), example 140 MB/s. IO operations per second, which means the amount of read or write operations that could be done in one second time.
  • 24. 1: DB – Locks 2: Long running queries 3: Evictions of in-memory tables DB related micrometrics 4: Hits/miss ratio
  • 25. CI/CD Performance Tests Production 1. CI/CD Fail the build if any of the micrometrics thresholds are breached 2. Performance Tests Examine micrometrics in each release 3. Production You can use this micrometrics for production monitoring Where? Where to use these metrics?
  • 26. How Tsunamis are detected? Deep-ocean Assessment and Reporting of Tsunamis DART II 6000m = 20 x Eiffel tower 5 cm = height of credit card 6000m Even if 5 cm change, alerts triggered

Editor's Notes

  1. ToDo: Need to come up with better image
  2. https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTkvMDMvMTIvLS1taWNyb21ldHJpY3NUb0ZvcmVjYXN0LnppcC0tMTQtMjgtMzI= Unresponsive Big Data: https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTkvMDMvMTIvLS1FVFRQQkU4LVRocmVhZGR1bXAtMTZfMTJfMTRfMTZfMzEuemlwLS0xNC0zMS01MQ==
  3. Transitive Blocks: https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTkvMDMvMTIvLS10cmFuc2l0aXZlLWJsb2Nrcy0xLnppcC0tMTQtMzMtMzU=