SlideShare a Scribd company logo
1 of 54
Download to read offline
Continuous Profiling in
Production: What, Why and How
Richard Warburton (@richardwarburto)
Sadiq Jaffer (@sadiqj)
https://www.opsian.com
Why Performance Tools Matter
Development isn’t Production
Profiling vs Monitoring
Continuous Profiling
Conclusion
Known Knowns
Known Unknowns
Unknown Unknowns
Why Performance Tools Matter
Development isn’t Production
Profiling vs Monitoring
Continuous Profiling
Conclusion
Development isn’t Production
Performance testing in development can be easier
Tooling often desktop-based
So why not performance test in a development environment?
Unrepresentative Hardware
vs
Unrepresentative Software
Unrepresentative Workloads
vs
The JVM may have very different behaviour in production
Hotspot does adaptive optimisation
Production may optimise differently
Why Performance Tools Matter
Development isn’t Production
Profiling vs Monitoring
Continuous Profiling
Conclusion
Ambient/Passive/System Metrics
Preconfigured Numerical Measure
CPU Time Usage / Page-load Times
Cheap and sometimes effective
Logging
Records arbitrary events emitted by the system being monitored
log4j/slf4j/logback
Logs of GC events
Often manual, aids system understanding, expensive
Coarse Grained Instrumentation
Measures time within some instrumented section of the code
Time spent inside the controller layer of your web-app or performing SQL queries
More detailed and actionable though expensive
Production Profiling
What methods use up CPU time?
What lines of code allocate the most objects?
Where are your CPU Cache misses coming from?
Automatic, can be cheap but often isn’t
Where Instrumentation can be blind in the Real World
Problem: every 5 seconds an HTTP endpoint would be really slow.
Instrumentation: on the servlet request, didn’t even show the pause!
Cause: Tomcat expired its resources cache every 5 seconds, on load one resource
scanned the entire classpath
Surely a better way?
Not just Metrics - Actionable Insights
Diagnostics aren’t Diagnosis
What about Profiling?
Why Performance Tools Matter
Development isn’t Production
Profiling vs Monitoring
Continuous Profiling
Conclusion
How to use Continuous Profilers
1) Extract relevant time period and apps/machines
2) Choose a type of profile: CPU Time/Wallclock Time/Memory
3) View results to tell you what the dominant consumer of a resource is
4) Fix biggest bottleneck
5) Deploy / Iterate
CPU Time vs Wallclock Time
You need both CPU Time and Wallclock Time
CPU - Diagnose expensive computational hotspots and inefficient algorithms
Spot code that should not be executing but is ...
Wallclock - Diagnose blocking that stops CPU usage
e.g blocking on external IO and lock contention issues
Profiling Hotspots
Profiling Flamegraphs
Instrumenting Profilers
Add instructions to collect timings (Eg: JVisualVM Profiler)
Inaccurate - modifies the behaviour of the program
High Overhead - > 2x slower
Sampling/Statistical Profilers
WebServerThread.run()
Controller.doSomething() Controller.next()
Repo.readPerson()
new Person()
View.printHtml() ??? ???
Safepoints
Mechanism for bringing Java application threads to a halt
Safepoint polls added to compiled code read known memory location
Protecting memory page triggers a segfault and suspends threads
Safepoint Bias
WebServerThread.run
Controller.doSomething Controller.next()
Repo.readPerson
new Person
View.printHtml ???
Safepoint Bias after Inlining
Repo.readPerson
new Person
View.printHtml ??? ???
WebServerThread.run
Controller.doSomething Controller.next()
Time to Safepoint
-XX:+PrintSafepointStatistics
Threads
Safepoint poll
VMOperation
Statistical Profiling in Java
Problem: getAllStackTraces is expensive to do frequently and inaccurate, also only
gives us Wallclock time
Need ways to:
1. Interrupt application
2. Sample resource of interest
Advanced Statistical Profiling in Java
● Interrupt with OS signals
○ Delivered to handler on only one thread
○ Lightweight
● Sample resource of interest
○ Use AsyncGetCallTrace to sample stack
○ Examine JVM internals for other resources
Advanced Statistical Profiling in Java
Approach not used by existing profilers (VisualVM and desktop commercial
alternatives)
Can give very low overheads (<1%) for reasonable sampling rates
Low Overhead Ad-Hoc Production Profilers
● Async-profiler
● Honest profiler
● Perf + perf-mapper-agent
● Java Flight Recorder
● Solaris Studio
People are put off by practical as
much as technical issues
Barriers to Ad-Hoc Production Profiling
Generally requires access to
production
Process involves manual work - hard
to automate
Low-overhead open source profilers
without commercial support
What if we profiled all the time?
Historical Data
Allows for post-hoc incident analysis
Enables correlation with other data/metrics
Performance regression analysis
Putting Samples in Context
Application version
Environment parameters (machine type, CPU, location, etc.)
Ad-hoc profiling we can’t do this
Google-wide profiling
Article: Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers
Profiling data and binaries collected, processed and made available for
browser-based reporting
“The system has been actively profiling nearly all machines at Google for several
years”
https://ai.google/research/pubs/pub36575
How to do Production Profiling
Continuous Profiling
Aggregation
service
Web Reports
JVM Agents
Summary
It’s possible to profile in production with low overhead
To overcome practical issues we can profile production all the time
We gain new capabilities by profiling all the time
Why Performance Tools Matter
Development isn’t Production
Profiling vs Monitoring
Continuous Profiling
Conclusion
With Thanks To
Brendan Greg for Flamegraphs
Andre Pangin for Async-Profiler
Jeremy Manson for the original Lightweight Profiler
Johannes Rudolph for Java Perf Mapper Agent
Nitsan Wakart
The Solaris Studio and Mission Control Teams
Performance Matters
Development isn’t Production
Metrics can be unactionable
Instrumentation has high overhead
Continuous Profiling provides insight
We need an attitude shift on profiling
+ monitoring
ContinuousProactive
not Reactive
Systematic
not Ad Hoc
Please do Production Profiling.
All the time.
Any Questions?
https://www.opsian.com/
"It's like a magnifying glass for our servers"
The End

More Related Content

What's hot

Avoiding Performance Problems: When and How to Debug Production
Avoiding Performance Problems: When and How to Debug ProductionAvoiding Performance Problems: When and How to Debug Production
Avoiding Performance Problems: When and How to Debug Production
AppNeta
 
Automated testing
Automated testingAutomated testing
Automated testing
s0194975
 

What's hot (7)

Why Software Test Performance Matters
Why Software Test Performance MattersWhy Software Test Performance Matters
Why Software Test Performance Matters
 
Performance tuning Grails applications
 Performance tuning Grails applications Performance tuning Grails applications
Performance tuning Grails applications
 
Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠
 
Avoiding Performance Problems: When and How to Debug Production
Avoiding Performance Problems: When and How to Debug ProductionAvoiding Performance Problems: When and How to Debug Production
Avoiding Performance Problems: When and How to Debug Production
 
Automated testing
Automated testingAutomated testing
Automated testing
 
Continuous delivery - takeaways
Continuous delivery - takeawaysContinuous delivery - takeaways
Continuous delivery - takeaways
 
Airbag Development Process Using HyperWorks Automation - Autoliv
Airbag Development Process Using HyperWorks Automation - AutolivAirbag Development Process Using HyperWorks Automation - Autoliv
Airbag Development Process Using HyperWorks Automation - Autoliv
 

Similar to Production profiling what, why and how technical audience (3)

Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1
CIVEL Benoit
 
Optimization In Mobile Systems
Optimization In Mobile SystemsOptimization In Mobile Systems
Optimization In Mobile Systems
momobangalore
 
Software Testing: Application And Script Independent Automation Framework: Th...
Software Testing: Application And Script Independent Automation Framework: Th...Software Testing: Application And Script Independent Automation Framework: Th...
Software Testing: Application And Script Independent Automation Framework: Th...
guest0efb5e
 
Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]
Munirathnam Naidu
 
HP Quick Test Professional
HP Quick Test ProfessionalHP Quick Test Professional
HP Quick Test Professional
Vitaliy Ganzha
 

Similar to Production profiling what, why and how technical audience (3) (20)

Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12c
 
Continuous Performance Testing
Continuous Performance TestingContinuous Performance Testing
Continuous Performance Testing
 
Monitoring and Instrumentation Strategies: Tips and Best Practices - AppSphere16
Monitoring and Instrumentation Strategies: Tips and Best Practices - AppSphere16Monitoring and Instrumentation Strategies: Tips and Best Practices - AppSphere16
Monitoring and Instrumentation Strategies: Tips and Best Practices - AppSphere16
 
Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)
 
Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1
 
Fundamentals Performance Testing
Fundamentals Performance TestingFundamentals Performance Testing
Fundamentals Performance Testing
 
Benchmarking PyCon AU 2011 v0
Benchmarking PyCon AU 2011 v0Benchmarking PyCon AU 2011 v0
Benchmarking PyCon AU 2011 v0
 
Performance eng prakash.sahu
Performance eng prakash.sahuPerformance eng prakash.sahu
Performance eng prakash.sahu
 
Test Automation for QTP
Test Automation for QTPTest Automation for QTP
Test Automation for QTP
 
Test Automation
Test AutomationTest Automation
Test Automation
 
Plab system owners meeting v2
Plab   system owners meeting v2Plab   system owners meeting v2
Plab system owners meeting v2
 
Continuous testing - GUERLAIS ARGOT - Air France KLM Sogeti- Soirée du Test L...
Continuous testing - GUERLAIS ARGOT - Air France KLM Sogeti- Soirée du Test L...Continuous testing - GUERLAIS ARGOT - Air France KLM Sogeti- Soirée du Test L...
Continuous testing - GUERLAIS ARGOT - Air France KLM Sogeti- Soirée du Test L...
 
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
Abstractions for managed stream processing platform (Arya Ketan - Flipkart)
 
Optimization In Mobile Systems
Optimization In Mobile SystemsOptimization In Mobile Systems
Optimization In Mobile Systems
 
Dot Net Application Monitoring
Dot Net Application MonitoringDot Net Application Monitoring
Dot Net Application Monitoring
 
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things BetterTaking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
Taking AppSec to 11: AppSec Pipeline, DevOps and Making Things Better
 
Software Testing: Application And Script Independent Automation Framework: Th...
Software Testing: Application And Script Independent Automation Framework: Th...Software Testing: Application And Script Independent Automation Framework: Th...
Software Testing: Application And Script Independent Automation Framework: Th...
 
Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]
 
Performance Analysis of Idle Programs
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programs
 
HP Quick Test Professional
HP Quick Test ProfessionalHP Quick Test Professional
HP Quick Test Professional
 

More from RichardWarburton

More from RichardWarburton (20)

Java collections the force awakens
Java collections  the force awakensJava collections  the force awakens
Java collections the force awakens
 
Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)
 
Collections forceawakens
Collections forceawakensCollections forceawakens
Collections forceawakens
 
Generics past, present and future
Generics  past, present and futureGenerics  past, present and future
Generics past, present and future
 
Jvm profiling under the hood
Jvm profiling under the hoodJvm profiling under the hood
Jvm profiling under the hood
 
How to run a hackday
How to run a hackdayHow to run a hackday
How to run a hackday
 
Generics Past, Present and Future
Generics Past, Present and FutureGenerics Past, Present and Future
Generics Past, Present and Future
 
Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)
 
Performance and predictability (1)
Performance and predictability (1)Performance and predictability (1)
Performance and predictability (1)
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional Programming
 
Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Simplifying java with lambdas (short)
Simplifying java with lambdas (short)Simplifying java with lambdas (short)
Simplifying java with lambdas (short)
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
The Bleeding Edge
The Bleeding EdgeThe Bleeding Edge
The Bleeding Edge
 
Lambdas myths-and-mistakes
Lambdas myths-and-mistakesLambdas myths-and-mistakes
Lambdas myths-and-mistakes
 

Recently uploaded

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 

Recently uploaded (20)

Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 

Production profiling what, why and how technical audience (3)