Works on my machine, your problem now? - QCon 2014

Can you get away with that answer after crashing your production website with a change you just deployed? Usually you can’t, and instead you’re tasked with figuring out and fixing the problem. In this session, we will talk about typical architectural, coding and deployment problems you might recognize, show what data you need to quickly identify them, and how to catch them before impacting the business.

Published in: Education

Transcript of "Works on my machine, your problem now? - QCon 2014"

  1. COMPANY CONFIDENTIAL – DO NOT DISTRIBUTE. Works on my machine – your problem now? Wolfgang Gottesheim, Compuware APM
  2. Business comes up with new features
  3. Testing?
  4. And this is what you end up with…
  5. System Unresponsive?
  6. What Operations tells Developers…
  7. …and what Devs would like to know
  8. …and what Devs would like to know. Top Contributor is related to String handling. 99% of that time comes from RegEx Pattern Matching. Page Rendering is the main component.
  9. Attitudes like this don’t help either. Image taken from https://www.scriptrock.com/blog/devops-whats-hype-about/
  10. Very “expensive” to work on these issues. ~80% of problems caused by ~20% patterns. YES we know this. 80% Dev Time in Bug Fixing. $60B Defect Costs. BUT
  11.
  12. #1: Exhausted Resource Pools
  13. #2: Maxing out Worker Threads. The timeline shows how these active worker threads are distributed across all JVMs. At ~10:10 AM almost all JVMs max out their available worker threads. Detailed information for every single JVM.
  14. Root Cause: Class Loading as Performance Hotspot. Most of the time is spent in CLASSLOADING during Peak Load. But the same is true for “normal” load. Classloading seems to be a general problem that is not load related.
  15. Root Cause: Trying to Load a Missing Class. Class Loading impacts ALL transactions (fast or slow). Class Loader tries to load a class ending in TransferValidatorBPBeanInfo. It’s a class that doesn’t exist.
  16. #3: Deployment Mistakes
  17. Root Cause: Missing File
  18. #4: Different settings in Test & Prod
  19. #5: Real-world Data != Test Data
  20. #6: N+1 Query Problem
  21. #7: Misconfigured Caching Framework. 798772 DB Calls in 30 minutes with NO TRAFFIC.
  22. #8: Memory Leaks. Still crashes. Problem fixed! Fixed Version Deployed.
  23. #9: Bloated Web Sites. 17! JS Files – 1.7MB in Size. Useless Information! Even might be a security risk!
  24. Recent example: Healthcare.gov. 55 JS Files, 16 jQuery related! Merging files can reduce roundtrips by 95%.
  25. #10: Browser caches. 62! Resources not cached. 49! Resources with short expiration.
  26. Problems that could have been avoided. BUT WHY are they still making it to Production? HOW can we catch them earlier?
  27. Root Cause: Disconnected Teams
  28. Solution: DevOps + Performance Focus
  29. Culture: Become ONE Team
  30. Culture: Testability
  31. Automate & Measure … Performance
  32. Automate & Measure … Scalability
  33. Automate Deployment
  34. Share Tools
  35. How? Performance Focus in Test Automation

      Test Framework Results               Architectural Data
      Build #   Test Case      Status      # SQL   # Excep   CPU
      17        testPurchase   OK          12      0         120ms
      17        testSearch     OK          3       1         68ms
      18        testPurchase   FAILED      12      5         60ms
      18        testSearch     OK          3       1         68ms
      19        testPurchase   OK          75      0         230ms
      19        testSearch     OK          3       1         68ms
      20        testPurchase   OK          12      0         120ms
      20        testSearch     OK          3       1         68ms

      Callouts: We identified a regression. Problem solved. Let's look behind the scenes. Exceptions probably reason for failed tests. Problem fixed but now we have an architectural regression. Now we have the functional and architectural confidence.
  36. How? Performance Focus in Test Automation. Embed your Architectural Results in Jenkins.
  37. Diagram components: Version Control System, dynaTrace Server, Developer, CI Server. Steps: Commit; Trigger build; Build and run tests; Publish performance metrics; Drilldown for further analysis; Inform about build status. Look beyond test pass/fail!
  38. How? Performance Focus in Test Automation. Analyzing All Unit / Performance Tests. Analyze Perf Metrics. Identify Regressions.
  39. How? Performance Focus in Test Automation. Cross Impact of KPIs.
  40. Share Results
  41. © 2011 Compuware Corporation – All Rights Reserved. Simply Smarter
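
The idea on slide 35, looking beyond test pass/fail by comparing per-test architectural metrics across builds, can be sketched roughly as follows. This is a minimal illustration, not the tooling shown in the deck: the `RunMetrics` type, the `find_regressions` and `review_builds` helpers, the 20% tolerance, and the choice of build 17 as the baseline are all assumptions made for the example. Only the testPurchase numbers come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    """Metrics for one test case in one build (hypothetical container)."""
    status: str      # functional result reported by the test framework
    sql_count: int   # number of SQL statements executed during the test
    exceptions: int  # number of exceptions thrown during the test
    cpu_ms: int      # CPU time in milliseconds


# testPurchase values as listed on slide 35, keyed by build number.
TEST_PURCHASE = {
    17: RunMetrics("OK", 12, 0, 120),
    18: RunMetrics("FAILED", 12, 5, 60),
    19: RunMetrics("OK", 75, 0, 230),
    20: RunMetrics("OK", 12, 0, 120),
}


def find_regressions(baseline, current, tolerance=0.2):
    """Return the names of metrics that are worse than the baseline.

    A numeric metric counts as regressed when it exceeds the baseline by
    more than `tolerance` (an assumed threshold, here 20%). A non-OK test
    status is always reported.
    """
    findings = []
    if current.status != "OK":
        findings.append("status")
    for name in ("sql_count", "exceptions", "cpu_ms"):
        before = getattr(baseline, name)
        after = getattr(current, name)
        if after > before * (1 + tolerance):
            findings.append(name)
    return findings


def review_builds(history, baseline_build, tolerance=0.2):
    """Compare every later entry in `history` against one baseline build."""
    baseline = history[baseline_build]
    return {
        build: find_regressions(baseline, metrics, tolerance)
        for build, metrics in sorted(history.items())
        if build != baseline_build
    }


if __name__ == "__main__":
    for build, findings in review_builds(TEST_PURCHASE, baseline_build=17).items():
        print(build, findings or "no regression")
```

With these inputs, build 18 is flagged for its failed status and exception count, build 19 passes functionally but is flagged for SQL count and CPU time, and build 20 comes back clean. A CI job could fail the build whenever the findings list is non-empty, which is one way to surface the architectural regression that a plain pass/fail report hides.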