Works on my machine, your problem now? - QCon 2014

Can you get away with that answer after crashing your production website with a change you just deployed? Usually you can’t, and instead you’re tasked with figuring out and fixing the problem. In this session, we will talk about typical architectural, coding and deployment problems you might recognize, show what data you need to quickly identify them, and how to catch them before impacting the business.

Published in: Education

Transcript

  • 1. COMPANY CONFIDENTIAL – DO NOT DISTRIBUTE. Works on my machine – your problem now? Wolfgang Gottesheim, Compuware APM
  • 2. Business comes up with new features
  • 3. Testing?
  • 4. And this is what you end up with…
  • 5. System Unresponsive?
  • 6. What Operations tells Developers…
  • 7. …and what Devs would like to know
  • 8. …and what Devs would like to know: Top Contributor is related to String handling. 99% of that time comes from RegEx Pattern Matching. Page Rendering is the main component.
  • 9. Attitudes like this don’t help either. Image taken from https://www.scriptrock.com/blog/devops-whats-hype-about/
  • 10. Very "expensive" to work on these issues: 80% Dev Time in Bug Fixing, $60B Defect Costs. YES, we know this, BUT ~80% of problems are caused by ~20% of patterns.
  • 11. (no text)
  • 12. #1: Exhausted Resource Pools
  • 13. #2: Maxing out Worker Threads. The timeline shows how these active worker threads are distributed across all JVMs. At ~10:10 AM almost all JVMs max out their available worker threads. Detailed information for every single JVM.
  • 14. Root Cause: Class Loading as Performance Hotspot. Most of the time is spent in CLASSLOADING during Peak Load. But the same is true for "normal" load. Classloading seems to be a general problem that is not load related.
  • 15. Root Cause: Trying to Load a Missing Class. Class Loading impacts ALL transactions (fast or slow). Class Loader tries to load a class ending in TransferValidatorBPBeanInfo. It’s a class that doesn’t exist.
  • 16. #3: Deployment Mistakes
  • 17. Root Cause: Missing File
  • 18. #4: Different settings in Test & Prod
  • 19. #5: Real-world Data != Test Data
  • 20. #6: N+1 Query Problem
  • 21. #7: Misconfigured Caching Framework. 798772 DB Calls in 30 minutes, with NO TRAFFIC.
  • 22. #8: Memory Leaks. Still crashes. Problem fixed! Fixed Version Deployed.
  • 23. #9: Bloated Web Sites. 17! JS Files – 1.7MB in Size. Useless Information! Even might be a security risk!
  • 24. Recent example: Healthcare.gov. 55 JS Files, 16 jQuery related! Merging files can reduce roundtrips by 95%.
  • 25. #10: Browser caches. 62! Resources not cached. 49! Resources with short expiration.
  • 26. Problems that could have been avoided. BUT WHY are they still making it to Production? HOW can we catch them earlier?
  • 27. Root Cause: Disconnected Teams
  • 28. Solution: DevOps + Performance Focus
  • 29. Culture: Become ONE Team
  • 30. Culture: Testability
  • 31. Automate & Measure …Performance
  • 32. Automate & Measure …Scalability
  • 33. Automate Deployment
  • 34. Share Tools
  • 35. How? Performance Focus in Test Automation. Test Framework Results (Status) shown next to Architectural Data (# SQL, # Excep, CPU):

        Build #    Test Case      Status   # SQL   # Excep   CPU
        Build 17   testPurchase   OK       12      0         120ms
        Build 17   testSearch     OK       3       1         68ms
        Build 18   testPurchase   FAILED   12      5         60ms
        Build 18   testSearch     OK       3       1         68ms
        Build 19   testPurchase   OK       75      0         230ms
        Build 19   testSearch     OK       3       1         68ms
        Build 20   testPurchase   OK       12      0         120ms
        Build 20   testSearch     OK       3       1         68ms

    Slide annotations: We identified a regression. Problem solved. Let's look behind the scenes. Exceptions probably reason for failed tests. Problem fixed but now we have an architectural regression. Now we have the functional and architectural confidence.
  • 36. How? Performance Focus in Test Automation: Embed your Architectural Results in Jenkins
  • 37. Look beyond test pass/fail! Components: Version Control System, dynaTrace Server, Developer, CI Server. Steps: Commit. Trigger build. Build and run tests. Publish performance metrics. Drilldown for further analysis. Inform about build status.
  • 38. How? Performance Focus in Test Automation: Analyzing All Unit / Performance Tests. Analyze Perf Metrics. Identify Regressions.
  • 39. How? Performance Focus in Test Automation: Cross Impact of KPIs
  • 40. Share Results
  • 41. © 2011 Compuware Corporation. All Rights Reserved. Simply Smarter
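
The idea on slide 35, looking beyond test pass/fail at per-build architectural metrics, can be sketched in a few lines of Python. This is only an illustration and not the dynaTrace or Jenkins integration shown in the talk: the names BuildResult and find_regressions, the 20% tolerance, and the choice of Build 17 as the baseline are all assumptions. The numbers are the testPurchase figures as read from the slide's table (Build 17: 12 SQL / 0 exceptions / 120ms, Build 18: 12 / 5 / 60ms, Build 19: 75 / 0 / 230ms, Build 20: 12 / 0 / 120ms).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    """One test case in one build (hypothetical record, not a dynaTrace type)."""
    build: int
    test: str
    passed: bool
    sql_calls: int
    exceptions: int
    cpu_ms: int


def find_regressions(baseline, current, tolerance=0.2):
    """Compare a build against a known-good baseline.

    Reports a functional finding if the test failed, and an architectural
    finding for every metric that grew by more than `tolerance` (assumed 20%).
    """
    findings = []
    if not current.passed:
        findings.append("functional: test failed")
    for metric in ("sql_calls", "exceptions", "cpu_ms"):
        before = getattr(baseline, metric)
        after = getattr(current, metric)
        if after > before * (1 + tolerance):
            findings.append(f"architectural: {metric} rose from {before} to {after}")
    return findings


# testPurchase figures as read from slide 35 (assumed row-to-build mapping).
BASELINE = BuildResult(17, "testPurchase", True, 12, 0, 120)
HISTORY = [
    BuildResult(18, "testPurchase", False, 12, 5, 60),
    BuildResult(19, "testPurchase", True, 75, 0, 230),
    BuildResult(20, "testPurchase", True, 12, 0, 120),
]

if __name__ == "__main__":
    for result in HISTORY:
        findings = find_regressions(BASELINE, result)
        print(f"Build {result.build}:", findings or "no regression")
```

With these inputs, Build 19 passes its functional test but is still flagged for the jump in SQL calls and CPU time, which is the "architectural regression" case the slide describes. A real setup would take the baseline from the last accepted build rather than hard-coding it.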