Secret Diary of a Performance Tuning Superhero

The year is 2012, and many organizations have lost their grip on application performance and scalability as they innovate using agile release cycles and open source technologies. Architectures are at cloud bursting point, with logic and data stores becoming more distributed, virtual and elastic than ever before. This session looks back at the major performance and scalability bottlenecks of the past year, described using real-life customer case studies. It covers common architecture pain points, the troubleshooting process and the root causes customers ultimately found in production environments. It also discusses the key challenges and best practices for Dev and Ops teams managing application performance and scalability in live environments.

Usage Rights

© All Rights Reserved

Presentation Transcript

  • Secret Diary of a Performance Tuning Superhero. Stephen Burton, AppDynamics, @BurtonSays. (Background terms: Object.wait(), SOAP Exception, OutOfMemoryException(), Connection Pool Leak, Thread Synchronization, Debug Logging, Thread BLOCKED, Slow executeQuery())
  • ABOUT ME
    • Developer
    • Product Manager
    • Tech Evangelist
    • Part-time Superhero
  • AGENDA
    • Change
    • Slippery SOAP
    • Death by Java Beans
    • Drowning in a Pool
    • Deadly Lock
    • DevOps Tips
  • CHANGE: “To improve is to change; to be perfect is to change often.” Winston Churchill, 1874 - 1965
  • HOW OFTEN ARE YOUR RELEASE CYCLES? AppDynamics 2011 Survey: 250+. Chart segments: Monthly, 2+ Months, Weekly, Daily (values as extracted, pairing with segments not recoverable: 17%, 58%, 21%, 3%). 1/3 experienced a Severity 1 incident each month. Source: http://www.appdynamics.com/blog/2011/12/14/storm-clouds-in-2012-summary-of-appdynamics-apm-customer-survey/
  • ARCHITECTURES CHANGE: Monolithic (Tomcat, JDBC, MySQL)
  • ARCHITECTURES CHANGE: Distributed (diagram components: JBoss, ASP.NET, Tomcat, SQL Server, MySQL, Tibco BW, LDAP, Active Directory, Cassandra, 3rd Party Web Services, 3rd Party Java App; connected over SOAP, HTTP, JDBC, ADO.NET, JMS and THRIFT)
  • A “simple” login transaction
  • SLIPPERY SOAP
    • Media Customer
    • Few Million Users
    • History of Performance Issues in Production
    • Angry calls from Users
    • Tomcat, MySQL & Web Services
  • ARCHITECTURE
  • SOME WEB SERVICES PERFORM, OTHERS DON’T
  • EXAMPLE: PAYPAL. IN 6 WEEKS PERFORMANCE IMPROVED BY 15%
  • Monitor external services (avg, max). Only tune if you have to!
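
The advice to monitor external services by average and maximum response time could be sketched roughly as follows. This is a minimal illustration, not the tooling used in the talk: `ExternalCallTimer` and its methods are hypothetical names, and the sleeping lambda merely stands in for a real SOAP or HTTP client call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAccumulator;

/** Hypothetical helper: tracks call count, average and max latency for one external service. */
class ExternalCallTimer {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final LongAccumulator maxNanos = new LongAccumulator(Math::max, 0L);

    /** Runs the call and records its duration, even when the call throws. */
    <T> T time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    /** Records one observed duration in nanoseconds. */
    void record(long nanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(nanos);
        maxNanos.accumulate(nanos);
    }

    long count() { return count.get(); }

    double avgMillis() {
        long n = count.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1e6 / n;
    }

    double maxMillis() { return maxNanos.get() / 1e6; }

    public static void main(String[] args) throws Exception {
        ExternalCallTimer timer = new ExternalCallTimer();
        // Stand-in for a real web service call (assumption for the demo).
        String reply = timer.time(() -> { Thread.sleep(20); return "OK"; });
        System.out.printf("reply=%s calls=%d avg=%.1f ms max=%.1f ms%n",
                reply, timer.count(), timer.avgMillis(), timer.maxMillis());
    }
}
```

Keeping one timer per remote endpoint makes it easier to see which services perform and which do not before deciding whether any tuning is needed.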
  • DEATH BY JAVA BEANS
    • Huge E-Commerce Customer
    • 24 Million Business Transactions a Day
    • Outage around 6pm
    • Lots of Alerts & Angry calls from Users
    • Tomcat, JBoss, MQ, MySQL, Oracle, Web Services
  • ARCHITECTURE
  • System Metrics look Good
  • JVM Metrics not so Good
  • JVM had huge GC time
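
As a rough sketch of how GC time can be read from inside a JVM, the standard `java.lang.management` MXBeans expose accumulated collection counts and times. The class name `GcTimeReport` and the overhead calculation are illustrative assumptions; the reported time is the collectors' approximate accumulated elapsed time, not necessarily pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: summarizes GC activity of the current JVM. */
class GcTimeReport {
    /** Sums accumulated GC time (ms) across collectors; undefined values (-1) are skipped. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate share of JVM uptime spent in GC, as a percentage. */
    static double gcOverheadPercent() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("GC overhead: %.2f%% of uptime%n", gcOverheadPercent());
    }
}
```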
  • What caused the GC? Search Transaction uses most CPU, is invoked often and has the most errors & stalls
  • Search Concurrency & Response Time Correlates with GC & CPU Spike
  • Let’s look at a Search
  • EJB Call for every Search Result
  • Every EJB Call makes 500+ SQL Calls. 12,000+ Queries & associated objects exhausted heap
  • GC can kill your App!
    • What consumes CPU? Transactions, Objects, Data
    • Minimize DB Hits
    • Tweak your Heap Size
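
One common way to minimize DB hits is to fetch the data for all search results in a single batched call instead of one call per result. The sketch below only simulates this: `PriceStore`, `CountingStore` and the SKU data are hypothetical, and an in-memory map with a round-trip counter stands in for a real database.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchedLookupSketch {
    /** Hypothetical data-access interface; each method call stands for one database round trip. */
    interface PriceStore {
        int priceOf(String sku);
        Map<String, Integer> pricesOf(Collection<String> skus);
    }

    /** In-memory stand-in that counts round trips instead of talking to a real database. */
    static class CountingStore implements PriceStore {
        private final Map<String, Integer> rows;
        int roundTrips = 0;

        CountingStore(Map<String, Integer> rows) { this.rows = rows; }

        public int priceOf(String sku) {
            roundTrips++;
            return rows.get(sku);
        }

        public Map<String, Integer> pricesOf(Collection<String> skus) {
            roundTrips++;
            Map<String, Integer> out = new HashMap<>();
            for (String sku : skus) {
                out.put(sku, rows.get(sku));
            }
            return out;
        }
    }

    /** One round trip per item: the pattern that multiplies queries per search result. */
    static int totalOneByOne(PriceStore store, List<String> skus) {
        int total = 0;
        for (String sku : skus) {
            total += store.priceOf(sku);
        }
        return total;
    }

    /** A single round trip for all (distinct) items. */
    static int totalBatched(PriceStore store, List<String> skus) {
        int total = 0;
        for (int price : store.pricesOf(skus).values()) {
            total += price;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> rows = new HashMap<>();
        rows.put("a", 1);
        rows.put("b", 2);
        rows.put("c", 3);
        List<String> skus = Arrays.asList("a", "b", "c");

        CountingStore slow = new CountingStore(rows);
        CountingStore fast = new CountingStore(rows);
        System.out.println("one by one: total=" + totalOneByOne(slow, skus) + " roundTrips=" + slow.roundTrips);
        System.out.println("batched:    total=" + totalBatched(fast, skus) + " roundTrips=" + fast.roundTrips);
    }
}
```

Fewer round trips also means fewer short-lived result objects, which is the link back to the heap exhaustion described in the case study.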
  • DROWNING IN A POOL
    • EMEA Television Broadcaster
    • Reported slowdowns between 7pm and 10pm
    • Thousands of requests timing out
    • Users sending complaints via portal
    • Tomcat, Oracle, Web Services
  • ARCHITECTURE
  • NEW SERVICE LAUNCHED: traffic on the service was doubling every week
  • CONNECTION POOL: > 2,000 trx/min = DB connections gone.
  • Developer: “I did tell Operations this about a month ago, but they didn’t listen”. Operations: “Wow, that’s interesting, let me speak to Dev about it”
  • Monitor your traffic growth
    • Connection Pools are sacred
    • Tweak over time
    • Test on one node & verify
    • Tune SQL > 500ms
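
To illustrate why a connection pool is a hard limit, here is a minimal bounded pool sketch in which a caller times out once every connection is in use. `BoundedPool`, the pool size of two and the timeouts are assumptions for the demo; strings stand in for real database connections, and production pools do considerably more (validation, leak detection, metrics).

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical fixed-size pool: callers wait up to a timeout for a free resource. */
class BoundedPool<T> {
    private final BlockingQueue<T> idle;
    private final int capacity;

    BoundedPool(List<T> resources) {
        this.capacity = resources.size();
        this.idle = new ArrayBlockingQueue<>(capacity, false, resources);
    }

    /** Takes a resource, or fails with TimeoutException when the pool stays exhausted. */
    T borrow(long timeoutMillis) throws InterruptedException, TimeoutException {
        T resource = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new TimeoutException("pool exhausted: all " + capacity + " resources in use");
        }
        return resource;
    }

    /** Returns a previously borrowed resource to the pool. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    /** Number of resources currently borrowed; worth tracking as traffic grows. */
    int inUse() {
        return capacity - idle.size();
    }

    public static void main(String[] args) throws Exception {
        BoundedPool<String> pool = new BoundedPool<>(Arrays.asList("conn-1", "conn-2"));
        String c1 = pool.borrow(100);
        String c2 = pool.borrow(100);
        try {
            pool.borrow(100); // third caller: nothing left, so this times out
        } catch (TimeoutException e) {
            System.out.println(e.getMessage());
        } finally {
            pool.giveBack(c1);
            pool.giveBack(c2);
        }
        System.out.println("in use after release: " + pool.inUse());
    }
}
```

Watching `inUse()` against capacity over time is one simple way to notice growing traffic before requests start timing out.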
  • TUNING A CAR
    • Amateurs
      • Loud Exhaust, Induction Kits
      • Stupid Rear Wings, Bling Wheels
      • Go-Faster Stripes, Neon Lights
    • Pros
      • Service the Car (oil, filters, sparks)
      • Supercharger, Turbos, Cooling
      • Dyno Testing, ECU Remaps, Brakes, Suspension
  • Focus on what really makes a difference
    • 5% of things are responsible for 95% of performance
    • Baseline, measure, compare
    • Evangelize your Success!
  • DEADLY LOCK
    • Huge US Retailer
    • $1 billion revenue
    • Random Outages & Slowdowns in Webstore
    • Revenue would be up/down
    • Tomcat, ATG, MQ, Oracle, Web Services
  • ARCHITECTURE
  • DEADLOCK DETECTED: SYS ADMINS WOULD JUST RESTART SERVERS
  • THREAD DUMPS
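
Besides reading thread dumps by hand, a JVM can be asked directly whether any threads are deadlocked through the standard `ThreadMXBean`. The sketch below is an assumption-laden demo rather than the approach used in the case study: `DeadlockCheck` and `lockInOrder` are hypothetical names, and the two daemon threads deliberately lock two objects in opposite order to provoke a deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockCheck {
    /** Returns info for deadlocked threads, or an empty array when none are found. */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        if (ids == null) {
            return new ThreadInfo[0];
        }
        return threads.getThreadInfo(ids, true, true);
    }

    /** Demo helper: a daemon thread that locks 'first', waits for its peer, then wants 'second'. */
    static Thread lockInOrder(String name, Object first, Object second, CountDownLatch bothHoldOne) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                bothHoldOne.countDown();
                try {
                    bothHoldOne.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached once both threads hold their first lock
                }
            }
        }, name);
        thread.setDaemon(true); // lets the JVM exit despite the deadlock
        return thread;
    }

    public static void main(String[] args) throws InterruptedException {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldOne = new CountDownLatch(2);
        lockInOrder("worker-1", a, b, bothHoldOne).start();
        lockInOrder("worker-2", b, a, bothHoldOne).start();
        bothHoldOne.await();
        Thread.sleep(200); // give both threads time to block on their second lock
        for (ThreadInfo info : findDeadlocks()) {
            System.out.print(info);
        }
    }
}
```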
  • CACHE WASN’T THREAD SAFE
  • BUSINESS IMPACT? 46,463 checkouts in the day, avg. $75 per checkout, 2,492 were impacted. Cost of deadlock: 2,492 × $75 = $186,900
  • Restarting Servers isn’t enough!
    • Thread Dumps can be a pain in the ass. Exploit Tools.
    • Make sure caches are thread-safe! (e.g. ConcurrentHashMap)
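
A thread-safe cache built on `ConcurrentHashMap` might look like the sketch below. It is not the retailer's actual fix: `SafeCache` and its loader are hypothetical, and the sketch relies on `ConcurrentHashMap.computeIfAbsent` being atomic per key, so concurrent callers trigger at most one load. The loader itself must not modify the cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through cache that is safe to share between threads. */
class SafeCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SafeCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it at most once per key even under contention. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in loader (assumption): a real one would hit a database or remote service.
        SafeCache<String, Integer> cache = new SafeCache<>(key -> {
            loads.incrementAndGet();
            return key.length();
        });

        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> cache.get("checkout"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("value=" + cache.get("checkout") + " loads=" + loads.get());
    }
}
```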
  • WHAT ABOUT NEW TECHNOLOGIES? Like Apache Cassandra?
  • IS NOSQL SLOW? If so, what and how slow?
  • SEE THE QUERY RESPONSE TIME
  • CASSANDRA IS A JVM TOO
  • JMX KPI METRICS
  • Make Sure You Monitor all Application Components. All software can have bugs & bottlenecks!
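
Because any JVM-based component exposes metrics over JMX, a basic probe can be sketched with the standard platform MBean server. This example reads only the local JVM's standard `java.lang:type=Memory` MBean; `JmxHeapProbe` is a hypothetical name, and monitoring a remote process such as Cassandra would additionally need a JMX connector and product-specific MBean names, which the slides do not list.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

/** Hypothetical probe: reads heap usage of the current JVM through JMX. */
class JmxHeapProbe {
    /** Returns used heap bytes from the standard java.lang:type=Memory MBean. */
    static long usedHeapBytes() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        CompositeData usage = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("used heap: %.1f MB%n", usedHeapBytes() / (1024.0 * 1024.0));
    }
}
```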
  • SO, DO YOU KNOW HOW FAST AND RELIABLE YOUR CODE IS IN PRODUCTION? <1% of Developers actually do
  • THE PROBLEM
    • DEV (Concurrency, Data Volume, Resource): Very Fast
    • QA/TEST (Concurrency, Data Volume, Resource): Fast
    • PRODUCTION (Concurrency, Data Volume, Resource): ?
  • TIPS FOR DEV & OPS
    • Monitor and Test your Applications in Production
    • Leverage Application Performance Management (APM)
    • Feedback Loop from Ops to Dev
    • Share Goals, Tools & Metrics
    • Learn from Failure
    • Lessons Learned
      • Don’t write slow SQL Queries
      • Remember to use caching
      • Don’t rely on ORM
      • Assume everything will fail
      • Logging everything is not a good idea
      • Remember to do performance testing
  • WANT TO LEARN MORE?
    • Visit AppDynamics Booth: “Kiss my App” T-shirt
    • Download Free Monitoring Solution: www.appdynamics.com/free
    • Follow: @AppDynamics