Evolving toward DevOps through transaction-centric monitoring

I gave this presentation at a London FITE club meeting on March 5th 2013
http://eventful.com/events/fite-club-peter-holditch-appdynamics-evolving-toward-devop-/E0-001-054955463-8?utm_source=apis&utm_medium=apim&utm_campaign=apic

Published in: Technology

  • Image: http://marketingforhippies.com/fast-marketing-vs-slow-marketing/
  • Image: http://www.side-out.org/blog/dealing-with-fear/
  • Image: http://blogs.telegraph.co.uk/technology/alexisdormandy/100007050/the-information-flood-businesses-are-drowning-in-data/
  • Image: http://blogs.ubc.ca/etec540sept09/2009/11/19/this-is-just-the-beginning/
  • Objective of Slide: Educate the audience that operations today manage “Servers” instead of the “End User Experience”, i.e. business transactions.
    Script: When it comes to troubleshooting application performance and availability, the common approach today is to analyze the health of the infrastructure which underpins the application. Unfortunately, Operations lack the business context of what matters to your end-users or customers: the transaction they were executing when the problem occurred.
    A green or healthy infrastructure does not mean a green and healthy end user experience. The fact is end-users experience Business Transactions. They don’t experience servers or application infrastructure. When an end user complains, they normally say “I can’t Login” or “Checkout is very slow”; they don’t complain about Memory Leaks, Server CPU or Thread Pools.
  • Image: http://www.processexcellencenetwork.com/people-performance-and-change-in-process-improveme/columns/i-can-t-believe-you-want-me-to-use-that/
  • Objective of Slide: Discover whether the Prospect is going through these app architecture shifts and understands that their current approach/tools are insufficient in these environments.
    Two clear takeaways: Modern applications are harder to manage because they are a) more complex and b) change faster. Plant the seed that the only constant as apps have changed is the “business transaction”.
    Note: Copy/Paste the industry specific transaction content from the appendix to tailor this graphic for your audience.
    Script: In the early 2000s, application architectures were fairly simplistic, consisting of a monolithic 3-tier architecture, with a user request resulting in a call to an application server and then a query to some backend database such as Oracle.
    Over time, the application architectures and operating environments have grown in complexity. While these shifts have been good for application developer productivity and agility, they have made modern applications more difficult to manage. The shifts that have had the most impact on IT Operations & App Support teams are:
    [BUILD SOA] SOA: Service Oriented Architecture
    [BUILD CLOUD & BIG DATA] Cloud Capacity: Usage of Cloud Capacity from providers like Amazon EC2 and private clouds. Big Data: Surge in data volumes popularizing Big Data and NoSQL technologies such as Hadoop, Cassandra and MongoDB
    [BUILD WEB 2.0 & AGILE] Web 2.0: More business logic and server side processing shifting to the browser with the advancement of Web 2.0. Agile: And to complicate things even further, more frequent code release cycles with the adoption of agile development
    All 5 of these technologies have created the perfect storm for operations and development trying to manage the performance and availability of their application, due to the high rate of change these teams are facing. To add to this challenge, legacy monitoring approaches weren’t built to support these environments.
    Ask: Has your organization embraced some of these approaches? If so, which ones? How has that affected your ability to manage performance and determine root cause?
  • Objective of Slide: Discover whether these pains/situations resonate with how they operate today. Perhaps get them to laugh and expose some 3rd level pain (personal pain, related to tension between Dev & Ops).
    Show the prospect how their current approach increases the MTTR because their current set of tools and methodology has: no correlation, multiple tools, poor collaboration.
    All three of the above: increase the time to resolve application issues, cost the company more money by tying up resources to fix an issue, and impact the business.
    Script: For instance, when end-users complain about a slow business transaction such as “Checkout” as shown on the left, we often see Ops and Dev teams working in silos, using multiple tools to analyze the health of the application’s infrastructure as shown on the right. When we meet customers for the first time, we often hear about their frustration with the blame-game between groups, and the ineffectiveness of late-night war room conference calls.
    Customers tell us that even though they had a lot of data, they couldn’t resolve application issues in a timely manner because their old tools didn’t gather the right information and didn’t allow Dev & Ops to get on the same page and inspect the same metrics.
    Ask: Does this sound familiar? [BUILD takes you to next slide]
  • Objective of Slide: Show how Business Transactions give Dev & Ops the most relevant “Unit of Management” to assess application health. Show how we fix their problems in three easy steps, compared with their current mode of troubleshooting.
    Script: What customers like about AppDynamics is that we measure Application health by Business Transaction, and this gives them context that is relevant to the four constituents that matter: a) end-users, b) line-of-business people, c) R&D, d) Operations.
    What we typically hear by the end of a PoC is that Dev & Ops like how AppDynamics gives them a “unit of management” that they can agree on and a “single pane of glass” that eliminates conflict and finger-pointing.
    Now with AppDynamics in place, if a customer encounters the same “slowness with the Checkout transaction”, they can use AppDynamics to:
    [BUILD]
    Monitor all of your application’s business transactions and the details about each one, such as the end-user response times, total number of calls and health of their service levels.
    Troubleshoot the issue by isolating where it’s occurring in an intuitive format that looks like a “google traffic map”.
    Resolve bottlenecks faster by finding the root cause of the problem and supplying you with the exact line of code, as you can see here in this last screen.
  • Image: http://www.techbrarian.com/2009/05/27/assignment-39-7th-and-8th-grade-assignment-37-6th-grade-robots-part-2-the-turing-test/
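
The speaker notes above argue that a healthy-looking infrastructure can hide an unhealthy business transaction. As a minimal sketch of that point (not AppDynamics' implementation), the Python below groups the same request samples two ways: by server and by business transaction. The sample data, the server names `app-1`/`app-2`, the helper names and the 1000 ms threshold are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical request samples: (business transaction, server, response time in ms).
SAMPLES = [
    ("Login", "app-1", 120),
    ("Login", "app-2", 130),
    ("Checkout", "app-1", 2400),
    ("Checkout", "app-2", 2600),
    ("Search", "app-1", 90),
    ("Search", "app-2", 110),
]


def mean_by(samples, key_index):
    """Average response time grouped by the field at key_index (0 = transaction, 1 = server)."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key_index]].append(sample[2])
    return {key: mean(values) for key, values in groups.items()}


def slow_keys(averages, threshold_ms):
    """Names whose average response time exceeds the (assumed) threshold."""
    return sorted(key for key, avg in averages.items() if avg > threshold_ms)


by_transaction = mean_by(SAMPLES, 0)
by_server = mean_by(SAMPLES, 1)

print("Slow transactions:", slow_keys(by_transaction, 1000))
print("Slow servers:", slow_keys(by_server, 1000))
```

With this made-up data, both servers average below the threshold because fast Login and Search requests dilute the slow ones, while the per-transaction view singles out Checkout. That is the sense in which the business transaction is the more useful unit of management.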
  • Evolving toward DevOps through transaction-centric monitoring

    1. Evolving toward DevOps through transaction-centric performance management. Peter Holditch, pholditch@appdynamics.com
    2. Business imperatives. Development: • More functionality • More quickly • With fewer resources. Operations: • Performance • 24x7 availability • With fewer resources. Copyright © 2012 AppDynamics. All rights reserved.
    3. Is change avoidance the only strategy? • Change brings risk by making the system an unknown quantity (again)
    4. Know the system… • Instrument it… BUT… • Do so without extra burden on the developers… • …and without tight dependencies on any given software release • Use automatic baselining; do you KNOW what each of thousands of metrics should be? Do you expect them to vary?
    5. Knowledge is power… • Now we know what to expect, we can understand when things are abnormal and be proactive… • …and we can quickly see any detrimental effects introduced by changes
    6. Knowledge is power… • Now we know what to expect, we can understand when things are abnormal and be proactive… • …and we can quickly see any detrimental effects introduced by changes • Why confine this to production monitoring? • Instrumented performance testing should be part of the standard CI testing regime
    7. The classical mode of metric collection… (diagram labels: Webserver, Appserver, JMX Metrics, JVM, …)
    8. Application availability… • Is all about the responsiveness to the user… • …so we need to measure transaction response times for a useful picture (example transactions: Login, My Accounts, Make Payment, Transfer Funds)
    9. What is “the application” anyway? (diagram labels: transactions Login, My Accounts, Make Payment, Transfer Funds; SOA tiers Weblogic, Oracle, .NET, MQ, ATG, Vignette, Sharepoint, SQL Server, JBoss, Tomcat, Mule, Tibco, AG ESB; BIG DATA Hadoop, Cassandra, MongoDB, Coherence, Memcached; CLOUD Amazon EC2, Windows Azure, VMWare; WEB 2.0 Browser Logic, AJAX, Web Frameworks; AGILE release labels on each component, ranging from Release 1.1 to Release 5.0)
    10. So, now we know response times are slow… (End User reports a Slow Transaction; Ops and Dev respond:) “I don’t know either.” “I can’t reproduce it.” “The QA build isn’t exactly like Production.” “Can’t tell from the logs. Sorry.” “On it now… need Emily’s help though.” “I told you. It’s NOT the network!!” “Marcus, did you test this in QA?” “Call Dev ASAP!” “Steve, did you check the logs?” “It’s been unstable since last week’s release!!” “Not my code… The guy who wrote it isn’t here anymore.”
    11. We need the execution context… (same diagram as slide 9)
    12. We need the execution context… (same diagram as slide 9) • The modern approach to building apps as sets of discrete services makes this ever more important
    13. …to foster troubleshooting collaboration… (diagram labels: MONITOR, TROUBLESHOOT, RESOLVE; End User, Slow Transaction, Ops, Dev)
    14. e.g. 2 outages in 2 days… Production ground to a halt for 2 hours, and again the next day. Who owns JVM Configuration? Ops or Dev? (chart labels: Trx/min, Avg RT, Pool Limit, Pool Usage, Trx Stalls)
    15. e.g. excessive data access
    16. e.g. slow SQL queries
    17. e.g. nested loop code logic
    18. And once you have the metrics… PaaS?
    19. Why not try the product? • Use AD LITE for free • Try out AD Pro Edition. Questions? Why not have a beer? • Now, around the corner • Tomorrow after QCon. Westminster Arms, 9 Storey’s Gate, SW1P 3AT
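
Slide 4 recommends automatic baselining, and slide 6 suggests running instrumented performance tests as part of CI. The sketch below shows one simple way to think about that: learn a baseline from past response times of a transaction, then flag a new measurement as abnormal when it falls too far from it. The slides do not say how the product computes baselines; the mean plus or minus k standard deviations rule, the function names, the history values and the default `k=3.0` are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_baseline(history_ms):
    """Summarise past response times as (average, sample standard deviation)."""
    return mean(history_ms), stdev(history_ms)


def is_abnormal(value_ms, baseline, k=3.0):
    """True when value_ms lies more than k standard deviations from the baseline average."""
    avg, spread = baseline
    return abs(value_ms - avg) > k * spread


# Hypothetical response times (ms) for one transaction, e.g. "Make Payment",
# collected from earlier production traffic or earlier CI runs.
history = [100, 110, 90, 105, 95]
baseline = build_baseline(history)

# A measurement close to the norm passes; a large regression is flagged.
print(is_abnormal(104, baseline))
print(is_abnormal(180, baseline))
```

In a CI pipeline, a check like `is_abnormal` could fail the build when a release pushes a transaction's response time outside its learned range, which is one way to "quickly see any detrimental effects introduced by changes" before they reach production.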
