
Boston DevOps Days 2016: Implementing Metrics Driven DevOps - Why and How



How can we detect a bad deployment before it hits production? By automatically looking at the right architectural metrics in your CI/CD and stopping a build before it's too late. Let's hook up your test automation with app metrics and use them as quality gates to stop bad builds early!
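The idea in that summary can be sketched as a small quality-gate script in a CI job. This is a minimal illustration, not the talk's actual tooling: the metric names and thresholds are made up for the example, and the measured values stand in for what your APM tool would report while test automation drives the app.

```python
# Hypothetical architectural thresholds for one use case; in a real
# pipeline these would come from a config file or a known-good baseline build.
THRESHOLDS = {
    "service_calls": 5,       # max remote service calls per transaction
    "sql_statements": 20,     # max database round trips
    "payload_bytes": 50_000,  # max bytes transferred
}

def gate(measured):
    """Return a list of violations; an empty list means the build may pass."""
    return [
        f"{name}: {measured.get(name, 0)} exceeds limit {limit}"
        for name, limit in THRESHOLDS.items()
        if measured.get(name, 0) > limit
    ]

# Metrics as they might be captured while test automation drives the app:
violations = gate({"service_calls": 33, "sql_statements": 171, "payload_bytes": 99_000})
for v in violations:
    print("QUALITY GATE VIOLATION:", v)
# In CI you would now fail the job, e.g. raise SystemExit(1) if violations.
```

Wired into the pipeline as a nonzero exit code, a check like this stops a bad build long before it reaches production.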

Published in: Software


  1. Implementing Metrics-Driven DevOps: Why and How! Andreas Grabner: @grabnerandi, andreas.grabner@dynatrace.com Slides: http://www.slideshare.net/grabnerandi Podcast: https://www.spreaker.com/show/pureperformance
  2. @grabnerandi
  3. @grabnerandi
  4. @grabnerandi AND MANY MORE
  5. @grabnerandi https://dynatrace.github.io/ufo/ "In Your Face" Data!
  6. @grabnerandi Availability dropped to 0% #1: Availability -> Brand Impact
  7. @grabnerandi New Deployment + Mkt Push: overall increase of users, but a decline in Conversion Rate, spikes in FRUSTRATED users, and an increase in # of unhappy users! #2: User Experience -> Conversion
  8. @grabnerandi #3: Resource Consumption -> Cost per Feature
  9. @grabnerandi App with Regular Load supported by 10 Containers. Twice the Load but 48 (=4.8x!) Containers! App doesn't scale!! #4: Scalability -> Cost per User
  10. @grabnerandi #5: Performance -> Behavior
  11. @grabnerandi
  12. @grabnerandi DevOps @ Target presented at Velocity, DOES and more … http://apmblog.dynatrace.com/2016/07/07/measure-frequent-successful-software-releases/ "We increased from monthly to 80 deployments per week … only 10 incidents per month … … over 96% successful! …"
  13. "We Deliver High Quality Software, Faster and Automated using New Stack" "Shift-Left Performance to Reduce Lead Time" Adam Auerbach, Sr. Dir DevOps https://github.com/capitalone/Hygieia & https://www.spreaker.com/user/pureperformance "… deploy some of our most critical production workloads on the AWS platform …", Rob Alexander, CIO
  14. 2011: 2 major releases/year, customers deploy & operate on-prem. 2016: 26 major releases/year, 170 prod deployments/day, self-service online sales, SaaS & Managed.
  15. @grabnerandi Not only fast delivered but also delivering fast! Response Time vs. Conversions: -1000ms -> +2%, -1000ms -> +10%, +100ms -> -1%
  16. Why most (will) fail!
  17. @grabnerandi
  18. @grabnerandi It's not about blind automation of pushing more bad code on new stacks through a pipeline
  19. @grabnerandi It's not about blindly adding new features on top of existing ones without measuring their success
  20. @grabnerandi It's about learning from others
  21. @grabnerandi http://bit.ly/sharepurepath
  22. @grabnerandi Scaling an Online Sports Club Search Service (Users & Response Time over time, 20xx-2016+): 1) 2-Man Project 2) Limited Success 3) Start Expansion 4) Performance Slows Growth 5) Potential Decline?
  23. @grabnerandi Early 2015: Monolith Under Pressure. Can't scale vertically endlessly! April: 0.52s; May: 2.68s, 94.09% CPU Bound
  24. @grabnerandi From Monolith to Services in a Hybrid-Cloud: Front End to Cloud, Scale Backend in Containers!
  25. @grabnerandi Go live – 7:00 a.m.
  26. @grabnerandi Go live – 12:00 p.m.
  27. What Went Wrong?
  28. @grabnerandi Single search query end-to-end: 26.7s Load Time, 5kB Payload, 33! Service Calls (99kB - 3kB for each call!), 171! Total SQL Count. Architecture Violation: direct access to DB from frontend service
  29. @grabnerandi The fixed end-to-end use case: "Re-architect" vs. "Migrate" to Service-Orientation. 2.5s (vs 26.7s), 1! (vs 33!) Service Call, 5kB (vs 99kB) Payload, 3! (vs 171) Total SQL Count
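One way to keep such an architecture violation from coming back is to encode the fixed numbers as a budget that a functional test asserts after each run. A sketch under assumptions: the `within_budget` helper and the metric names are hypothetical, and the counts are taken from these two slides (33 calls / 171 SQL before, 1 call / 3 SQL after).

```python
# Architectural budget for the search use case, taken from the fixed
# target on this slide (1 service call, 3 SQL statements, 5kB payload).
SEARCH_BUDGET = {"service_calls": 1, "sql_statements": 3, "payload_kb": 5}

def within_budget(measured, budget):
    """True only if every measured metric stays at or under its budget."""
    return all(measured.get(name, 0) <= limit for name, limit in budget.items())

# The monolith's numbers from the previous slide blow the budget ...
monolith = {"service_calls": 33, "sql_statements": 171, "payload_kb": 99}
# ... while the re-architected service stays inside it.
rearchitected = {"service_calls": 1, "sql_statements": 3, "payload_kb": 5}

print(within_budget(monolith, SEARCH_BUDGET))       # False
print(within_budget(rearchitected, SEARCH_BUDGET))  # True
```

Run as part of the test suite, a failed budget assertion turns an architectural regression into a red build instead of a production incident.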
  30. @grabnerandi
  31. @grabnerandi You measure it! from Dev (to) Ops
  32. @grabnerandi Metrics from and for Dev(to)Ops. Scenario: Monolithic App with 2 Key Features. Use Case Tests and Monitors, Service & App Metrics, and Ops data per build:
      Build # | Use Case      | Status | # API Calls | # SQL | Payload | CPU   | Ops #ServInst | Usage | RT
      17      | testNewsAlert | OK     | 1           | 5     | 2kb     | 70ms  | -             | -     | -
      17      | testSearch    | OK     | 1           | 35    | 5kb     | 120ms | -             | -     | -
      25      | testNewsAlert | OK     | 1           | 4     | 1kb     | 60ms  | 1             | 0.5%  | 7.2s
      25      | testSearch    | OK     | 34          | 171   | 104kb   | 550ms | 1             | 63%   | 5.2s
      26      | testNewsAlert | OK     | 1           | 4     | 1kb     | 60ms  | 1             | 0.6%  | 3.2s
      26      | testSearch    | OK     | 2           | 3     | 10kb    | 150ms | 5             | 75%   | 2.5s
      35      | testNewsAlert | -      | -           | -     | -       | -     | -             | -     | -
      35      | testSearch    | OK     | 2           | 3     | 10kb    | 150ms | 8             | 80%   | 2.0s
      Re-architecture into "Services" + Performance Fixes
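The point of this slide, comparing each build's per-use-case metrics against the previous build, can be automated. A sketch with assumptions: the metric tuples come from the slide, but the mapping of rows to builds is my reading of the flattened table, and the 2x tolerance factor is made up.

```python
# Per-use-case metric tuples (#API calls, #SQL, payload kb, CPU ms) per build.
# Values are from the slide; which build carries the regression is my reading
# of the flattened table layout.
builds = {
    17: {"testSearch": (1, 35, 5, 120)},
    25: {"testSearch": (34, 171, 104, 550)},  # regression slips in
    26: {"testSearch": (2, 3, 10, 150)},      # after the re-architecture
}

def regressions(previous, current, factor=2.0):
    """Flag use cases where any metric grew more than `factor`-fold."""
    flagged = []
    for case, new in current.items():
        old = previous.get(case)
        if old and any(n > factor * o for n, o in zip(new, old)):
            flagged.append(case)
    return flagged

print(regressions(builds[17], builds[25]))  # flags testSearch (1 -> 34 API calls)
print(regressions(builds[25], builds[26]))  # nothing flagged: metrics improved
```

The same comparison also carries Ops feedback back to Dev: the 0.5% usage column is what justifies dropping the barely-used feature in a later build.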
  33. @grabnerandi Your tool of choice: #SQL, #Threads, Bytes Sent, # Connections, WPO Metrics, Objects Allocated, ...
  34. @grabnerandi https://github.com/Dynatrace/Dynatrace-Test-Automation-Samples https://dynatrace.github.io/ufo/ Fail the build!
  35. @grabnerandi Build & Deliver Apps like the Unicorns! With a Metrics-Driven Pipeline! Dev&Test: Check-In Better Code. Test / CI: Stop Bad Builds Early. Performance: Production-Ready Checks! Validate Monitoring. Ops/Biz: Provide Usage and Resource Feedback for next Sprints.
  36. @grabnerandi 12:00 a.m. – 11:59 p.m.
  37. Questions. Slides: slideshare.net/grabnerandi Get Tools: bit.ly/dtpersonal Watch: bit.ly/dttutorials Follow Me: @grabnerandi Read More: blog.dynatrace.com Listen: http://bit.ly/pureperf Mail: andreas.grabner@dynatrace.com
  38. Andreas Grabner Dynatrace Developer Advocate @grabnerandi http://blog.dynatrace.com
