Releasing fast code - The DevOps approach


Published on

Agile makes you Develop faster, DevOps also makes you Deploy faster but how do you make your Application faster?
Many currently used Performance Management practices don’t work anymore as they are too time consuming. It takes a new approach to track performance in Continuous Integration, get more value out of Load Testing and leverage production data for performance optimization.
We will show you real world examples on how the new DevOps approach can work.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • Agile developmentallowstoreact on changedrequirements (technicalandbusiness) fasterReducestheriskofworkingtoolong in thewrongdirectionProvides a „Potential ReleasableProduct“ after every Sprint/IterationIncludesTesting Practices intothe Development LifecycleAutomatesmanytasks such asBuild, FunctionalTestingandprogressreporting (Burn-Down Chart, …)In short, betterquality, lessrisk, fasterreaction time
  • Let‘slookathow a typical Sprint/Iteration works …
  • Atevery time weknowwherewe stand in termsoffunctionality – until -
  • Wegettotestingorproductionwherewe find out thatweeitherhavetogo back and fix problemsidentifiedunderloadorwhenwehavemajorissues in a prodenvironment
  • The resultisthatwe will miss our Goal andwehadwrongestimates
  • LoadTestingandProduction. Often a separate organization.Firstlythismeanswe do not leveragetheadvantageof agile fully. We do not deliverthespeedofchangetothebusinessandthustothecustomer.In a normal shelfproductthismightbefine, ifyouoperate a SAAS oreComercesiteitreducesyourabilitytocompete.Apart fromthatthereare also severalproblemswiththehardsplitbetweenthetwosides. Twooftenmore separate organizationsleadto…
  • Functionality vs. StabilitySeparate teams, different mindset, different incentives. Operationsispaidtokeepthignsstableandrunningwhiledevelopmentistoaddnewfeatures. The funny thingis, thatoperationsisdrivenbycustomerdemands but does not wanttoaddbusinessfunctionalitytobetercompete.Dev on theothersidewantstoaddnewfunctionality all the time, but they lack theexperiencewiththecustomer. They lack the real worldproblems.
  • Distributed Teams makeithardtocoordinateMakescommunication all thathearder. Timezones, langugae different culture in additionto an already different mindset
  • Even worse, ifwewanttocommunicatesomething, thetechnologyis not compatible. Different monitoring, analysistools
  • All ofthisleadsto …I am done – nowyoudeployit
  • Not a surprise – isit?
  • holistic approach to software deliveryPromoteColocationofteamsinsteadofspreadingthemaroundtheworldimaginetheguyrunninganddeployingyourappsitsnexttoyou. Thatwouldmake a lotstuffeasierOneToolsetReduces a lotofoverheadconvertingdata. Reducesmaintaince time. Andyouwouldmakesuretoreuseconfigurationofsaymonitoringfromautomatictest in prod, andfromautomaticbuildandtesttodeploythestufffor real.Agile DevlieryandDeploymentDeliverythe agile promisedirectlytothebusinessandcustomer. React in dayswithnewfeatures. Therearebigonesthat do that. (google, facebooktonametwo)ONE TEAM thatisresponsiblefor a Product (FromDevtoOps)Ifthereis a problem,theguyresponsibleisyou
  • Solution ProposalsCenteredaroundOneToolsetwehave different continuoustasks
  • Automatic, Detect Change – reuseyourexisting Unit TestsBE AWARE thatweare not necesarilymeasuring Performance/Execution Time here -> doesntmake sense forshortrunning Unit Tests. On thosewefocus on theinternals, e.g.: Database Statements andseehowtheychangeover time (regression)Perfearly, insidethecycle. This wayitispartofthesprint not outside. Itisautomted so itdoes not introduce additional effort.
  • Besidesfocusing on theinternaldynamicsofthetestedcomponentswecanruncertainsetsof real performancetests in CIThis exampleshowshowdynaTraceinternallyrunsperformancetests on criticalfeature on everybuildExplaintheidentifiedproblems …
  • In ordertorunperformancetests in CI itisimporttofollowsomegroundrulesUnit testsneedtohave a certainexecution time lengthMakesurethatyouhave a stableenvironment -> tip: runseveralwarm-upteststoensureenvironmentisstable -> thenrun real test
  • In Java/.NET itisimportanttoexclude volatile externalfactors such asGarbageCollection Time
  • Themoreinterestingusecase in CI isArchitecture Validation wherewehave a verycloselookatwhatisgoing on withinthetestedcomponents.Byanalyzingthedynamiccodeexecutionwecanvalidatecertainrulesthataredefinedbythe Software Architects, e.g.: Onlymakeasycnhronousremotingcalls, only X DB Calls per transaction, …Havingthetools in placeallowyoutoautomaticallyverifytheserulesagainstyourunitorfunctionaltests
  • Weseemoreandmoreexternalcodebeingused – whichis a goodthing – but canhave a bigimpact on performanceifwe do not understandtheinternals
  • Toomucheffortforscriptmaintenance, settingupthetestenvironmentCostsfortestingtoolsandtestenvironmentTest cycletakeslongerthanreserved in thesprint. Feedback todevscomeswhentheyarealready in thenextsprint
  • Sample Data: smalldatabaseswith sample datais not realistic. Canttestcaching, access time todatabase, …OverloadedAgents: LoadTestingAgentsare just usedtoproduceloadwithoutverifyingiftheycanproducetherequestedloadWrong Think Times: Think Times areeither not setcorrectlytoreflect real usersortheyareavoidedtoincreaseload -> thats not realisticeitherInfrastructure Issues: doesthe Test Infrastructure provideenoughbandwidth, resources, … to handle theload? Isthereanythingelserunning on theinfrastructurethatmayinterfere?No Error Detection: Missing Content Verificationleadstoincorrectresults. A Page thatresponds super fast but alwaysyresponds „pleasecome back later“ is a problemNo Server Side Data: veryoftenpeopleforgettocollectserversidemetrics such as CPU, Network, I/O, Logfiles, …
  • classical loadtestingtest environment that tries to be as close to production as possible (size/users/data vs. reality)test scripts with transactions that we expect (but we know we shouldexpecttheunexpected. Never happens)TRIES TO REBUILD TO REAL WORLD -> HUGE EFFORT!!Result: no real guarantee that what has been tested really works in prodProblems:Testing on not frozen Code Base -> Results of Testing are available not before mid or end of NEXT Sprint -> Results are outdated!HArd to get real life Production Load BehaviorMore changes in application lead to more effort in testing (script maintenance, regression analysis, ...)Size ofenvironmentNumberof UsersTransaction Mix
  • Rerun, additional dataAdding log statemetnsandstatistics in code, errorprone.First, automatethestuff.Second, captureenoughfromthebeginning. (traces, BTM)Dynamic Capture all the time.Overhead discussion 5-10Benefits of Agile and DevOpsShorter iterations with fewer changes leads to reduced effort in testingfewer script maintenance changes, higher quality smoke test, easier regression analysis,more focused testing on the areas that have changed
  • Makesurewecaptureenoughinfrastructuredata – not just responsetimes. This helpsustoget a quick overviewoftheproblematicareas
  • In ordertoidentifyproblematiccomponentsorservicesitsrecommendedtorun different loadandseewhichcomponentsscaleandwhichdont
  • You do not onlyrunoneloadtest in a devcycle. Yourunmanyandalwayswanttoseewhathaschanged.Itisthereforeimportanttocomparetestresults. Critical hereisthatwehaveteststhatwecancompareandthatwehavedatathatiscomparable
  • Inthe end itisthedeveloperthatneedstolookattheproblem. The more granular transactionaldatawecanprovidetheeasiertoanalyzeand fix theproblem.ProfilingandTracing Solutions becamevery powerful andallowdeeptracing also duringload.
  • Real environment, real validityLesscost, lessmaintaince,lesseffort.Yoursystem will bemoststableandyou will learnperformancemanagement.Test during off-hours/off-daysorduringmaintainencecyclesOff peaktimesnewversiontestloadtestingPart ofproductionwithnewversionShadow production, trafficduplication
  • During off hourswetakesomeoftheproductionmachines out ofthecluster, installthetestversion on it, configuretheloadbalancerandrunloadagainstitusing a loadtestingtoolThis allowsustotest on real hardware
  • This is not alwayspossible – depending on thechanges in thesoftware. This howeverisgreattotestpatches on howtheybehave. Inboundtrafficisduplicated. Wecanverifyifweseeanyfunctionalproblemsaswellashowtheperformancecomparestothe „current“ system
  • In thiscasewe update a setofmachineswith a newversion/patch. The loadbalancerisconfiguredto route certainuserstothenewmachinestotest out howthisworks.
  • Testing real numberofusers, with real loadand on yourproductionwith real contentSimulatenumberofusersasyoucouldn‘tyourselfReal end userperformanceIfyouknowwhereyouruserbaseis, testfromthereSee localdifferencesAt a fractionofthecost, ist on demand.Letthemmaintaintheirscripts, concentrate on yourproductLeverageserviceslikekeynoteandgomes. Simulate real userload, and not like in loadtesting just transactionload.provisioning in the cloud -> dynamic resources -> cater for millions of users
  • Knowwhat‘sreallygoing onReal problemsNoreproductionOptimzebased on what‘s realNoguess – just factsDirectfeedbacktodevUseProductiontoanalyzetheperformance. Reactto real problems. Noreproduction.Tune andoptimzebased on real worldtransactionanduserload.We solve the "Black Box View" of the test departmentWe dont know the insight of the app -> so what shall we look for?We dont know how the real prod evnrionment looks like -> so how shall we know if it works in prod or notAPM in Produktion --> fast feedback to dev (same team anyway?), agile cycles allow formsquick fixesless change, smaller errors, faster solution
  • The real environmentisalways different thanwhatyouhadtested. Thereforeitisimportanttoanalyzethesystemandseehowtransactionsflowandwheretheymighthave a problemThis allowsyoutofocus on thoseservices/componentsthathave a problem in real life
  • Analyzingthebusinessimpactofyourappsperformanceisimportanttoprioritizeyourtroubleshootingactivities
  • Kreis mit pfeilen, mitte ein tool, communicationbarrier, objectivness, lessmaintaincePerformance analysisPerformance testingPerformance monitoringvalidation
  • Agile + DevOps + CAPMChanged Testing Mindest -> WE ARE NEVER DONE -> WE CONSTANTLY IMPROVE
  • Releasing fast code - The DevOps approach

    1. 1. Releasing Fast CodeThe DevOps Approach to Performance
    2. 2. Why Agile Development took off
    3. 3. It‘s Sprint TimeStory Points Developme Testing Estimate nt Sprint Timeline Production Remaining Team Velocity
    4. 4. You are in controlStory Points Developme Testing Estimate nt Sprint Timeline Production Remaining Team Velocity
    5. 5. What happened?Story Points Developme Testing Estimate nt Sprint Timeline Production Remaining Team Velocity
    6. 6. Missed Goals and EstimatesStory Points Estimates Missed Developme Testing Estimate nt Production Missed Remaining Goal Team Velocity
    7. 7. What is the Status Quo beyond Dev?Source:
    8. 8. Problem #1: Different MindsetSource:
    9. 9. Problem #2: Dislocated TeamsSource:
    10. 10. Problem #2: Different ToolsSource:
    11. 11. Problem #4: Over the Fence AttitudeSource:
    12. 12. These Problems lead to …Source:
    13. 13. A potential SolutionSource:
    14. 14. Real World Perf Test in Feedback CI ONECloud based Toolset Architecture ValidationTesting Test in Production Traditional Load Testing
    15. 15. Performance in Continuous Integration
    16. 16. Identify Performance Regressions Early
    17. 17. Tip: Stable Environment
    18. 18. Tip: Exclude Garbage Collection
    19. 19. Architecture Validation Analyze your existing Unit Tests Validate against your Arch Rules
    20. 20. Learn how foreign code works• Get insight into whats going on „under the hood“ • Understand internals of the frameworks you use • Choose the right usage for your use case
    21. 21. Traditional Load Testingtakes too long in an agile process time Sprint 1 Starts with Dev and dedicated Time for Test at the End Update Test Setup Run Tests Scripts Environment Adapt Scripts Re Run Tests Feedback to Dev Testing extends sprints Consumes too many Dev Resources Sprint 2 Re Run Tests Feedback to Dev Ready for Ops Either postponed or impacted because Dev Resources are still occopuied with testing of previous sprint
    22. 22. Common MistakesOnly use Sample Data No Server Side DataWrong Think Times Overloaded Load AgentsNo Error Detection Infrastructure Issues
    23. 23. Load Testing needs to be „real“ Real Size and Content Real Third Party CodeReal Infrastructure Setup No Guarantees! Real Load Real Number of Users
    24. 24. Minimize and automate real Load TestsDeveloping Test Run Reproduction Refine Capturing Re Run Tests Reproduction Refine Capturing Multiple Test Iterations needed to analyze Root-cause Re Run Tests Reproduction Problem Analysis Problem Solving timeDeveloping Test Run Reproduction Refine Capturing Re Run Tests Reproduction Refine Capturing •Eliminates Test Iterations •Go directly to problem analysis •Frees up resources for other proje Re Run Tests Reproduction Problem Analysis Problem Solving time
    25. 25. Tip: Measure more than Response Time
    26. 26. Tip: Identify Scalabilty Problems Response time Presentation Layer Presentation Layer Business Logic Presentation Layer Business Logic Business Logic Web Service/Remoting Web Service/Remoting Web Service/Remoting Data Access Data Access Data Access Minimum Load Medium Load Maximum Load
    27. 27. Incresing Load Test is your friend
    28. 28. Tip: Identify Regressions on Code Level
    29. 29. Tip: Get as granular data as possible
    30. 30. Test in your Production Environment Real Environment Real Validity Less Cost
    31. 31. Scenario 1: Use Production Hardware Real Users Production Environment Virtual Users LB/FW Web Server App Server DB Route Incoming Use some machines Traffic during off hours for testing
    32. 32. Scenario 2: Duplicate Traffic Real Users Production Environment LB/FW Web Server App Server DB Send incoming traffic to Prod and Testing Machines
    33. 33. Scenario 3: Partial Update Real Users Production Environment LB/FW Web Server App Server DB Update some machines to the version that should be tested
    34. 34. Use Cloud based Testing Services Real End User TimeCDN Proxies High ControlVolume Costs Last Mile Load Balancers
    35. 35. APM in Production
    36. 36. Understand your production environment
    37. 37. Focus on areas with biggest impact160140120100 80 Number of Sales 60 No of Users 40 20 0 Response Time Transaction Load
    38. 38. ONE set of Tools Communication Collaboration
    40. 40. Michael Koppmichael.kopp@dynaTrace.com