Andreas Grabner - Performance as Code, Let's Make It a Standard

  1. Performance as Code - Let’s Make it a Standard! Andreas Grabner
  2. It started with “Performance Signature”
  3. The idea came from Thomas Steinmaurer: a “Performance Signature” for the Nov 16 build, for the Nov 17 build, and eventually for every build. Multiple metrics are compared to the previous timeframe, giving simple regression detection per metric.
  4. used a similar approach for Jenkins
  5. created the Jenkins Performance Signature Plugin. A Performance Signature Definition maps each timeseries evaluation to a build result:
     • 98th Percentile Response Time: > 1.8s Build FAILED, > 1.5s Build WARNING, < 1.5s Build GOOD
     • Throughput (Request Count): < 100 Build FAILED, < 250 Build WARNING, > 250 Build GOOD
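A minimal sketch of how such a signature evaluation could work, assuming a simple dict-based definition; the metric names and structure are illustrative, not the plugin's actual format:

```python
# Hypothetical signature for the two metrics above. "direction" says
# whether higher values are worse ("upper") or lower values are ("lower").
SIGNATURE = {
    "response_time_p98_s": {"direction": "upper", "warning": 1.5, "failure": 1.8},
    "throughput_requests": {"direction": "lower", "warning": 250, "failure": 100},
}

def evaluate(measured: dict) -> str:
    """Return the worst build result across all signature metrics."""
    result = "GOOD"
    for metric, rule in SIGNATURE.items():
        value = measured[metric]
        if rule["direction"] == "upper":
            failed, warned = value > rule["failure"], value > rule["warning"]
        else:
            failed, warned = value < rule["failure"], value < rule["warning"]
        if failed:
            return "FAILED"
        if warned:
            result = "WARNING"
    return result

print(evaluate({"response_time_p98_s": 1.6, "throughput_requests": 300}))  # WARNING
```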
  6. Here is how the plugin looks in Jenkins: aggregated per build, build-to-build trends, and full details per build.
  7. I also baked the concept into the “Unbreakable Pipeline” [diagram: a CI/CD flow through Staging and Production with Push Context, Auto-Quality Gate, Auto-Validate, and Auto-Remediate! steps, comparing Build #17 and Build #18]
  8. Netflix & Google introducedKayenta • https://cloud.google.com/blog/products/gcp/introducing-kayenta-an-open- automated-canary-analysis-tool-from-google-and-netflix • Automated Scoring of Canary Deployments
  9. Additional ideas from a workshop with Henrik Rexed (Neotys):
     • Create tests based on monitoring data
     • Push load-test thresholds to monitoring for alerting
     • Pull monitoring data into test results
  10. Let’s combine forces, define use cases, and create a standard • Kayenta
  11. Use Cases for Performance as Code
  12. Use Case #1: Performance Feedback to Developers
      • Current feedback loops: reactive
        • Often driven by performance teams that execute the tests
        • Results are shared with engineering only in case of issues
        • Performance plugins in CI/CD cover load-testing data only
      • Proposal: Performance Signature as Code (see the sketch below)
        • Automated feedback across full-stack metrics
        • Let the engineers define what information they are interested in: load testing, platform (k8s, CF, OpenShift, …), runtimes (JVM, Node, …), dependencies, …
        • Version-control it next to their source code
        • Pull this data with every commit/build and push it to developers (Jenkins, Slack, IDE, …)
      • Requirement: the metrics definition must be tool-independent!
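As an illustration of a tool-independent, version-controlled signature, here is a hypothetical perfsignature.json written from Python; the metric sources, names, and feedback targets are assumptions, not an existing standard:

```python
import json

# Hypothetical signature an engineer could keep next to the source code.
signature = {
    "service": "carts",
    "metrics": [
        # load-testing metric
        {"source": "loadtest", "id": "response_time_p98", "unit": "s",
         "upper_warning": 1.5, "upper_severe": 1.8},
        # platform metric (k8s)
        {"source": "platform", "id": "pod_restarts", "upper_severe": 0},
        # runtime metric (JVM)
        {"source": "runtime", "id": "jvm_gc_time_pct", "unit": "%", "upper_warning": 5},
    ],
    # where feedback should be pushed on every commit/build
    "feedback": ["jenkins", "slack:#carts-perf"],
}

with open("perfsignature.json", "w") as f:
    json.dump(signature, f, indent=2)
```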
  13. Use Case #2: Deployment / Canary Validation
      • Current validation: reactive
        • Looking at monitoring dashboards
        • Manual process to validate a quality gate or a production deployment
      • Proposal: Deployment Score Definition as Code (see the sketch below)
        • Validate against threshold and baseline definitions
        • Define weights on key metrics
        • Automate the validation of metrics and calculate a deployment score
        • Can be triggered as a quality gate in CI/CD or evaluated continuously for production deployments
      • Requirement: metrics definition for service endpoints
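A minimal sketch of the weighted deployment score, assuming each metric has already been normalized to a 0-100 value against its threshold or baseline; names and weights are illustrative:

```python
def deployment_score(metrics: list[dict]) -> float:
    """Weighted average of per-metric scores (each already 0-100)."""
    total_weight = sum(m["weight"] for m in metrics)
    return sum(m["score"] * m["weight"] for m in metrics) / total_weight

score = deployment_score([
    {"name": "response_time_p98", "score": 90,  "weight": 3},  # key metric, weighted higher
    {"name": "failure_rate",      "score": 100, "weight": 3},
    {"name": "cpu_usage",         "score": 80,  "weight": 1},
])
print(round(score))  # 93 -> passes a quality gate that requires, say, >= 90
```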
  14. Example for #1 & #2: Performance Signature (Monitoring). Dev team: “Hey Davis, how does deployment Carts17 look?” Davis: “Your build score is 95! Here are the key charts for your performance signature! Want to compare it with a previous build?” [slide shows: Metric Definition → Signature (Thresholds) → Deployment Score]
  15. Example for #1 & #2: Performance Signature (Testing) [slide shows: Endpoint Definition → Metric Definition → Signature (Thresholds) and Signature (Baseline)]
  16. Use Case #3: Test Generation
      • Current: test generation is separate from the code-generation process
        • Testers record or write test cases after syncing with engineers on testable APIs
        • Testers record against an available build of the service/app
        • Testers must update tests whenever code changes alter the endpoints or behavior
      • Proposal: Endpoint & Workload Definitions as Code (see the sketch after the example below)
        • Engineers define endpoints, expected workload, and behavior in code
        • Tests can be generated automatically from that definition
        • Changes to the definition can automatically update the tests as part of commits/pull requests
  17. Example for #3: Endpoint Definition + Behavior (Workload) → Generate → Test Definition
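A sketch of the generation step, assuming a simple endpoint/workload definition and emitting a Locust test skeleton; both the definition format and the generated code are illustrative:

```python
# Hypothetical engineer-authored definition: endpoints plus expected workload mix.
ENDPOINTS = [
    {"method": "GET",  "path": "/carts/{id}", "weight": 8},
    {"method": "POST", "path": "/carts",      "weight": 1},
]

def generate_locustfile(endpoints: list[dict]) -> str:
    """Emit a Locust test skeleton; templated paths like {id} would still
    need parameterization in a real generator."""
    tasks = []
    for ep in endpoints:
        name = ep["method"].lower() + ep["path"].replace("/", "_").replace("{", "").replace("}", "")
        tasks.append(
            f"    @task({ep['weight']})\n"
            f"    def {name}(self):\n"
            f"        self.client.{ep['method'].lower()}(\"{ep['path']}\")\n"
        )
    return (
        "from locust import HttpUser, task\n\n"
        "class GeneratedUser(HttpUser):\n"
        + "\n".join(tasks)
    )

print(generate_locustfile(ENDPOINTS))
```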
  18. Use Case #4: Monitoring Alerting Definition
      • Current: dynamic baselining or manual threshold configuration
        • Dynamic baselining needs time to learn
        • Dynamic baselining is not aware of “expected” performance changes and may produce initial false positives
        • Maintaining manual thresholds doesn’t scale as we move towards continuous deployment
      • Proposal: Baseline Definitions as Code (see the sketch after the example below)
        • Could be taken from load-test workload definitions
        • Could be generated from actual, accepted load-testing results
        • The definition becomes part of the deployment artifact
        • The monitoring tool auto-adjusts its baseline settings based on accepted pre-prod values
  19. Example for #4: Monitoring Alerting Definition. A command-line step generates alerting rules and configuration from the metric definition, e.g.:
      > genspec abcd123.live.dt.com org1000-app2000 mymonspecfile.json
      [slide shows: Metric Definition → Generate Rules / Generate Config → Signature (DT specifics)]
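What such a generation step could do, sketched in Python: derive alert thresholds from accepted pre-prod results and write them as a config the monitoring tool can pick up. The file names, the 120% margin, and the rule format are assumptions, not the actual genspec behavior:

```python
import json

# Baseline values as measured in an accepted pre-prod test run.
accepted_results = {"response_time_p98_s": 1.2, "failure_rate_pct": 0.5}

# Derive alert thresholds, e.g. alert at 120% of the accepted baseline.
rules = [
    {"metric": metric, "alert_above": round(value * 1.2, 3)}
    for metric, value in accepted_results.items()
]

# Ship the rules with the deployment artifact for the monitoring tool.
with open("alerting_rules.json", "w") as f:
    json.dump({"service": "carts", "rules": rules}, f, indent=2)
```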
  20. Use Case #5: Auto-Remediation Definition
      • Current: operations starts to auto-remediate issues seen in production
        • Reactive approach: potential new problems are only automated after they have impacted production at least once
        • For cloud-native apps, we see more remediation happening in the app layer (switching feature flags, canaries, …), which requires app-specific remediation actions
      • Proposal: app- & metric-specific remediation actions as code (see the sketch after the example below)
        • IF/THEN conditions on metric or baseline violations
        • A list of proposed remediation actions
  21. Example for #5: Auto-Remediation Definition [slide shows a flow triggered by a Push Deployment Event]
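A sketch of how IF/THEN remediation rules could be declared and looked up on a metric violation; the action names and rule format are hypothetical:

```python
# Hypothetical app-specific remediation rules, versioned next to the code.
REMEDIATIONS = [
    {"if": {"metric": "failure_rate_pct", "above": 5.0},
     "then": ["disable_feature_flag:new-checkout", "notify:#carts-ops"]},
    {"if": {"metric": "response_time_p98_s", "above": 1.8},
     "then": ["rollback_canary"]},
]

def remediation_actions(metric: str, value: float) -> list[str]:
    """Return the declared actions for every violated condition."""
    actions = []
    for rule in REMEDIATIONS:
        cond = rule["if"]
        if cond["metric"] == metric and value > cond["above"]:
            actions.extend(rule["then"])
    return actions

print(remediation_actions("failure_rate_pct", 7.2))
# ['disable_feature_flag:new-checkout', 'notify:#carts-ops']
```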
  22. Use Case #6: Pre-Deploy Environment Checks
      • Current: many test results are invalid due to environment issues
        • Dependent services not ready
        • Other tests running and impacting results
        • Configuration mistakes, e.g. pointing to the wrong backend database
      • Proposal: Expected Environment Definition as Code (see the sketch below)
        • A list of metrics that validate the status of the “desired state”!
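A minimal sketch of a pre-deploy check against a "desired state" list, here using plain HTTP health probes for dependent services; the URLs and check types are illustrative:

```python
import urllib.request

# Hypothetical desired-state definition: dependent services must answer
# their health endpoints before any test is allowed to start.
DESIRED_STATE = [
    {"check": "http", "url": "http://orders/health"},
    {"check": "http", "url": "http://payments/health"},
]

def environment_ready(checks: list[dict]) -> bool:
    for check in checks:
        try:
            with urllib.request.urlopen(check["url"], timeout=3) as resp:
                if resp.status != 200:
                    return False
        except OSError:  # connection refused, timeout, HTTP error, ...
            return False
    return True

if not environment_ready(DESIRED_STATE):
    raise SystemExit("environment not in desired state - aborting test run")
```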
  23. Use Case #7: Event-Driven Performance Pipelines
      • Current: test results are analyzed at the end of a run, not during it
        • Slows down pipeline executions
        • Unnecessarily uses resources on tests that we already know will fail
      • Proposal: continuously evaluate the deployment score during a test (see the sketch after the example below)
        • Create events on each current violation
        • The testing tool can consume the event and stop/fail the test early
        • The pipeline can consume events and stop/fail the pipeline early
        • Events can be pushed to Slack to proactively alert engineers about violations
  24. Example for #7: Event-Driven Performance Pipelines [diagram: DEV → TEST → PROD; each stage builds, deploys, and pushes the Monspec; a Watchdog/Operator consumes deploy events (ORG-APP-SERVICE), defines alerts, emits problem notifications/events, and can abort the build based on the build score]
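A sketch of such a watchdog loop: it re-evaluates the deployment score while the test runs and emits a violation event that aborts early. fetch_score() is a placeholder for the monitoring-API call, and the threshold and interval are assumptions:

```python
import time

SCORE_THRESHOLD = 90   # abort once the score drops below this
CHECK_INTERVAL_S = 60  # evaluate once per minute instead of only at the end

def fetch_score() -> float:
    """Placeholder: pull current metrics and compute the deployment score."""
    raise NotImplementedError

def watchdog(test_duration_s: int) -> None:
    deadline = time.time() + test_duration_s
    while time.time() < deadline:
        score = fetch_score()
        if score < SCORE_THRESHOLD:
            # Emit an event the testing tool / pipeline / Slack can consume,
            # then stop the run early instead of burning the full duration.
            print(f"VIOLATION: score {score} < {SCORE_THRESHOLD}, aborting test")
            raise SystemExit(1)
        time.sleep(CHECK_INTERVAL_S)
```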
  25. Where does “MonSpec/PerfSpec” live?
      • #1: Lives in source control and is pushed to the individual tools
      • #2: Part of the service deployment and accessible through an endpoint (see the sketch below)
        • Similar to health-probe endpoints in k8s
        • Every tool can query it from the live service instance, e.g. http://myservice/monspec
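Option #2 could look like a health probe, sketched here with Python's standard http.server; the /monspec path follows the slide's example, while the port and file name are assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class MonspecHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/monspec":
            # Serve the signature shipped with the deployment artifact.
            with open("perfsignature.json", "rb") as f:
                body = f.read()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

# Any tool can now GET http://<service>:8080/monspec from the live instance.
HTTPServer(("", 8080), MonspecHandler).serve_forever()
```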
  26. Discussion
      • What use cases have we missed?
      • Who is already doing something similar?
      • Who else do we need to talk to / include in this conversation?
  27. Thank you