Predictive Analytics based Regression Test Optimization


By Raja Balusamy (Group Manager) and Shivakumar Balur (Senior Chief Engineer), Samsung R&D, at STeP-IN SUMMIT 2018, the 15th International Conference on Software Testing, August 30, 2018, Taj, MG Road, Bengaluru

Published in: Technology


  1. Predictive Analytics based Regression Test Optimization
     Raja Balusamy, Group Manager
     Shivakumar Balur, Senior Chief Engineer
     Samsung R&D Institute India - Bangalore
  2. Contents
     • Why Predictions Are Important
     • Predictions in Different Industries
     • Confidence Level and Data Sampling
     • Preprocessing
     • ML
     • Our Model
     • Sample Study
  3. Predictive Modeling Benefits..
     Problem:
     • Increased costs in SW testing
     • Market release delays
     • Experience
     • Operational risks
     • Customer satisfaction on quality
     Benefits:
     • Helps to predict the test suite
     • Helps to improve the test estimation
     • Helps to predict the man-days reduction
  4. Predictive Modeling Is..
     • Predictive modeling uses statistics to predict outcomes.
     • The goal is to go beyond knowing what has happened and to provide a best assessment of what will happen in the future.
     Diagram labels: Reports / Records, Predict Results, Review Predictions; Past, Current, Future; Action
  5. Predictions
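  6. Confidence Level and Sample Size
     The more samples are analyzed, the more confidence can be placed in the result: at a fixed confidence level, the margin of error narrows as the sample size grows.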
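The relationship between sample size and confidence can be illustrated with a small calculation. This is a minimal sketch, not taken from the slides: it assumes the quantity being estimated is a proportion (for example, the share of test cases that fail) and uses the standard normal approximation for the margin of error. The function name and the sample sizes are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, proportion=0.5):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    # z-score for a two-sided interval (about 1.96 at 95% confidence)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)


# Illustrative sample sizes (assumed, not from the slides)
sample_sizes = [100, 1000, 10000]
margins = [margin_of_error(n) for n in sample_sizes]

for n, m in zip(sample_sizes, margins):
    print(f"n={n:>5}: +/- {m:.3f}")
```

At the same 95% confidence level, each tenfold increase in samples shrinks the margin of error by a factor of about 3.16 (the square root of 10).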
  7. 5 Steps Processing
     1. Data Collection
     2. Data Cleaning
     3. Building Model
     4. Validation
     5. Predictions
  8. Video Clip (Source: YouTube)
  9. Your Predictions: Which movie will join the 500 crores club? (Source: Wikipedia)
  10. Machine Learning Algorithms to Predict
      • Logistic Regression: fast training time; mostly used
      • Artificial Neural Network: more accurate; training time is slow
      • Decision Tree: fast training time; pre-requisite of high memory footprint
      • Naïve Bayes Classifier: quicker than other alternatives; highly feature dependent
      • Support Vector Machine: highly accurate; pre-requisite of 100+ independent variables
      • K-Nearest Neighbors: high cost of computation; applicability in diverse training set data
      Logistic Regression was selected for our model because it is the most widely used and trains fast.
  11. Our Prediction Model || Optimized Test Suite
      Data Set: Raw Data, then Remove Errors, then Input for Training Model
      Input (x1 to x7, weighted by w1 to w7):
      • x1: Defects History
      • x2: Change List
      • x3: Module Mapping
      • x4: Code Changes
      • x5: Release Info
      • x6: # New Features
      • x7: Test cases
      Training Model: ∑ (weight based on priority or AI); Algorithm: Logistic Regression
      Output: Optimized TCs Suite
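The model slide combines seven weighted inputs and passes the sum through logistic regression. A minimal sketch of that scoring step follows. The weights, the intercept, the 0.5 threshold, the feature values, and the test case names are all hypothetical; the slides do not give them, and a real model would learn the weights from the training data.

```python
import math

# Hypothetical weights w1..w7 for inputs x1..x7 (illustrative values only)
WEIGHTS = [0.9, 0.7, 0.6, 0.5, 0.2, 0.3, 0.4]
BIAS = -1.5  # hypothetical intercept


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def selection_probability(features, weights=WEIGHTS, bias=BIAS):
    """Weighted sum of the inputs passed through the logistic function."""
    if len(features) != len(weights):
        raise ValueError("expected one feature value per weight")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


def optimized_suite(test_cases, threshold=0.5):
    """Keep the test cases whose predicted probability reaches the threshold."""
    return [
        name
        for name, features in test_cases.items()
        if selection_probability(features) >= threshold
    ]


# Hypothetical per-test-case feature vectors, each scaled to 0..1
test_cases = {
    "TC-001": [1.0, 1.0, 0.5, 0.8, 0.0, 0.0, 1.0],
    "TC-002": [0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
}
suite = optimized_suite(test_cases)
print(suite)
```

Thresholding the probability is one simple way to turn scores into a suite; ranking the test cases and taking the top N would be an equally reasonable choice.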
  12. Input || X1: Defects Analysis
      Considered:
      • Code Changed Defects / Documents
      • Defects Severity (High / Medium / Low)
      • Occurrence Frequency
      • Defects Classification (Display, Fatal, Function, Performance, Text, Usability)
      • Resolved Option (Code Changes, UI/UX)
      Summation ∑ = Weighted Average
      Note: Only defects accepted by the Development Team are considered. Invalid defects, including 3rd-party defects, are not considered.
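One way to read "Summation ∑ = Weighted Average" is as a per-module score built from the defect attributes. The sketch below is an assumption-heavy illustration: the numeric scales for severity and frequency, the frequency labels, the attribute weights, and the `accepted` flag are all hypothetical, and only two of the listed attributes are used to keep it short.

```python
# Hypothetical numeric scales and weights (not given in the slides)
SEVERITY_SCORE = {"High": 3, "Medium": 2, "Low": 1}
FREQUENCY_SCORE = {"Always": 3, "Sometimes": 2, "Once": 1}
ATTRIBUTE_WEIGHTS = {"severity": 0.6, "frequency": 0.4}


def defect_score(defect):
    """Weighted average of one defect's attribute scores."""
    values = {
        "severity": SEVERITY_SCORE[defect["severity"]],
        "frequency": FREQUENCY_SCORE[defect["frequency"]],
    }
    total_weight = sum(ATTRIBUTE_WEIGHTS.values())
    weighted = sum(ATTRIBUTE_WEIGHTS[key] * values[key] for key in values)
    return weighted / total_weight


def module_x1(defects):
    """Mean score over accepted defects; invalid defects are skipped."""
    valid = [d for d in defects if d["accepted"]]
    if not valid:
        return 0.0
    return sum(defect_score(d) for d in valid) / len(valid)


# Hypothetical defect records for one module
defects = [
    {"severity": "High", "frequency": "Always", "accepted": True},
    {"severity": "Low", "frequency": "Once", "accepted": True},
    {"severity": "High", "frequency": "Always", "accepted": False},  # invalid, ignored
]
x1 = module_x1(defects)
print(round(x1, 2))
```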
  13. Input || X2: Change List
      Information Considered:
      • [Title] Feature Change / Defects Fix / Etc..
      • [Checking Method] Steps to check
      • [Type & Feature Name] Feature Change
      • [Cause & Measure] Exception not handled
      • [Developer] Name
      • [Modules Affected] Module A and Module B
      The change list narrows down the TC group set selection, feeds the weightage evaluation, and supplies the keywords used to identify module fix info.
      Note: Change list data is taken from the Configuration Management Tool, as submitted by the Development Team.
  14. Input || X3: Module Mapping
      Sub-modules of module A (rows) against sub-modules of module C (columns); Y marks an interaction, X marks none:

           C1  C2  C3  C4
      A1   X   X   X   X
      A2   X   Y   Y   X
      A3   Y   Y   X   Y
      A4   X   X   X   Y

      Possible Scenarios:
      • No interaction: consider only A1
      • Average interaction: consider A2 + C2 + C3
      • Less interaction: consider A4 + C4
      • More interaction: consider A3 + C1 + C2 + C4
      C2 & C4 are duplicates, as these are already considered along with A2 & A4.
      Note: Module mapping is done by experts, from the class diagram, or by code coverage, to find out the interaction between modules.
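The module-mapping rule can be sketched as a lookup over the interaction matrix. This sketch assumes that Y marks an interaction and X marks none, which is consistent with the four scenarios on the slide; the function names are illustrative.

```python
# Interaction matrix from the slide: rows are sub-modules of A,
# columns are sub-modules of C. "Y" is assumed to mean "interacts".
C_SUBMODULES = ["C1", "C2", "C3", "C4"]
MAPPING = {
    "A1": "XXXX",
    "A2": "XYYX",
    "A3": "YYXY",
    "A4": "XXXY",
}


def regression_scope(changed):
    """The changed sub-module plus every C sub-module it interacts with."""
    row = MAPPING[changed]
    return [changed] + [c for c, mark in zip(C_SUBMODULES, row) if mark == "Y"]


def combined_scope(changed_list):
    """Union of scopes for several changes, without duplicates, in first-seen order."""
    seen = []
    for changed in changed_list:
        for module in regression_scope(changed):
            if module not in seen:
                seen.append(module)
    return seen


for sub_module in MAPPING:
    print(sub_module, "->", " + ".join(regression_scope(sub_module)))
```

When several sub-modules change in one release, `combined_scope` drops sub-modules that were already pulled in by an earlier change, which is the duplicate handling the slide describes for C2 and C4.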
  15. Input || X4: Code Changes
      Lines of Code:
      • Added
      • Deleted
      • Modified
      Note: An SLOC tool is used for calculating KLOC changes.
      Input || X5: Release Info
      Release Information:
      • Release version
      • Type of release (Sanity / Full / Patch)
      Note: Release information and type are taken from internal tools.
  16. Input || X6: New Features
      • No. of new features implemented
      • Impact of new features
      Note: The PM and Development Team share the new features.
      Input || X7: Test Cases
      • Module Name
      • Sub Module Name
      • Priority
      • Title
      • Steps to Execute
      • Failed Test cases
      Note: Test case data is prepared by the Test Team.
  17. Sample Study
      Input:
      • 3536 Defects
      • 2932 Change List
      • 43 Modules
      • 1245 KLOC Changes
      • 32 Releases
      • 5 New Features
      • 6300 Test cases

      Predicted Test Suite, Regression Test Cycle (4200 Test cases):

      Release Cycle   Manual   Prediction Model
      A1              6300     2654
      A2              6300     2250
      A3              6300     1973
      A4              6300     1275
      A5              6300     960

      Defects Identified:

      Release Cycle   Prediction Model   Non-Prediction   % Accuracy
      A1              415                353              54%
      A2              365                215              63%
      A3              300                130              70%
      A4              199                30               87%
      A5              144                20               89%

      Non-Predicted Defects Category:
      • Defects not caused by source code changes are not predicted by the above model, e.g. document-related defects (requirements documentation, UI) and 3rd-party defects. Examples: text cut, document mismatch, dependency on a 3rd party.
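The sample-study figures can be checked with a few lines of arithmetic. This sketch assumes that the two numbers under "Defects Identified" are the defects found by the prediction model and the defects it did not predict, and that "% Accuracy" is the first divided by their sum. That reading is not stated on the slide, but it reproduces the reported percentages to within one point. The suite reduction percentage is a derived figure, not one reported in the deck.

```python
MANUAL_SUITE = 6300  # test cases executed per cycle in the manual approach

# Release cycle -> size of the predicted test suite (from the slide)
predicted_suite = {"A1": 2654, "A2": 2250, "A3": 1973, "A4": 1275, "A5": 960}

# Release cycle -> (defects found by the model, defects not predicted)
defects = {
    "A1": (415, 353),
    "A2": (365, 215),
    "A3": (300, 130),
    "A4": (199, 30),
    "A5": (144, 20),
}


def suite_reduction_pct(predicted, manual=MANUAL_SUITE):
    """Share of the manual suite that the predicted suite avoids running."""
    return 100.0 * (manual - predicted) / manual


def accuracy_pct(predicted_defects, non_predicted_defects):
    """Assumed definition: predicted defects over all defects identified."""
    return 100.0 * predicted_defects / (predicted_defects + non_predicted_defects)


for cycle in predicted_suite:
    found, missed = defects[cycle]
    print(
        f"{cycle}: suite reduced by {suite_reduction_pct(predicted_suite[cycle]):.1f}%, "
        f"accuracy {accuracy_pct(found, missed):.1f}%"
    )
```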
  18. Prediction || Advantages
      • Defect Prediction
      • Test Estimation
      • Man-days Reduction
      • Optimal Test Suite
  19. Limitations..
      • History cannot always predict the future with 100% accuracy.
      • The issue of unknown unknowns.
      • Self-defeat of an algorithm.
      Challenges..
      • Defects resolution comments from Developer
      • Module Mapping
      • New Features TC Optimization (LOC, etc.)..
  20. Thanks to… & Participants
  21. Thank You!!!
  22. Algorithms
  23. Error Metrics (To Predict More Accurately)
      Type: # Identification of TCs

                                     Condition positive              Condition negative
      Predicted condition positive   True positive, Power            False positive, Type I error
      Predicted condition negative   False negative, Type II error   True negative

      Training loop: inputs X1 to Xn, weighted by W1 to Wn, are summed (∑input) and passed through the Activation Function (AF) to produce the Output. Error? If yes, re-train.
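The error-metrics table is a standard confusion matrix, and its four cells can be counted directly from predictions. In this sketch, "condition positive" is assumed to mean a test case that really does expose a defect, and "predicted positive" a test case the model selected. The sample data and function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count the four confusion-matrix cells for paired boolean outcomes."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1  # selected, and it does expose a defect
        elif p and not a:
            fp += 1  # selected, but exposes nothing (Type I error)
        elif not p and a:
            fn += 1  # skipped, but would have exposed a defect (Type II error)
        else:
            tn += 1  # skipped, and exposes nothing
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def error_rates(counts):
    """Power and error rates; assumes both classes occur in the data."""
    positives = counts["TP"] + counts["FN"]
    negatives = counts["FP"] + counts["TN"]
    return {
        "power": counts["TP"] / positives,
        "type_i_rate": counts["FP"] / negatives,
        "type_ii_rate": counts["FN"] / positives,
    }


# Hypothetical outcomes for eight test cases
actual = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, True, False, False, False, True]

counts = confusion_counts(actual, predicted)
rates = error_rates(counts)
print(counts)
print(rates)
```

For test suite optimization, the Type II rate is usually the one to watch most closely, since a false negative is a defect-revealing test case that was left out of the suite.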
  24. References & Appendix
      testing/