
Bad Testing Metrics—and What To Do About Them



  1. Bad Metrics and What You Can Do About It
     Paul Holland, Test Consultant and Teacher at Testing Thoughts
  2. My Background
     • Independent S/W testing consultant since April 2012
     • 16+ years testing telecommunications equipment and reworking test methodologies at Alcatel-Lucent
     • 10+ years as a test manager
     • Presenter at STAREast and CAST
     • Keynote speaker at the KWSQA conference in 2012
     • Facilitator at 25+ peer conferences and workshops
     • Teacher of S/W testing for the past 5 years
     • Teacher of Rapid Software Testing, through Satisfice (James Bach): www.satisfice.com
     • Military helicopter pilot: Canadian Sea Kings
  3. Attributions
     • Over the past 10 years I have spoken with many people regarding metrics. I cannot directly attribute any specific aspect of this talk to any one individual, but all of these people (and more) have influenced my opinions and thoughts on metrics:
       – Cem Kaner, James Bach, Michael Bolton, Ross Collard, Doug Hoffman, Scott Barber, John Hazel, Eric Proegler, Dan Downing, Greg McNelly, Ben Yaroch
  4. Definitions of METRIC (from http://www.merriam-webster.com, April 2012)
     • 1 (plural): a part of prosody that deals with metrical structure
     • 2: a standard of measurement <no metric exists that can be applied directly to happiness — Scientific Monthly>
     • 3: a mathematical function that associates a real nonnegative number analogous to distance with each pair of elements in a set, such that the number is zero only if the two elements are identical, the number is the same regardless of the order in which the two elements are taken, and the number associated with one pair of elements plus that associated with one member of the pair and a third element is equal to or greater than the number associated with the other member of the pair and the third element
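Definition 3 is the standard mathematical notion of a distance function. Restated in conventional notation (my formatting; the axioms themselves are exactly the ones spelled out in the dictionary entry), a metric d on a set X satisfies:

    \[
    d : X \times X \to \mathbb{R}_{\ge 0}, \qquad \text{for all } x, y, z \in X:
    \]
    \[
    \begin{aligned}
    d(x, y) &= 0 \iff x = y               && \text{(zero only for identical elements)} \\
    d(x, y) &= d(y, x)                    && \text{(order does not matter)} \\
    d(x, z) &\le d(x, y) + d(y, z)        && \text{(triangle inequality)}
    \end{aligned}
    \]

Almost nothing we call a "test metric" satisfies (or needs to satisfy) this definition; presumably sense 2, "a standard of measurement", is the one in play for the rest of the talk.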
  5. Sample Metrics
     • Number of Test Cases Planned (per release or feature)
     • Number of Test Cases Executed vs. Plan
     • Number of Bugs Found per Tester
     • Number of Bugs Found per Feature
     • Number of Bugs Found in the Field
     • Number of Open Bugs
     • Lab Equipment Usage
  6. Sample Metrics
     • Hours between Crashes in the Field
     • Percentage Behind Plan
     • Percentage of Automated Test Cases
     • Percentage of Tests Passed vs. Failed (pass rate)
     • Number of Test Steps
     • Code Coverage / Path Coverage
     • Requirements Coverage
  7. Goodhart's Law
     • In 1975, Charles Goodhart, a former advisor to the Bank of England and Emeritus Professor at the London School of Economics, stated: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."
       – Goodhart, C.A.E. (1975a) 'Problems of Monetary Management: The UK Experience', in Papers in Monetary Economics, Volume I, Reserve Bank of Australia, 1975
  8. Goodhart's Law
     • Professor Marilyn Strathern FBA has re-stated Goodhart's Law more succinctly and more generally: "When a measure becomes a target, it ceases to be a good measure."
  9. Elements of Bad Metrics
     • Measure and/or compare elements that are inconsistent in size or composition
       – Impossible to use effectively for comparison
       – How many containers do you need for your possessions?
       – Test cases and test steps
         • Vary greatly in time required and complexity
       – Bugs
         • Can differ in severity and likelihood, i.e. in risk
  10. Elements of Bad Metrics
     • Create competition between individuals and/or teams
       – Typically does not result in friendly competition
       – Inhibits sharing of information and teamwork
       – Especially damaging if compensation is impacted
       – Number of xxxx per tester
       – Number of xxxx per feature
  11. Elements of Bad Metrics
     • Easy to "game" or circumvent the desired intention
       – Easy to improve through undesirable behaviour
       – Pass rate (percentage): execute more simple tests that will pass, or break a long test case up into many smaller ones (see the sketch after this slide)
       – Number of bugs raised: raising two similar bug reports instead of combining them
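To make the pass-rate example concrete, here is a small arithmetic sketch in Python; the test counts are invented for illustration, not taken from the talk:

    # Gaming a pass rate without changing the product or the testing.
    def pass_rate(passed, total):
        """Pass rate as a percentage."""
        return 100.0 * passed / total

    # Before: 9 short passing tests plus 1 long test that fails on one step.
    print(pass_rate(9, 10))   # 90.0

    # After: the long test is split into 10 one-step tests; only the one
    # step that actually fails still fails. Nothing about the product
    # changed, but the "quality number" went up.
    print(pass_rate(18, 19))  # ~94.7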
  12. Elements of Bad Metrics
     • Contain misleading information or give a false sense of completeness
       – Summarize a large amount of information into one or two numbers, out of context
       – Coverage (code, path)
         • Misleading information based on touching the code once
       – Pass rate and number of test cases
  13. Impact of Using Bad Metrics
     • Promotes bad behaviour:
       – Testers may create more, smaller test cases instead of test cases that make sense
       – Execution of ineffective testing to meet requirements
       – Artificially creating higher numbers instead of doing what makes sense
       – Creation of tools that mask inefficiencies (e.g. lab equipment usage)
       – Time wasted improving the "numbers" instead of improving the testing
  14. Impact of Using Bad Metrics
     • Gives executives a false sense of test coverage
       – All they see is numbers out of context
       – The larger the numbers, the better the testing
       – The difficulty of good testing is hidden by large "fake" numbers
     • Dangerous messages to executives
       – "Our pass rate is at 96%, so our product is in good shape"
       – "Code coverage is at 100%, so our code is completely tested"
       – "Feature specification coverage is at 100%. Ship it!!!"
     • What could possibly go wrong?
  15. Sample Metrics
     • Number of Test Cases Planned (per release or feature)
     • Number of Test Cases Executed vs. Plan
     • Number of Bugs Found per Tester
     • Number of Bugs Found per Feature
     • Number of Bugs Found in the Field – a list of the bugs
     • Number of Open Bugs – a list of the open bugs
     • Lab Equipment Usage
  16. Sample Metrics
     • Hours between Crashes in the Field
     • Percentage Behind Plan – depends whether the plan is flexible
     • Percentage of Automated Test Cases
     • Percentage of Tests Passed vs. Failed (pass rate)
     • Number of Test Steps
     • Code Coverage / Path Coverage – depends on usage
     • Requirements Coverage – depends on usage
  17. So … Now what?
     • "I have to stop counting everything. I feel naked and exposed."
     • Track expected effort instead of tracking test cases, using:
       – A whiteboard
       – An Excel spreadsheet
  18. Whiteboard
     • Used for planning and tracking of test execution
     • Suitable for waterfall or agile (as long as you have control over your own team's process)
     • Use colours to track:
       – Features, or
       – Main areas, or
       – Test styles (performance, robustness, system)
  19. Whiteboard
     • Divide the board into four areas:
       – Work to be done
       – Work in progress
       – Cancelled or work not being done
       – Completed work
     • Red stickies indicate issues (not just bugs)
     • Create a sticky note for each half day of work (or mark the number of half days expected on the sticky note)
     • Prioritize stickies daily (or at least twice a week)
     • Finish "on time" with low-priority work incomplete
  20. Sticky Notes
     • All of these items are optional; add your own elements and use what makes sense in your situation (see the sketch after this slide):
       – Charter title (or test case title)
       – Estimated effort
       – Feature area
       – Tester name
       – Date complete
       – Effort (number of sessions or half days of work)
         • Initially an estimate; replace with the actual once known
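A sticky note like this maps naturally onto a small record type. Below is a minimal sketch in Python, assuming the optional fields listed on this slide and the four board areas from the earlier Whiteboard slide; all the identifier names are mine, not from the talk:

    # A sticky note as a data structure (hypothetical names).
    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class BoardArea(Enum):
        # The four whiteboard areas from slide 19.
        TO_DO = "Work to be done"
        IN_PROGRESS = "Work in progress"
        CANCELLED = "Cancelled or work not being done"
        DONE = "Completed work"

    @dataclass
    class Sticky:
        charter_title: str
        feature_area: str
        estimated_half_days: int                 # effort in half days of work
        actual_half_days: Optional[int] = None   # replaces the estimate when known
        tester: Optional[str] = None
        date_complete: Optional[str] = None
        area: BoardArea = BoardArea.TO_DO
        is_issue: bool = False                   # red stickies flag issues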
  21. Actual Sample Sticky
     (photo of a real sticky note showing the fields Charter Title, Tester, Area, and Effort)
  22. Whiteboard Example
     (photo of a tracking whiteboard in use)
  23. Reporting
     • An Excel spreadsheet with (see the sketch after this slide):
       – List of charters
       – Area
       – Estimated effort
       – Expended effort
       – Remaining effort
       – Tester(s)
       – Start date
       – Completed date
       – Issues
       – Comments
     • Does NOT include a pass/fail percentage or a number of test cases
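A minimal sketch of producing such a report as a CSV file that Excel can open. The column names come from this slide; the single example row is copied from the sample report on the next slide:

    import csv

    COLUMNS = ["Charter", "Area", "Estimated Effort", "Expended Effort",
               "Remaining Effort", "Tester(s)", "Date Started",
               "Date Completed", "Issues Found", "Comments"]

    charters = [
        # Effort is in half days, matching the whiteboard convention.
        {"Charter": "INP vs. SHINE", "Area": "ARQ",
         "Estimated Effort": 6, "Expended Effort": 6,
         "Tester(s)": "ncowan", "Date Started": "12/01/2011",
         "Date Completed": "12/04/2011", "Issues Found": "",
         "Comments": ""},
    ]

    with open("test_progress.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        for row in charters:
            # Remaining effort is derived rather than tracked by hand.
            row["Remaining Effort"] = max(
                0, row["Estimated Effort"] - row["Expended Effort"])
            writer.writerow(row)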
  24. Sample Report
     (screenshot of the tracking spreadsheet: ARQ and H/W Performance charters such as "Investigation for high QLN spikes on EVLT", "ARQ Verification under different RA Modes", "POTS interference", "Expected throughput testing", "INP vs. SHINE", "INP vs. REIN", "INP vs. REIN + SHINE", "Traffic delay and jitter from RTX", and "Attainable Throughput", each with area, estimated/expended/remaining effort, tester, start and completion dates, issues found, and free-form comments; for example, "INP vs. SHINE" shows 6 half days estimated and 6 expended by ncowan from 12/01/2011 to 12/04/2011, and one comment reads "Decided not to test as the H/W team already tested this functionality and time was tight.")
  25. Sample Report
     (bar chart: "Awesome Product" Test Progress as of 02/01/2012; the y-axis is Effort in person half days, 0 to 90, and for each feature on the x-axis — ARQ, SRA, Vectoring, Regression, H/W Performance — three bars compare Original Planned Effort, Expended Effort, and Total Expected Effort)
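For completeness, a sketch of reproducing this chart with matplotlib. The per-feature effort numbers below are placeholders, since the exact values cannot be read from the slide:

    import matplotlib.pyplot as plt
    import numpy as np

    features = ["ARQ", "SRA", "Vectoring", "Regression", "H/W Performance"]
    planned  = [34, 25, 40, 80, 20]   # placeholder values, not from the slide
    expended = [30, 10, 20, 35, 20]
    expected = [36, 25, 45, 85, 20]

    # Three bars per feature, side by side.
    x = np.arange(len(features))
    w = 0.25
    plt.bar(x - w, planned,  w, label="Original Planned Effort")
    plt.bar(x,     expended, w, label="Expended Effort")
    plt.bar(x + w, expected, w, label="Total Expected Effort")
    plt.xticks(x, features)
    plt.ylabel("Effort (person half days)")
    plt.title('"Awesome Product" Test Progress as of 02/01/2012')
    plt.legend()
    plt.show()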
