Black Box Software Testing


  1. Black Box Software Testing Part 15. Testing User Documentation
  2. Copyright Notice These slides are distributed under the Creative Commons License. In brief summary, you may make and distribute copies of these slides so long as you give the original author credit and, if you alter, transform or build upon this work, you distribute the resulting work only under a license identical to this one. For the rest of the details of the license, see http://creativecommons.org/licenses/by-sa/2.0/legalcode.
  3. Documentation is an Express Warranty A warranty is a statement of fact, either articulated or implied by law, respecting the quality or character of the goods to be sold. Under the Uniform Commercial Code, an express warranty is: 2-313(a) Any affirmation of fact or promise made by the seller to the buyer which relates to the goods and becomes part of the basis of the bargain . . . 2-313(b) Any description of the goods which is made part of the basis of the bargain . . . 2-313(c) Any sample or model which is made part of the basis of the bargain.
  4. Documentation is an Express Warranty You can’t disclaim an express warranty -- you are accountable for your claims. Uniform Commercial Code 2-316 (1): Words or conduct relevant to the creation of an express warranty and words or conduct tending to negate or limit warranty shall be construed whenever reasonable as consistent with each other; but . . . negation or limitation is inoperative to the extent that such construction is unreasonable.
  5. Black Box Testing: Testing Documentation Doc testing is important because: • Errors in the manual increase risks of legal liability. • Testing the documentation improves the reliability of the program. • The documentation may be your mainstream test plan and your most up-to-date specification. • Confusion in the manual reflects confusion in the program’s design. Refer to Testing Computer Software, Chapter 10.
  6. Testing Documentation: What to Test • Verify every statement of fact and every reasonable implication • Check the placement and accuracy of figures • Audit the completeness of the manual (check that every feature is documented) Track errors in the documentation in a way that is normal for the Doc group. This probably doesn’t involve the bug tracking system (but put code/doc mismatches there). If you give back marked-up manuscripts, keep photocopies of your markups. Check your corrections against the next circulating draft of the manual. On average, you will cover 4 pages per hour in a reasonably stable program. Your second pass will go more quickly because the program is in better shape, but it will still take several minutes per page because you will actually test every page.
  7. Testing Documentation: Things to Say • Your role is not editorial: you are not the authority on style and layout. • Keep your tone non-judgmental. • Point out upcoming changes (design changes, new error handling, data structures, etc.) that might affect the manual. Mark these in appropriate sections on the manuscript. • Name experts or references to consult when the writer is confused. • Suggest examples. • Point to useful existing data files (for examples).
  8. Testing Documentation: Things to Say • When appropriate, you might do some writing. The writer might or might not use what you have written. You might write in two ways: » words that you think belong in the manual “as is” (you’re saying, “Here, say it like this and it will be right.”) » explanations that are background material for the writer. Maybe a rough draft of what she’ll write. • Note: the final review is a meeting just a few days before the book goes to the printer. If you have many small-detail late comments, offer the writer a chance to review them privately with you a few days before the review.
  9. Documentation Testing: On-Line Help • Contents of the help • Cross-reference jumps • Glossary lookups • Browse sequences • Graphic hotspots • Graphic display - color or resolution • Window size, e.g. compile at 1024x768 and display at 640x480 • Procedure sequences • Balloon help / tool tips • Index • Search • Context jumps • Error messages Refer to Testing Computer Software, Chapter 10.
  10. Publisher Liability for Content-Related Errors Winter v. G.P. Putnam’s Sons, 938 F.2d 1033 (9th Cir. 1991). Winter became seriously ill from picking and eating mushrooms after relying on The Encyclopedia of Mushrooms, published by Putnam. Putnam did not verify the material in the book and did not intentionally include the error. • Putnam was not liable for errors in the book. • The court noted that the Jeppesen cases have consistently held the publisher liable for information errors when the information published was to be used as a tool. • The court said that a software publisher might be liable for a program that does not work as intended. Alm v. Van Nostrand Reinhold, 480 N.E.2d 1263 (1985). The Making of Tools. The plaintiff used it, and a tool shattered, causing injury. VNR was not liable, but the author of the book might be. Liability might also attach if the program provides professional services (such as tax preparation) and gives bad information.
  11. Warranties & Misrepresentations: What Must You Test? • Advertisements • Published specifications • Interviews • Box copy • Fax-backs • Manual • Help system • Warranty • Web pages • Readme • Advice given to customers on the Internet, CompuServe, or AOL
  12. Black Box Software Testing Part 16. Managing Automated Software Testing Several of these slides were developed by Doug Hoffman or in co-authorship with Doug Hoffman for a course that we co-taught on software test automation. Many of the ideas in this presentation were presented and refined in the Los Altos Workshops on Software Testing. LAWST 5 focused on oracles. Participants were Chris Agruss, James Bach, Jack Falk, David Gelperin, Elisabeth Hendrickson, Doug Hoffman, Bob Johnson, Cem Kaner, Brian Lawrence, Noel Nyman, Jeff Payne, Johanna Rothman, Melora Svoboda, Loretta Suzuki, and Ned Young. LAWST 1-3 focused on several aspects of automated testing. Participants were Chris Agruss, Tom Arnold, Richard Bender, James Bach, Jim Brooks, Karla Fisher, Chip Groder, Elisabeth Hendrickson, Doug Hoffman, Keith W. Hooper, III, Bob Johnson, Cem Kaner, Brian Lawrence, Tom Lindemuth, Brian Marick, Thanga Meenakshi, Noel Nyman, Jeffery E. Payne, Bret Pettichord, Drew Pritsker, Johanna Rothman, Jane Stepak, Melora Svoboda, Jeremy White, and Rodney Wilson. I’m indebted to James Whittaker, James Tierney, Harry Robinson, and Noel Nyman for additional explanations of stochastic testing.
  13. Overview • About Automated Software Testing • The regression testing paradigm • 19 common mistakes • 27 questions about requirements • Planning for short-term ROI • 6 sometimes successful architectures • Conclusions from LAWST • Breaking away from the regression paradigm
  14. Overview In the mass-market software world, many efforts to automate testing have been expensive failures. Paradigms that dominate the Information Technology and DoD-driven worlds don’t apply well to the mass-market paradigm. Our risks are different. Solutions that are highly efficient for those environments are not at all necessarily good for mass-market products. The point of an automated testing strategy is to save time and money, or to find more bugs or harder-to-find bugs. The strategy works if it makes you more effective (you find better and more-reproducible bugs) and more efficient (you find the bugs faster and cheaper). I wrote a lot of test automation code back in the late 1970’s and early 1980’s. (That’s why WordStar hired me as its Testing Technology Team Leader when I came to California in 1983.) Since 1988, I haven’t written a line of code. I’ve been managing other people, and then consulting, seeing talented folks succeed or fail when confronted with similar problems. I’m not an expert in automated test implementation, but I have strengths in testing project management and how that applies to test automation projects. That level of analysis--the manager’s view of a technology and its risks/benefits as an investment--is the point of this section.
  15. About Automated Testing Source of test cases • Old • Intentionally new • Random new Size of test pool • Small • Large • Exhaustive Serial dependence among tests • Independent • Sequence is relevant
  16. About Automated Testing Evaluation strategy • Comparison to saved result • Comparison to an oracle • Comparison to a computational or logical model • Comparison to a heuristic prediction. (NOTE: All oracles are heuristic.) • Crash • Diagnostic • State model
  17. Issues Faced in a Typical Automated Test • What is being tested? • How is the test set up? • Where are the inputs coming from? • What is being checked? • Where are the expected results? • How do you know pass or fail?
  18. The “Complete” Oracle [Diagram: test inputs, precondition data, precondition program state, and environmental inputs feed both the system under test and the oracle; the oracle’s predicted postcondition data, postcondition program state, and environmental results are compared against the system’s actual results.] Reprinted with permission of Doug Hoffman and SQM, LLC.
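The diagram is easier to see in code. Here is a minimal sketch of that comparison, assuming hypothetical run_sut and reference_model callables; none of these names come from the course materials, and a real harness would also capture preconditions before the run.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    output_data: object     # postcondition data (the visible results)
    program_state: object   # postcondition program state
    environment: object     # environmental results (files, memory, etc.)

def oracle_check(test_input, run_sut, reference_model):
    """Compare all three classes of SUT postconditions against the oracle.

    Both callables take the test input and return an Observation. Because
    every oracle is heuristic, an empty mismatch list means "no problem
    detected," not "correct."
    """
    actual = run_sut(test_input)
    expected = reference_model(test_input)
    return [field for field in ("output_data", "program_state", "environment")
            if getattr(actual, field) != getattr(expected, field)]
```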
  19. Automated Software Test Functions • Automated test case/data generation • Test case design from requirements or code • Selection of test cases • Able to run two or more specified test cases • Able to run a subset of all the automated test cases • No intervention needed after launching tests • Automatically sets up and/or records relevant test environment • Runs test cases • Captures relevant results • Compares actual with expected results • Reports analysis of pass/fail
  20. Characteristics of “fully automated” tests • A set of tests is defined and will be run together. • No intervention needed after launching tests. • Automatically sets up and/or records relevant test environment. • Obtains input from existing data files, random generation, or another defined source. • Runs test exercise. • Captures relevant results. • Evaluates actual against expected results. • Reports analysis of pass/fail. Not all automation is full automation. Partial automation can be very useful.
  21. Capabilities of Automation Tools Automated test tools combine a variety of capabilities. For example, GUI regression tools provide: • capture/replay for easy manual creation of tests • execution of test scripts • recording of test events • comparison of test results with expected results • reporting of test results Some GUI tools provide additional capabilities, but no tool does everything well.
  22. Capabilities of Automation Tools Here are examples of automated test tool capabilities: • Analyze source code for bugs • Design test cases • Create test cases (from requirements or code) • Generate test data • Ease manual creation of test cases • Ease creation/management of traceability matrix • Manage testware environment • Select tests to be run • Execute test scripts • Record test events • Measure software responses to tests (Discovery Functions) • Determine expected results of tests (Reference Functions) • Evaluate test results (Evaluation Functions) • Report and analyze results
  23. Tools for Improving Testability by Providing Diagnostic Support • Hardware integrity tests. Example: power supply deterioration can look like irreproducible, buggy behavior. • Database integrity. Ongoing tests for database corruption, making corruption quickly visible to the tester. • Code integrity. Quick check (such as a checksum) to see whether part of the code was overwritten in memory. • Memory integrity. Check for wild pointers, other corruption. • Resource usage reports. Check for memory leaks, stack leaks, etc. • Event logs. See reports of suspicious behavior. Probably requires collaboration with programmers. • Wrappers. Layer of indirection surrounding a called function or object. The automator can detect and modify incoming and outgoing messages, forcing or detecting states and data values of interest.
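As a concrete illustration of the code-integrity item, here is a minimal sketch that baselines a checksum and re-checks it on demand. It hashes a file as a stand-in for the in-memory check the slide describes, and the file name is hypothetical.

```python
import hashlib

def checksum(path: str) -> str:
    """Return a SHA-256 digest of the named file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Baseline taken when the build is known to be good.
BASELINE = checksum("app_under_test.bin")

def code_integrity_ok(path: str = "app_under_test.bin") -> bool:
    """Re-check periodically; a mismatch means the code was overwritten."""
    return checksum(path) == BASELINE
```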
  24. Automation Design • Determine the goals of the automation • Determine the capabilities needed to achieve those goals • Select automation components • Set relationships between components • Identify locations of components and events • Sequence test events • Evaluate and report results of test events
  25. The Regression Testing Paradigm The most commonly discussed automation approach: • create a test exercise • run it and inspect the output • if the program fails, report a bug and try again later • if the program passes the test, save the resulting outputs • in future tests, run the program and compare the output to the saved results. Report an exception whenever the current output and the saved output don’t match.
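A minimal golden-master sketch of this paradigm, in Python; the command and file names are placeholders. On the first run it saves the output for a human to inspect and bless; on later runs it compares against the saved output and flags any mismatch.

```python
import os
import subprocess

def regression_check(cmd: list, golden_path: str) -> str:
    """Run the program and compare its output to the saved (golden) output."""
    current = subprocess.run(cmd, capture_output=True, text=True).stdout
    if not os.path.exists(golden_path):
        with open(golden_path, "w") as f:
            f.write(current)      # first run: a human inspects and blesses this
        return "SAVED"
    with open(golden_path) as f:
        golden = f.read()
    return "PASS" if current == golden else "FAIL"   # FAIL = report an exception

# Example: regression_check(["myapp", "--export", "report.txt"], "golden/report.txt")
```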
  26. Is this Really Automation? • Analyze product -- human • Design test -- human • Run test 1st time -- human • Evaluate results -- human • Report 1st bug -- human • Save code -- human • Save result -- human • Document test -- human • Re-run the test -- MACHINE • Evaluate result -- MACHINE (plus human if there’s any mismatch) • Maintain result -- human (“Woo-hoo! We really get the machine to do a whole lot of our work!” Maybe, but not this way.)
  27. Common Mistakes about Test Automation The paper (Avoiding Shelfware) lists 19 “Don’ts.” For example: Don’t expect to be more productive over the short term. • The reality is that most of the benefits from automation don’t happen until the second release. • It takes 3 to 10+ times the effort to create an automated test than to just manually do the test, so apparent productivity drops at least 66% and possibly over 90% (at 3x the effort you execute a third as many tests in the same time; at 10x, a tenth). • Additional effort is required to create and administer automated test tools.
  28. GUI Automation is Expensive • Test case creation is expensive. Estimates run from 3-5 times the time to create and manually execute a test case (Bender) to 3-10 times (Kaner) to 10 times (Pettichord) or higher (LAWST). • You usually have to increase the testing staff in order to generate automated tests. Otherwise, how will you achieve the same breadth of testing? • Your most technically skilled staff are tied up in automation. • Automation can delay testing, adding even more cost (albeit a hidden cost). • Excessive reliance leads to the 20-questions problem. (Fully defining a test suite in advance, before you know the program’s weaknesses, is like playing 20 questions where you have to ask all the questions before you get your first answer.)
  29. GUI Automation Pays off Late • GUI changes force maintenance of tests » May need to wait for GUI stabilization » Most early test failures are due to GUI changes • Regression testing has low power » Rerunning old tests that the program has passed is less powerful than running new tests » Old tests do not address new features • Maintainability is a core issue because our main payback is usually in the next release, not this one.
  30. Test Automation is Programming Win NT 4 had 6 million lines of code, and 12 million lines of test code. Common (and often vendor-recommended) design and programming practices for automated testing are appalling: • Embedded constants • No modularity • No source control • No documentation • No requirements analysis No wonder we fail.
  32. Requirements Analysis Automation requirements are not just about the software under test and its risks. To understand what we’re up to, we have to understand: • The software under test and its risks • The development strategy and timeframe for the software under test • How people will use the software • What environments the software runs under and their associated risks • What tools are available in this environment and their capabilities • The regulatory / required recordkeeping environment • The attitudes and interests of test group management • The overall organizational situation
  33. Requirements Analysis Requirement: “Anything that drives design choices.” The paper (Avoiding Shelfware) lists 27 questions. For example: Will the user interface of the application be stable or not? • Let’s analyze this. The reality is that, in many companies, the UI changes late. • Suppose we’re in an extreme case. Does that mean we cannot automate cost-effectively? No. It means that we should do only those types of automation that will yield a fast return on investment.
  34. You Can Plan for Short Term ROI • Smoke testing • Configuration testing • Variations on a theme • Stress testing • Load testing • Life testing • Performance benchmarking • Other tests that extend your reach
  35. Six Sometimes-Successful Automation Architectures • Quick & dirty • Equivalence testing • Framework • Data-driven • Application-independent data-driven • Real-time simulator with event logs
  36. Quick and Dirty • Smoke tests • Configuration tests • Variations on a theme • Stress, load, or life testing
  37. Equivalence Testing • A/B comparison • Random tests using an oracle • Regression testing is the weakest form
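For example, an A/B comparison can pit the implementation under test against a trusted reference that serves as the oracle. In this minimal sketch, a hypothetical my_sort stands in for the function under test and Python’s built-in sorted() plays the reference:

```python
import random

def my_sort(xs):
    """Stand-in for the implementation under test."""
    return sorted(xs)   # replace with the real SUT call

def equivalence_test(trials: int = 1000) -> None:
    """Feed random inputs to both implementations and compare results."""
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        assert my_sort(xs) == sorted(xs), f"Mismatch on input {xs}"

equivalence_test()
```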
  38. Framework-Based Architecture Frameworks are code libraries that separate routine calls from designed tests. • modularity • reuse of components • hide design evolution of UI or tool commands • partial salvation from the custom control problem • independence of application (the test case) from user interface details (execute using keyboard? Mouse? API?) • important utilities, such as error recovery For more on frameworks, see Linda Hayes’ book on automated testing, Tom Arnold’s book on Visual Test, and Mark Fewster & Dorothy Graham’s book Software Test Automation.
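A minimal sketch of the wrapper idea behind such frameworks: test cases call one named action, and only this layer knows which UI gestures or tool commands implement it. The gui_tool object and its methods are hypothetical stand-ins for a real automation tool’s API.

```python
def save_file(gui_tool, filename: str) -> None:
    """Wrapped 'save' action: tests that call this stay valid when the UI changes."""
    gui_tool.press_keys("ctrl+s")            # today the UI uses a shortcut...
    # gui_tool.click_menu("File", "Save")    # ...yesterday it was a menu pick
    gui_tool.type_text(filename)
    gui_tool.press_keys("enter")
```

If the Save dialog changes, only this wrapper is edited; the test cases that call save_file() stay untouched.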
  39. Data-Driven Tests • Variables are data • Commands are data • UI is data • Program’s state is data • Test tool’s syntax is data
  40. Data-Driven Architecture • In test automation, there are three interesting programs: » The software under test (SUT) » The automation tool that executes the automated test code » The test code (test scripts) that defines the individual tests • From the point of view of the automation software: » The SUT’s variables are data » The SUT’s commands are data » The SUT’s UI is data » The SUT’s state is data • Therefore it is entirely fair game to treat these implementation details of the SUT as values assigned to variables of the automation software. • Additionally, we can think of the externally determined (e.g. determined by you) test inputs and expected test results as data. • Additionally, if the automation tool’s syntax is subject to change, we might rationally treat the command set as variable data as well.
  41. Data-Driven Architecture In general, we can benefit from separating the treatment of one type of data from another with an eye to: • optimizing the maintainability of each • optimizing the understandability (to the test case creator or maintainer) of the link between the data and whatever inspired those choices of values of the data • minimizing churn that comes from changes in the UI, the underlying features, the test tool, or the overlying requirements
  42. Data-Driven Architecture: Calendar Example Imagine testing a calendar-making program. The look of the calendar, the dates, etc., can all be thought of as being tied to physical examples in the world, rather than being tied to the program. If your collection of cool calendars wouldn’t change with changes in the UI of the software under test, then the test data that define the calendar are of a different class from the test data that define the program’s features. – Define the calendars in a table. This table should not be invalidated across calendar program versions. Columns name feature settings; each test case is on its own row. – An interpreter associates the values in each column with a set of commands (a test script) that execute the value of the cell in a given column/row. – The interpreter itself might use “wrapped” functions, i.e. make indirect calls to the automation tool’s built-in features.
  43. Data-Driven Architecture: Calendar Example This is a good design from the point of view of optimizing for maintainability because it separates out four types of things that can vary independently: • The descriptions of the calendars themselves come from the real world and can stay stable across program versions. • The mapping of calendar element to UI feature will change frequently because the UI will change frequently. The mappings (one per UI element) are written as short, separate functions that can be maintained easily. • The short scripts that map calendar elements to the program functions probably call sub-scripts (think of them as library functions) that wrap common program functions. Therefore a fundamental change in the program might lead to only a modest change in the test library. • The short scripts that map calendar elements to the program functions probably also call sub-scripts (think of them as library functions) that wrap functions of the automation tool. If the tool syntax changes, maintenance involves changing the wrappers’ definitions rather than the scripts.
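A minimal sketch of this interpreter pattern, assuming a hypothetical CSV table whose columns name calendar features; each handler wraps the UI steps for one feature, so UI churn is confined to the handlers while the table of calendars survives across versions:

```python
import csv

# One short function per calendar element; only these change when the UI does.
def set_style(value): print(f"drive UI: style -> {value}")
def set_year(value):  print(f"drive UI: year  -> {value}")
def add_event(value): print(f"drive UI: event -> {value}")

HANDLERS = {"style": set_style, "year": set_year, "event": add_event}

def run_table(path: str) -> None:
    """Interpret the calendar table: one test case per row, one feature per column."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for column, value in row.items():
                if value:                 # a blank cell means the feature is unused
                    HANDLERS[column](value)
```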
  44. Data-Driven Architecture Note with this example: • we didn’t run tests twice • we automated execution, not evaluation • we saved SOME time • we focused the tester on design and results, not execution. Other table-driven cases: • automated comparison can be done via a pointer in the table to the file • the underlying approach runs an interpreter against table entries. » Hans Buwalda and others use this to create a structure that is natural for non-tester subject matter experts to manipulate.
  45. Application-Independent Data-Driven • Generic tables of repetitive types • Rows for instances • Automation of exercises
  46. Real-time Simulator • Test embodies rules for activities • Stochastic process • Possible monitors • Code assertions • Event logs • State transition maps • Oracles
  47. Think About: • Automation is software development. • Regression automation is expensive and can be inefficient. • Automation need not be regression--you can run new tests instead of old ones. • Maintainability is essential. • Design to your requirements. • Set management expectations with care.
  48. GUI Regression Strategies: Some Papers of Interest Chris Agruss, Automating Software Installation Testing James Bach, Test Automation Snake Oil Hans Buwalda, Testing Using Action Words Hans Buwalda, Automated Testing with Action Words: Abandoning Record & Playback Elisabeth Hendrickson, The Difference between Test Automation Failure and Success Cem Kaner, Avoiding Shelfware: A Manager’s View of Automated GUI Testing John Kent, Advanced Automated Testing Architectures Bret Pettichord, Success with Test Automation Bret Pettichord, Seven Steps to Test Automation Success Keith Zambelich, Totally Data-Driven Automated Testing
  50. Black Box Software Testing Part 17. Test Strategy Planning
  51. Test Strategy “How we plan to cover the product so as to develop an adequate assessment of quality.” A good test strategy is: • Diversified • Specific • Practical • Defensible
  52. Test Strategy • Makes use of test techniques. • May be expressed by test procedures and cases. • Not to be confused with test logistics, which involve the details of bringing resources to bear on the test strategy at the right time and place. • You don’t have to know the entire strategy in advance. The strategy can change as you learn more about the product and its problems.
  53. Test Cases/Procedures • Test cases and procedures should manifest the test strategy. • If your strategy is to “execute the test suite I got from Joe Third-Party”, how does that answer the prime strategic questions: » How will you cover the product and assess quality? » How is that practical and justified with respect to the specifics of this project and product? • If you don’t know, then your real strategy is that you’re trusting things to work out.
  54. Diverse Half-Measures • There is no single technique that finds all bugs. • We can’t do any technique perfectly. • We can’t do all conceivable techniques. Use “diverse half-measures” -- lots of different points of view, approaches, techniques, even if no one strategy is performed completely.
  55. Heuristics from James Bach’s Test Plan Evaluation Model, www.satisfice.com 1. Look for the important problems first 2. Focus mainly on areas of potential technical risk 3. Address configuration, operation, observation, and evaluation of the product 4. Diversify test techniques and perspectives 5. Specify test data design and generation 6. Don’t just run preplanned tests 7. Test against stated and implied requirements 8. Collaborate with outside people for testing ideas
  56. Heuristics from James Bach’s Test Plan Evaluation Model, www.satisfice.com 9. Work with developers to improve product testability 10. Highlight the non-routine, project-specific aspects of the testing strategy and the testing project 11. Use people and tools for what they do best 12. Identify dependencies in the testing schedule 13. Keep testing off the critical path for release 14. Make the feedback loop with development short 15. Learn what people outside of testing think of the product quality 16. Review all test materials
  58. Heuristics for Test Plan Evaluation
     • Heuristic 1: Testing should be optimized to find important problems fast, rather than attempting to find all problems with equal urgency. Basis: The later in the project that a problem is found, the greater the risk that it will not be safely fixed in time to ship. The sooner a problem is found after it is created, the lesser the risk of a bad fix.
     • Heuristic 2: Test strategy should focus most effort on areas of potential technical risk, while still putting some effort into low-risk areas just in case the risk analysis is wrong. Basis: Complete testing is impossible, and we can never know if our perception of technical risk is completely accurate.
     • Heuristic 3: Test strategy should address test platform configuration, how the product will be operated, how the product will be observed, and how observations will be used to evaluate the product. Basis: Sloppiness or neglect within any of these four basic testing activities will increase the likelihood that important problems will go undetected.
  59. Heuristics for Test Plan Evaluation
     • Heuristic 4: Test strategy should be diversified in terms of test techniques and perspectives. Methods of evaluating test coverage should take into account multiple dimensions of coverage, including structural, functional, data, platform, operations, and requirements. Use diverse half-measures to go after low-hanging fruit. Basis: No single test technique can reveal all important problems in a linear fashion. We can never know for sure if we have found all the problems that matter. Diversification minimizes the risk that the test strategy will be blind to certain kinds of problems.
     • Heuristic 5: The test strategy should specify how test data will be designed and generated. Basis: It is common for the test strategy to be organized around functionality or code, leaving it to the testers to concoct test data on the fly. Often that indicates that the strategy is too focused on validating capability and not focused enough on reliability.
  60. Heuristics for Test Plan Evaluation
     • Heuristic 6: Not all testing should be pre-specified in detail. The test strategy should incorporate reasonable variation and make use of the testers’ ability to use situational reasoning to focus on important, but unanticipated problems. Basis: A rigid test strategy may make it more likely that a particular subset of problems will be uncovered, but in a complex system it reduces the likelihood that all important problems will be uncovered. Reasonable variability in testing, such as that which results from interactive, exploratory testing, increases incidental test coverage without substantially sacrificing essential coverage.
     • Heuristic 7: It is important to test against implied requirements—the full extent of what the requirements mean, not just what they say. Basis: Testing only against explicit written requirements will not reveal all important problems, since defined requirements are generally incomplete and natural language is inherently ambiguous.
  61. Heuristics for Test Plan Evaluation
     • Heuristic 8: The test project should promote collaboration with all other functions of the project, especially developers, technical support, and technical writing. Whenever possible, testers should also collaborate with actual customers and users, in order to better understand their requirements. Basis: Other teams and stakeholders often have information about product problems or potential problems that can be of use to the test team. Their perspective may help the testers make a better analysis of risk. Testers may also have information that is of use to them.
     • Heuristic 9: The test project should consult with development to help them build a more testable product. Basis: The likelihood that a test strategy will serve its purpose is profoundly affected by the testability of the product.
  62. Heuristics for Test Plan Evaluation
     • Heuristic 10: A test plan should highlight the non-routine, project-specific aspects of the test strategy and test project. Basis: Virtually every software project worth doing involves special technical challenges that a good test effort must take into account. A completely generic test plan usually indicates a weak test planning process. It could also indicate that the test plan is nothing but unchanged boilerplate.
  63. Heuristics for Test Plan Evaluation
     • Heuristic 11: The test project should use humans for what humans do well and use automation for what automation does well. Manual testing should allow for improvisation and on-the-spot critical thinking, while automated testing should be used for tests that require high repeatability, high speed, and no judgment. Basis: Many test projects suffer under the false belief that human testers are effective when they use exactingly specified test scripts, or that test automation duplicates the value of human cognition in the test execution process. Manual and automated testing are not two forms of the same thing. They are two entirely different classes of test technique.
  64. Heuristics for Test Plan Evaluation
     • Heuristic 12: The test schedule should be represented and justified in such a way as to highlight any dependencies on the progress of development, the testability of the product, time required to report problems, and the project team’s assessment of risk. Basis: A monolithic test schedule in a test plan often indicates the false belief that testing is an independent activity. The test schedule can stand alone only to the extent that the product is highly testable, development is complete, and the test process is not interrupted by the frequent need to report problems.
     • Heuristic 13: The test process should be kept off of the critical path to the extent possible. This can be done by testing in parallel with development work, and finding problems worth fixing faster than the developers fix them. Basis: This is important in order to deflect pressure to truncate the testing process.
  65. Heuristics for Test Plan Evaluation
     • Heuristic 14: The feedback loop between testers and developers should be as tight as possible. Test cycles should be designed to provide rapid feedback to developers about recent additions and changes they have made before a full regression test is commenced. Whenever possible, testers and developers should work physically near each other. Basis: This is important in order to maximize the efficiency and speed of quality improvement. It also helps keep testing off of the critical path.
  66. Heuristics for Test Plan Evaluation
     • Heuristic 15: The test project should employ channels of information about quality other than formal testing in order to help evaluate and adjust the test project. Examples of these channels are inspections, field testing, or informal testing by people outside of the test team. Basis: By examining product quality information gathered through various means beyond the test team, blind spots in the formal test strategy can be uncovered.
     • Heuristic 16: All documentation related to the test strategy, including test cases and procedures, should undergo review by someone other than the person who wrote them. The review process used should be commensurate with the criticality of the document. Basis: Tunnel vision is the great occupational hazard of testing. Review not only helps to reveal blind spots in test design, but it can also help promote dialog and peer education about test practices.
  67. Evaluating Your Plan: Context-Free Questions Based on the CIA’s Phoenix Checklists (Thinkertoys, p. 140) and Bach’s Evaluation Strategies (Rapid Testing course notes): • Can you solve the whole problem? Part of the problem? • What would you like the resolution to be? Can you picture it? • How much of the unknown can you determine? • What reference data are you using (if any)? • What product output will you evaluate? • How will you do the evaluation? • Can you derive something useful from the information you have? • Have you used all the information? • Have you taken into account all essential notions in the problem? • Can you separate the steps in the problem-solving process? Can you determine the correctness of each step? • What creative thinking techniques can you use to generate ideas? How many different techniques? • Can you see the result? How many different kinds of results can you see? • How many different ways have you tried to solve the problem?
  68. Evaluating Your Plan: Context-Free Questions • What have others done? • Can you intuit the solution? Can you check the results? • What should be done? • How should it be done? • Where should it be done? • When should it be done? • Who should do it? • What do you need to do at this time? • Who will be responsible for what? • Can you use this problem to solve some other problem? • What is the unique set of qualities that makes this problem what it is and none other? • What milestones can best mark your progress? • How will you know when you are successful? • How conclusive and specific is your answer?
  70. Black Box Software Testing Part 18. Planning Testing Projects
  71. Planning and Negotiating the Testing Project • The notes here focus on your process of negotiating resources and quality level. • Please see Testing Computer Software, Chapter 13 for a detailed discussion of planning and testing tasks across the time line of the project.
  72. Early Planning (1) Things you can do very early in the project: • Analyze any requirements documents for testability, ambiguity. • Facilitate inspection and walkthrough meetings. • Prepare a preliminary list of hardware configurations and start arranging for loaners. • Ask for testing support code, such as debug monitors, memory meters, support for testpoints. • Ask for a clear error handling message structure. • Discuss the possibility of code coverage measurement (which will require programmer support).
  73. Early Planning (2) Things you can do very early in the project: • Prepare for test automation. This involves reaching agreements in principle on breadth of automation and automation support staffing level. • Order automation support equipment. • Order external test suites. • Learn about the product’s market and competition. • Evaluate coding tools that facilitate automation (e.g. test 3rd-party custom controls against MS-Test).
  74. First Principles in Planning • You can’t nail everything down. • You will face difficult prioritization choices, and many constraints will be out of your direct control. • You can influence several constraints by opening up your judgments to other stakeholders in the company. This is more than getting a sign-off. • Reality is far more important than your ability to cast blame. • Your task is to manage project-level risks. This includes risks to the project’s cost, schedule, and interpersonal dynamics, as well as to the product’s feature set and reliability.
  76. Deciding Your Depth of Testing You have three key objectives: 1. Achieve a uniform and well-understood minimum level of testing. 2. Be explicit about the level of testing for each area: » mainstream » exploratory » formally planned » structured regression 3. Reach corporate agreement on the level of testing for each area. Just like bug deferrals, depth-of-testing restrictions must rest on sound business decisions. This should not be your decision to make.
  77. Identify the Testing Tasks Develop a project task list with your staff. Remember, you are looking for realism in estimates of complexity, difficulty, and time. • Make the listing and estimation process public, and allow public comment. • Identify all potential sources of information. • List all main functional areas, and all other cross-functional approaches that you will take in testing the program. • List every task that appears to need more than 1/2 day. (Probably you will do this by a group brainstorming session on flipcharts, listing sources on one chart. Gather the sources. List areas and approaches on a few charts. Make one chart for each area of work, and list tasks for each chart. Tentatively assign times to each task, possibly by a museum tour.)
  78. Estimate the Project Time 1. Assign time estimates to each task. Invite programmers, marketers, and project managers to walk through the charts and provide their own estimates. Explore the bases of differences. You might have underestimated a task’s complexity. Or you might be seeing a priority disagreement. Or wishful thinking. 2. Make your best-estimate task list. Provide the time needed to achieve formal, planned testing for each area. Include estimates for structured regression for key areas. This number is probably absurdly huge. 3. Add to the list your suggestions for time-cuts to guerrilla-level and mainstream-level testing for selected areas. 4. Circulate your lists to all stakeholders, call one or more planning meetings, and reach agreements on the level of testing for each area. Keep cutting tasks and/or adding staff until you reach an achievable result.
  79. Estimate the Project Time Don’t forget: • Budget for meetings and administrative overhead (10-30%). • Budget for bug regression (5-15%). • Budget for down time. • Budget for holidays and vacations. • Never develop a budget that relies on overtime. Leave the overtime free for discretionary additional work by the individual tester. • Don’t let yourself be bullied or embarrassed, and don’t bully or embarrass your staff, into making underestimates. To achieve a result, insist on cutting tasks, adding staff, or finding realistic ways to improve productivity.
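As a worked illustration of how these overheads compound, here is a tiny sketch with mid-range percentages chosen purely for the example:

```python
raw_task_hours = 1000                      # sum of the estimated task times
meetings       = 0.20 * raw_task_hours     # administrative overhead (10-30%)
bug_regression = 0.10 * raw_task_hours     # bug regression (5-15%)
down_time      = 0.05 * raw_task_hours     # assumed allowance for down time

total = raw_task_hours + meetings + bug_regression + down_time
print(total)   # 1350.0 hours, before holidays, vacations, and any buffer
```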
  80. Identify Dependencies & Milestones You can’t start reviewing the manual until you receive it, right? List these dependencies and point out, in every project meeting, what is due to you that is not yet delivered and is holding you up. Try to reach verifiable, realistic definitions for each milestone. For example, if “gamma” is 6 weeks before release, the product can’t have more than 6 weeks of schedule risk at gamma. Negotiate a definition.
  81. Monitor Progress • Put the tasks and agreed-to durations on a timeline and review the progress of all staff every week. • When prerequisite materials aren’t provided, find other tasks to do instead. When you can’t meet the schedule, due to absent materials, publish new estimates. • When design changes invalidate your estimates, publish new estimates. • If you’re falling consistently behind (e.g. due to underestimated overhead), publish the problem and ask for fewer tasks, more staff, or some other realistic solution. • If one of your staff has trouble with an area, recognize it and deal with it.
  82. Control Late-Stage Issues • Watch out for late changes. Encourage people to be cautious about late bug fixes, and super-cautious about late design changes. • Provide interim and final deferred-bug reviews. • Take a final inventory of your testing coverage. • Carry out final integrity tests. • You may have to assess and report the reliability of the tested product. • Plan for post-release processes. Develop a release kit for product support. • Don’t forget the party.
  84. 84. Black Box Software Testing Part 19. Metrics and Measurement Theory Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 610
  85. 85. We Aren’t Collecting Data. Capers Jones, Patterns of Software Systems Failure & Success: The number one root cause of cancellations, schedule slippages, and cost overruns is the chronic failure of the software industry to collect accurate historical data from ongoing and completed projects. This failure means that the vast majority of major software projects are begun without anyone having a solid notion of how much time will be required. Software is perhaps the only technical industry where neither clients, managers, nor technical staff have any accurate quantitative data available to them from similar projects when beginning major construction activities. . . . A result that is initially surprising but quite common across the industry is to discover that the software management community within a company knows so little about the technology of software planning and estimating that they do not even know of the kinds of tools that are commercially available. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 611
  86. 86. Question Imagine being on the job. Your local PBH (pointy-haired boss) drops in and asks: “So, how much testing have you gotten done?” Please write down an answer. Feel free to use fictitious numbers but (except for the numbers) try to be realistic in terms of the type of information that you would provide. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 612
  87. 87. How do we Measure Extent of Testing? Before we can measure something, we need some sense of what we’re measuring. It’s easy to come up with “measurements” but we have to understand the relationship between the thing we want to measure and the statistic that we calculate to “measure” it. If we want to measure the “extent of testing”, we have to start by understanding what we mean by “extent of testing.” Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 613
  88. 88. What is measurement? Measurement is the assignment of numbers to objects or events according to a rule derived from a model or theory. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 614
  89. 89. The Question is Remarkably Ambiguous Common answers are based on the: Product • We’ve tested 80% of the lines of code. Plan • We’ve run 80% of the test cases. Results • We’ve discovered 593 bugs. Effort • We’ve worked 80 hours a week on this for 4 months. We’ve run 7,243 tests. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 615
  90. 90. The Question is Remarkably Ambiguous Common answers are based on the: Obstacles • We’ve been plugging away but we can’t be efficient until X, Y, and Z are dealt with. Risks • We’re getting a lot of complaints from beta testers and we have 400 bugs open. The product can’t be ready to ship in three days. Quality of • Beta testers have found 30 bugs that we Testing missed. Our regression tests seem ineffective. History • At this milestone on previous projects, we across had fewer than 12.3712% of the bugs found projects still open. We should be at that percentage on this product too. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 616
  91. 91. A Framework for Measurement A measurement involves at least 10 factors: • Attribute to be measured » appropriate scale for the attribute » variation of the attribute • Instrument that measures the attribute » scale of the instrument » variation of measurements made with this instrument • Relationship between the attribute and the instrument • Likely side effects of using this instrument to measure this attribute • Purpose • Scope Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 617
92. 92. Framework for Measurement
• Attribute: Extent of testing – What does that mean?
• Instrument: What should we count? Lines? Bugs? Test cases? Hours? Temper tantrums?
• Mechanism: How will increasing “extent of testing” affect the reading (the measure) on the instrument?
• Side effect: If we do something that makes the measured result look better, will that mean that we’ve actually increased the extent of testing?
• Purpose: Why are we measuring this? What will we do with the number?
• Scope: Are we measuring the work of one tester? One team on one project? Is this a cross-project metrics effort? Cross-departmental research?
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 618
  93. 93. Simple measurement You have a room full of tables that appear to be the same length. You want to measure their lengths. You have a one-foot ruler. You use the ruler to measure the lengths of a few tables. You get: • 6.01 feet • 5.99 feet • 6.05 feet You conclude that the tables are “6 feet” long. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 619
94. 94. Simple measurement (2) Note the variation: • Measurement errors using the ruler • Manufacturing variation in the tables Note the rule: • We are relying on a direct matching operation and on some basic axioms of mathematics » The sum of 6 one-foot ruler-lengths is 6 feet. » A table that is 6 ruler-lengths long is twice as long as one that is 3 ruler-lengths long. These rules don’t always apply. What do we do when we have something hard to measure? Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 620
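To make the ruler example concrete, here is a minimal Python sketch using the readings from the previous slide (treating the sample standard deviation as the spread statistic is our assumption; the slide itself does not prescribe one):

```python
import statistics

# Three readings of nominally identical tables, in feet (from the slide)
readings = [6.01, 5.99, 6.05]

estimate = statistics.mean(readings)  # best single estimate of "table length"
spread = statistics.stdev(readings)   # mixes ruler error with table-to-table variation

print(f"estimated length: {estimate:.3f} ft")  # -> 6.017 ft
print(f"sample std dev:   {spread:.3f} ft")    # -> 0.031 ft
```

Note that the spread alone cannot tell you how much comes from the instrument (the ruler) and how much from the attribute (the tables); separating the two requires exactly the kind of model the measurement framework asks for.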
95. 95. Attributes and Instruments
Attribute | Instrument
• Length | Ruler
• Duration | Stopwatch
• Speed | Ruler / Stopwatch
• Sound energy | Sound level meter
• Loudness | Sound level comparisons by humans
• Tester goodness ??? | Bug count ???
• Code complexity ??? | Branches ???
• Extent of testing ??? | ???
» Product coverage ?? | Count statements / branches tested ??
» Proportion of bugs found ?? | Count bug reports or graph bug curves ??
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 621
  96. 96. Surrogate measures "Many of the attributes we wish to study do not have generally agreed methods of measurement. To overcome the lack of a measure for an attribute, some factor which can be measured is used instead. This alternate measure is presumed to be related to the actual attribute with which the study is concerned. These alternate measures are called surrogate measures." Mark Johnson’s MA Thesis “Surrogates” provide unambiguous assignments of numbers according to rules, but they don’t provide an underlying theory or model that relates the measure to the attribute allegedly being measured. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 622
  97. 97. Consider bug counts • Do bug counts measure testers? • Do bug counts measure thoroughness of testing? • Do bug counts measure the effectiveness of an automation effort? • Do bug counts measure how near we are to being ready to ship the product? How would we know? Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 623
98. 98. Bug counts and testers To evaluate an instrument that is supposed to measure an attribute, we have to ask two key questions: • What underlying mechanism, or fundamental relationship, justifies the use of the reading we take from this instrument as a measure of the attribute? If the attribute increases by 20%, what will happen to the reading? • What can we know from the instrument reading? How tightly is the reading traceable to the underlying attribute? If the reading increases by 20%, does this mean that the attribute has increased by 20%? If the linkage is not tight, we risk serious side effects as people push the reading (the “measurement”) up and down without improving the underlying attribute. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 624
  99. 99. Bug counts and testers: mechanism? Suppose we could improve testing by 20%. This might mean that: • We find more subtle bugs that are important but that require more thorough investigation and analysis • We create bug reports that are more thorough, better researched, more descriptive of the problem and therefore more likely to yield fixes. • We do superb testing of a critical area that turns out to be relatively stable. The bug counts might even go down, even though tester goodness has gone up. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 625
  100. 100. Bug Counts & Testers: Side effects What if you could increase the count of reported bugs by 20%? If you reward testers for higher bug counts, won’t you make changes like these more likely? • Testers report easier-to-find, more superficial bugs • Testers report multiple instances of the same bug • Programmers dismiss design bugs as non-bugs that testers put in the system to raise their bug counts • No one will work on the bug tracking system or other group infrastructure. • Testers become less willing to spend time coaching other testers. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 626
101. 101. Bug counts and extent of testing? [Figure: a curve of bugs found per week, plotted against week, captioned “What Is This Curve?”] Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 627
102. 102. Bug counts and extent of testing
• Attribute: Not sure. Maybe we’re thinking of the percentage found of the total population of bugs in this product.
• Instrument: Bugs found. (Variations: bugs found this week, etc.; various numbers based on the bug count.)
• Mechanism: If we increase the extent of testing, does that result in more bug reports? Not necessarily.
• Side effect: If we change testing to maximize the bug count, does that mean we’ve achieved more of the testing? Maybe in a trivial sense, but what if we’re finding lots of simple bugs at the expense of testing for a smaller number of harder-to-find serious bugs?
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 628
103. 103. Bug curves and extent of testing
• Attribute: We have a model of the rate at which new bugs will be found over the life of the project.
• Instrument: Bugs per week. A key thing that we look at is the agreement between the predictive curve and the actual bug counts.
• Mechanism: As we increase the extent of testing, will our bug numbers conform to the curve? Not necessarily. It depends on the bugs that are left in the product.
• Side effect: If we do something that makes the measured result look better, will that mean that we’ve actually increased the extent of testing? No, no, no. See the side-effect discussion.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 629
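To see what comparing a predictive curve against actual counts looks like, here is a minimal Python sketch. The Rayleigh-style discovery model, its parameters, and the weekly counts are all illustrative assumptions, not data from any real project or a recommended model:

```python
import math

# Invented weekly bug counts for a hypothetical project
actual = [4, 11, 19, 24, 22, 17, 12, 30, 5]

# Rayleigh-style model: N = assumed total bugs in the product,
# T_PEAK = assumed week of peak discovery (both made up here).
N, T_PEAK = 150, 4.0

def predicted(week: int) -> float:
    """Expected bugs found in the given (1-based) week under the model."""
    return N * (week / T_PEAK**2) * math.exp(-(week**2) / (2 * T_PEAK**2))

for week, found in enumerate(actual, start=1):
    print(f"week {week}: predicted {predicted(week):5.1f}, actual {found:3d}")
```

The comparison is diagnostic, not evaluative: a large gap (like the invented week-8 spike above) is a prompt to ask what changed in the product or the testing, not evidence by itself that testing is ahead of or behind schedule.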
104. 104. Side effects of bug curves Earlier in testing (pressure is to increase bug counts):
• Run tests of features known to be broken or incomplete.
• Run multiple related tests to find multiple related bugs.
• Look for easy bugs in high quantities rather than hard bugs.
• Put less emphasis on infrastructure, automation architecture, and tools, and more emphasis on bug finding. (Short-term payoff but long-term inefficiency.)
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 630
105. 105. Side effects of bug curves Later in testing (pressure is to decrease the new-bug rate):
• Run lots of already-run regression tests.
• Don’t look as hard for new bugs.
• Shift focus to appraisal and status reporting.
• Classify unrelated bugs as duplicates.
• Classify related bugs as duplicates (and closed), hiding key data about the symptoms / causes of the problem.
• Postpone bug reporting until after the measurement checkpoint (milestone). (Some bugs are lost.)
• Report bugs informally, keeping them out of the tracking system.
• Testers get sent to the movies before measurement checkpoints.
• Programmers ignore bugs they find until testers report them.
• Bugs are taken personally.
• More bugs are rejected.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 631
106. 106. Bug curve counterproductive? [Figure: a steadily declining curve of bugs per week, plotted against week, captioned “Shouldn’t We Strive For This?”] Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 632
  107. 107. Code coverage Coverage measures the amount of testing done of a certain type. Since testing is done to find bugs, coverage is a measure of your effort to detect a certain class of potential errors: • 100% line coverage means that you tested for every bug that can be revealed by simple execution of a line of code. • 100% branch coverage means you will find every error that can be revealed by testing each branch. • 100% coverage should mean that you tested for every possible error. This is obviously impossible. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 633
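As a minimal, hypothetical illustration of how weak that guarantee is, the Python function below reaches 100% statement coverage with a single test, yet the test misses a crash that only a special input reveals:

```python
def average_per_item(total: float, count: int) -> float:
    # Both statements below execute for the test case, so a coverage
    # tool reports 100% statement coverage for this function.
    rate = total / count
    return round(rate, 2)

# One passing test is enough for "complete" statement coverage...
assert average_per_item(10.0, 4) == 2.5

# ...but the special value count=0 still crashes:
#   average_per_item(10.0, 0)  ->  ZeroDivisionError
```

The bug lives in the input space, not in an unexecuted line, so no additional statement or branch coverage would flag it.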
  108. 108. Benefits of coverage Before I attack coverage measures, let me acknowledge that they are often useful. • Many projects achieve low statement coverage, as little as 2% at one well-known publisher that had done (as measured by tester-hours) extensive testing and test automation. The results from checking statement coverage caused a complete rethinking of the company’s automation strategy. • Coverage tools can point to unreachable code or to code that is active but persistently untested. Coverage tools can provide powerful diagnostic information about the testing strategy, even if they are terrible measures of the extent of testing. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 634
109. 109. Analysis: Statement/Branch Coverage
• Attribute: Extent of testing – How much of the product have we tested?
• Instrument: Count statements and branches tested.
• Mechanism: If we do more testing and find more bugs, does that mean that our line count will increase? Not necessarily. Example: configuration tests.
• Side effect: If we design our tests to make sure we hit more lines, does that mean we’ll have done more extensive testing? Maybe in a trivial sense, but we can achieve this with weaker tests that find fewer bugs.
• Purpose: Not specified.
• Scope: Not specified.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 635
110. 110. Statement / branch coverage just tests the flowchart. You’re not testing:
» data flow
» tables that determine control flow in table-driven code
» side effects of interrupts, or interaction with background tasks
» special values, such as boundary cases (these might or might not be tested)
» unexpected values (e.g. divide by zero)
» user interface errors
» timing-related bugs
» compliance with contracts, regulations, or other requirements
» configuration/compatibility failures
» volume, load, hardware faults
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 636
  111. 111. If we use “coverage”? • If we improve testing by 20%, does this result in a 20% increase in “coverage”? Does it necessarily result in ANY increase in “coverage”? • If we increase “coverage” by 20%, does this mean that there was a 20% improvement in the testing? • If we achieve 100% “coverage”, do we really think we’ve found all the bugs? Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 637
  112. 112. Side effects and “coverage” • Without a mechanism that ties changes in the attribute being measured to changes in the reading we get from the instrument, we have a “measure” that is ripe for abuse. • People will optimize what is tracked. If you track “coverage”, the coverage number will go up, but (as Marick has often pointed out) the quality of testing might go down. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 638
113. 113. Having framed the problem . . . Do I have a valid, useful measure of the extent of testing? • Nope, not yet. • The development and validation of a field’s measures takes time. In the meantime, what do we do? • We still have to manage projects, monitor progress, and make decisions based on what’s going on, so ignoring the measurement question is not an option. • Let’s look at strategies of some seasoned test managers and testers. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 639
  114. 114. One approach: Balanced scorecard “Coverage” measures are popular because they provide management with essential (if incorrect) feedback about the progress of testing. Rather than reporting a single not-very-representative measure of testing progress, consider adopting a “balanced scorecard” approach. Report: • a small number (maybe 10) of different measures, • none of them perfect, • all of them different from each other, and • all of them reporting progress that is meaningful to you. Together, these show a pattern that might more accurately reflect progress. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 640
115. 115. One Approach: Balanced Scorecard • For 101 examples of possible coverage measures that might be suitable for balancing, see “Software Negligence and Testing Coverage” at www.kaner.com. These are merged into a list with over 100 additional indicators of the extent of testing in the paper “Measurement Issues & Software Testing”, which is included in the proceedings. • Robert Austin criticizes even this balanced-scorecard approach: it will still lead to abuse if the measures don’t balance each other out. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 641
116. 116. “Extent” as a Multidimensional Problem We developed the 8 aspects (or dimensions) of “extent of testing” by looking at the types of measures of extent of testing that we were reporting. The accompanying paper reports a few hundred “measures” that fit in the 8 categories:
• product coverage
• plan / agreement
• effort
• results
• obstacles
• risks
• quality of testing
• project history
ALL of them have problems.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 642
117. 117. “Extent” as a Multidimensional Problem So, look at testing progress reports actually used in the field. Do we see a focus on these individual measures? • Often, NOT. • Instead, we see reports that show a pattern of information of different types, to give the reader / listener a sense of the overall flow of the testing project. • There are several examples in the paper; here are two of them: » Hendrickson’s component map » Bach’s project dashboard Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 643
118. 118. Project Report / Component Map (Hendrickson)
Page 1 --- Issues that need management attention
Page 2 --- Component map
Page 3 --- Bug statistics
The component map is a table with one row per component and these columns:
Component | Test Type | Tester | Total Tests Planned / Created | Tests Passed / Failed / Blocked | Time Budget | Time Spent | Projected for Next Build | Notes
We see in this report: progress against plan, obstacles / risks, effort, and results.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 644
119. 119. Bach’s Dashboard
Testing Dashboard (updated 11/1/00, build 32)
Area      | Effort  | Coverage Planned | Coverage Achieved | Quality | Comments
File/edit | High    | High             | Low               | 😐      | 1345, 1410
View      | Low     | Med              | Med               | 🙂      |
Insert    | Blocked | Med              | Low               | 🙁      | 1621
We see coverage of areas, progress against plan, current effort, key results and risks, and obstacles.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 645
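If you keep a dashboard like this as data rather than as a slide, a minimal sketch might look like the following. The field names mirror the dashboard’s columns; the class and its rendering are our assumptions, not part of Bach’s format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DashboardRow:
    area: str                  # product area, e.g. "File/edit"
    effort: str                # "High", "Low", "Blocked", ...
    coverage_planned: str      # coarse levels, deliberately not percentages
    coverage_achieved: str
    quality: str               # a human judgment, e.g. a smiley
    comments: Optional[str] = None   # e.g. IDs of blocking bugs

rows = [
    DashboardRow("File/edit", "High", "High", "Low", "😐", "1345, 1410"),
    DashboardRow("View", "Low", "Med", "Med", "🙂"),
    DashboardRow("Insert", "Blocked", "Med", "Low", "🙁", "1621"),
]

for r in rows:
    print(f"{r.area:10} effort={r.effort:8} planned={r.coverage_planned:5} "
          f"achieved={r.coverage_achieved:5} quality={r.quality} {r.comments or ''}")
```

The information is in the pattern across columns, not in any single number, which is the multidimensional point of the preceding slides.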
  120. 120. Suggested Lessons • Bug count metrics cover only a narrow slice of testing progress. There are LOTS of alternatives. • Simple charts can carry a lot of useful information and lead you to a lot of useful questions. • Report multidimensional patterns, rather than single measures or a few measures along the same line. • Think carefully about the potential side effects of your measures. • Listen critically to reports (case studies) of success with simple metrics. If you can’t see the data and don’t know how the data were actually collected, you might well be looking at results that were sanitized by working staff (as a side effect of the imposition of the measurement process). Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 646
121. 121. Notes Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 647
  122. 122. Black Box Software Testing Part 20. Defect Life Cycles Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 648
  123. 123. Why Track Defects? • Get things that should be fixed, fixed • Capture valuable information • Identify responsibilities • Provide data about quality • Focal point for decision making Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 649
  124. 124. Tasks In Defect Tracking • Quickly notify appropriate people • Avoid forgetting about errors • Assure important issues get resolved • Minimize errors going unfixed due to poor communication Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 650
  125. 125. What Are We Tracking? • Symptoms of possible errors • Severity of possible errors • Priority for fixing errors • Events related to the reported error • No duplicate reports (for single error) • Responsibilities • Resolutions Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 651
  126. 126. Forces Acting on Defect Tracking • Severity and Priority • Features and Quality • Duplicate reports • Project status reporting • Project management / Developer tasking • Metrics • Issue tracking and incident reporting • Irreproducible problems • Enhancement requests • Design errors • Deferring problems • Branches and release versions Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 652
127. 127. Problem of Similar Bugs
                     You File a Defect Report         You Ignore It
It’s a New Defect    It Gets Fixed                    Never Fixed (Consumer Risk)
Same Old Defect      Waste of Time (Producer Risk)    Gets Fixed Anyway
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 653
128. 128. Defect Tracking Activities • Submitting • Assigning • Fixing • Verifying • Resolving • Administering Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 654
129. 129. Example Defect Life Cycle
Issue perceived → Entered → Submitted → Accepted → Assigned → Finished → Fixed → Fix verified → Closed
• A report may instead be resolved as: Not an issue, Duplicate, Will not fix, Cannot reproduce, or Withdrawn.
• Fix Later → Deferred; when it’s later, the deferred bug re-enters the cycle.
• Not fixed (the fix fails verification) → back to Assigned.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 655
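Here is a minimal sketch of how a tracking tool might enforce such a life cycle (the state names follow the slide; the transition table and the decision to route all rejection resolutions straight to Closed are simplifying assumptions, not a description of any particular tool):

```python
from enum import Enum, auto

class State(Enum):
    ENTERED = auto()
    SUBMITTED = auto()
    ACCEPTED = auto()
    ASSIGNED = auto()
    FINISHED = auto()
    DEFERRED = auto()
    FIXED = auto()
    CLOSED = auto()

# Allowed transitions, following the slide's flow. Rejection resolutions
# (not an issue, duplicate, will not fix, cannot reproduce, withdrawn)
# all lead straight to CLOSED here -- a simplification.
TRANSITIONS = {
    State.ENTERED:   {State.SUBMITTED},
    State.SUBMITTED: {State.ACCEPTED, State.DEFERRED, State.CLOSED},
    State.ACCEPTED:  {State.ASSIGNED},
    State.ASSIGNED:  {State.FINISHED},
    State.FINISHED:  {State.FIXED, State.ASSIGNED},   # "not fixed" loops back
    State.DEFERRED:  {State.ASSIGNED},                # "it's later"
    State.FIXED:     {State.CLOSED, State.ASSIGNED},  # close on verify, reopen otherwise
}

class Defect:
    def __init__(self):
        self.state = State.ENTERED

    def move_to(self, new_state: State) -> None:
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition: {self.state.name} -> {new_state.name}")
        self.state = new_state

bug = Defect()
for step in (State.SUBMITTED, State.ACCEPTED, State.ASSIGNED,
             State.FINISHED, State.FIXED, State.CLOSED):
    bug.move_to(step)
print(bug.state)  # State.CLOSED
```

The point of encoding the transitions explicitly is that illegal moves (for example, closing a bug that was never submitted) fail loudly instead of silently corrupting the project’s status data.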
  130. 130. Defect Resolutions • Common resolutions include: • Pending: the bug is still being worked on. • Fixed: the programmer says it’s fixed. Now you should check it. • Cannot reproduce: The programmer can’t make the failure happen. You must add details, reset the resolution to Pending, and notify the programmer. • Deferred: It’s a bug, but we’ll fix it later. • As Designed: The program works as it’s supposed to. • Need Info: The programmer needs more info from you. She has probably asked a question in the comments. • Duplicate: This is just a repeat of another bug report (XREF it on this report.) Duplicates should not close until the duplicated bug closes. • Withdrawn: The tester who reported this bug is withdrawing the report. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 656
131. 131. Notes Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 657
  132. 132. Project Planning Part 21. Testing Impossible Software Under Impossible Deadlines Cem Kaner, Software Testing Analysis East, 1999 Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 658
  133. 133. Introduction I didn’t come up with the idea for this talk--the STAR folk suggested it as something that I might be able to provide a few insights into. After all, this is a chronic complaint among testers. . . . Well, here goes. How should you deal with a situation in which the software comes to you undocumented on a tight deadline? I think that the answer depends on the causes of the situation that you’re in. Your best solution might be technical or political or resume-ical. It depends on how you and the company got into this mess. If it is a mess. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 659
  134. 134. What Caused This Situation? So, let’s start with some questions about causation: • Why is the software undocumented? • Why do you have so little time? • What are the quality issues for this customer? Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 660
  135. 135. Why Do You Have So Little Time? A Few Possibilities • Time-to-release drives competition • Cash flow drives release date • Key fiscal milestone drives release date • Executive belief that you’ll never be satisfied, so your schedule input is irrelevant • Executive belief that testing group is out of control and needs to be controlled by tight budgets • Executive belief that you have the wrong priorities (e.g. paperwork rather than bugs) • Executive belief that testing is irrelevant • Structural lack of concern about the end customer, or • Maybe you have an appropriate amount of time. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 661
136. 136. Quality Issues For This Customer?
• In-house
• “In-house” outsourced
• External, custom
• External, large, packaged
• External, mass-market, packaged
• What is quality in this market? What are their costs of failure?
» User groups, Software stores, Sales calls, Support calls
Over the long term, a good understanding of quality in the market, and of how it affects your company’s costs, will buy you credibility and authority.
Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 662
  137. 137. Political Approaches: Buying Time Many of the ideas in this section will help you if you’re dealing with a single project that is out of control. If your problem is structural (reflects policy or standard practice of the company), then some of these ideas will be counter-productive. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 663
138. 138. Buying Time: Delaying a Premature Release -2 Take the high road • “I want to see knives in the chests, not in the backs.” » Trip Hawkins, founding President, Electronic Arts • Communicate honestly. • Avoid springing surprises on people. • Never sabotage the project. • Don’t become The Adversary: if you are nasty and personal, you will become a tool, to be used by someone else. And you will be disposable. Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 664
139. 139. Buying Time: Delaying a Premature Release -3 • Search for show-stoppers. If you can, dedicate a specialist within your group to this task. • Circulate deferred bug lists broadly. • Consider writing mid-project and late-in-project assessments of: » extent of testing (by area) » extent of defects (by area, with predictions about the level of bugs left) » deferred bugs » likely out-of-box experience or reviewers’ experience Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 665
  140. 140. Buying Time: Delaying a Premature Release -4 Do regular, effective status reporting. List: • your group’s issues (deliverables due to you, things slowing or blocking your progress, etc., areas in which you are behind or ahead of plan) • project issues • bug statistics Find allies (think in terms of quality-related costs) Copyright © 1994-2004 Cem Kaner and SQM, LLC. All Rights Reserved. 666
