About the Authors

With over a quarter-century of software and systems engineering experience, Rex Black is President of Rex Black Consulting Services (www.rbcs-us.com), a leader in software, hardware, and systems testing. RBCS delivers consulting, outsourcing, and training services, employing the industry's most experienced and recognized consultants. RBCS worldwide clientele save time and money through improved product development, decreased tech support calls, improved corporate reputation, and more. Rex is the most prolific author practicing in the field of software testing today. His popular first book, Managing the Testing Process, has sold over 40,000 copies around the world, and is now in its third edition. His six other books—Advanced Software Testing: Volumes I, II, and III, Critical Testing Processes, Foundations of Software Testing, and Pragmatic Software Testing—have also sold tens of thousands of copies. He has written over thirty articles; presented hundreds of papers, workshops, and seminars; and given about fifty keynote and other speeches at conferences and events around the world. Rex is the immediate past President of the International Software Testing Qualifications Board (ISTQB) and a Director of the American Software Testing Qualifications Board (ASTQB).

Jamie L. Mitchell has over 28 years of experience in developing and testing both hardware and software. He is a pioneer in the test automation field and has worked with a variety of vendor and open-source test automation tools since the first Windows tools were released with Windows 3.0. He has also written test tools for several platforms. Jamie specializes in increasing the productivity of automation and test teams through innovative ideas and custom tool extensions. In addition, he provides training, mentoring, process auditing, and expert technical support in all aspects of testing and automation. Jamie holds a Master of Computer Science degree from Lehigh University in Bethlehem, PA, and a Certified Software Test Engineer certification from QAI. He was an instructor and board member of the International Institute of Software Testing (IIST) and a contributing editor, technical editor, and columnist for the Journal of Software Testing Professionals. He has been a frequent speaker on testing and automation at several international conferences, including STAR, QAI, and PSQT.
Rex Black · Jamie L. Mitchell

Advanced Software Testing—Vol. 3

Guide to the ISTQB Advanced Certification as an Advanced Technical Test Analyst
Rex Black
rex_black@rbcs-us.com

Jamie L. Mitchell
jamie@go-tac.com

Editor: Dr. Michael Barabas
Project manager: Matthias Rossmanith
Copyeditor: Judy Flynn
Layout and Type: Josef Hegele
Proofreader: James Johnson
Cover Design: Helmut Kraus, www.exclam.de
Printer: Courier
Printed in USA
ISBN: 978-1-933952-39-0
1st Edition © 2011 by Rex Black and Jamie L. Mitchell
16 15 14 13 12 11    1 2 3 4 5

Rocky Nook
802 East Cota Street, 3rd Floor
Santa Barbara, CA 93103
www.rockynook.com

Library of Congress Cataloging-in-Publication Data
Black, Rex, 1964-
Advanced software testing : guide to the ISTQB advanced certification as an advanced technical test analyst / Rex Black, Jamie L. Mitchell. 1st ed.
p. cm. (Advanced software testing)
ISBN 978-1-933952-19-2 (v. 1 : alk. paper)
ISBN 978-1-933952-36-9 (v. 2 : alk. paper)
1. Electronic data processing personnel, Certification. 2. Computer software, Examinations, Study guides. 3. Computer software, Testing. I. Title.
QA76.3.B548 2008
005.14 dc22
2008030162

Distributed by O'Reilly Media
1005 Gravenstein Highway North
Sebastopol, CA 95472-2811

All product names and services identified throughout this book are trademarks or registered trademarks of their respective companies. They are used throughout this book in editorial fashion only and for the benefit of such companies. No such uses, or the use of any trade name, is intended to convey endorsement or other affiliation with the book. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

This book is printed on acid-free paper.
Rex Black's Acknowledgements

A complete list of people who deserve thanks for helping me along in my career as a test professional would probably make up its own small book. Here I'll confine myself to those who had an immediate impact on my ability to write this particular book.

First of all, I'd like to thank my colleagues on the American Software Testing Qualifications Board and the International Software Testing Qualifications Board, and especially the Advanced Syllabus Working Party, who made this book possible by creating the process and the material from which this book grew. Not only has it been a real pleasure sharing ideas with and learning from each of the participants, but I have had the distinct honor of being elected president of both the American Software Testing Qualifications Board and the International Software Testing Qualifications Board twice. I spent two terms in each of these roles, and I continue to serve as a board member, as the ISTQB Governance Officer, and as a member of the ISTQB Advanced Syllabus Working Party. I look back with pride at our accomplishments so far, I look forward with pride to what we'll accomplish together in the future, and I hope this book serves as a suitable expression of the gratitude and professional pride I feel toward what we have done for the field of software testing.

Next, I'd like to thank the people who helped us create the material that grew into this book. Jamie Mitchell co-wrote the materials in this book, our Advanced Technical Test Analyst instructor-led training course, and our Advanced Technical Test Analyst e-learning course. These materials, along with related materials in the corresponding Advanced Test Analyst book and courses, were reviewed, re-reviewed, and polished with hours of dedicated assistance by José Mata, Judy McKay, and Pat Masters. In addition, James Nazar, Corne Kruger, John Lyerly, Bhavesh Mistry, and Gyorgy Racz provided useful feedback on the first draft of this book. The task of assembling the e-learning and live courseware from the constituent bits and pieces fell to Dena Pauletti, RBCS' extremely competent and meticulous systems engineer.

Of course, the Advanced syllabus could not exist without a foundation, specifically the ISTQB Foundation syllabus. I had the honor of working with that Working Party as well. I thank them for their excellent work over the years, creating the fertile soil from which the Advanced syllabus and thus this book sprang.

In the creation of the training courses and the materials that I contributed to this book, I have drawn on all the experiences I have had as an author, practitioner, consultant, and trainer. So, I have benefited from individuals too numerous to list. I thank those of you who have bought one of my previous books, for you contributed to my skills as a writer. I thank those of you who have worked with me on a project, for you have contributed to my abilities as a test manager, test analyst, and technical test analyst. I thank those of you who have hired me to work with you as a consultant, for you have given me the opportunity to learn from your organizations. I thank those of you who have taken a training course from me, for you have collectively taught me much more than I taught each of you. I thank my readers, colleagues, clients, and students, and hope that my contributions to you have repaid the debt of gratitude that I owe you.

For over a dozen years, I have run a testing services company, RBCS. From humble beginnings, RBCS has grown into an international consulting, training, and outsourcing firm with clients on six continents. While I have remained a hands-on contributor to the firm, over 100 employees, subcontractors, and business partners have been the driving force of our ongoing success. I thank all of you for your hard work for our clients. Without the success of RBCS, I could hardly avail myself of the luxury of writing technical books, which is a source of great pride but not a whole lot of money. Again, I hope that our mutual successes together have repaid the debt of gratitude that I owe each of you.

Finally, I thank my family, especially my wife, Laurel, and my daughters, Emma and Charlotte. Tolstoy was wrong: It is not true that all happy families are exactly the same. Our family life is quite hectic, and I know I miss a lot of it thanks to the endless travel and work demands associated with running a global testing services company and writing books. However, I've been able to enjoy seeing my daughters grow up as citizens of the world, with passports given to them before their first birthdays and full of stamps before they started losing their baby teeth. Laurel, Emma, and Charlotte, I hope the joys of December beach sunburns in the Australian summer sun of Port Douglas, learning to ski in the Alps, hikes in the Canadian Rockies, and other experiences that frequent flier miles and an itinerant father can provide, make up in some way for the limited time we share together. I have won the lottery of life to have such a wonderful family.

Jamie Mitchell's Acknowledgements

What a long, strange trip it's been. The last 25 years has taken me from being a bench technician, fixing electronic audio components, to this time and place, where I have cowritten a book on some of the most technical aspects of software testing. It's a trip that has been both shared and guided by a host of people that I would like to thank.

To the many at both Moravian College and Lehigh University who started me off in my "Exciting Career in Computers": for your patience and leadership that instilled in me a burning desire to excel, I thank you.

To Terry Schardt, who hired me as a developer but made me a tester, thanks for pushing me to the dark side. To Tom Mundt and Chuck Awe, who gave me an incredible chance to lead, and to Barindralal Pal, who taught me that to lead was to keep on learning new techniques, thank you.

To Dean Nelson, who first asked me to become a consultant, and Larry Decklever, who continued my training, many thanks. A shout-out to Beth and Jan, who participated with me in choir rehearsals at Joe Sensers when things were darkest. Have one on me.

To my colleagues at TCQAA, SQE, and QAI who gave me chances to develop a voice while I learned how to speak, my heartfelt gratitude. To the people I am working with at ISTQB and ASTQB: I hope to be worthy of the honor of working with you and expanding the field of testing. Thanks for the opportunity.

In my professional life, I have been tutored, taught, mentored, and shown the way by a host of people whose names deserve to be mentioned, but they are too abundant to recall. I would like to give all of you a collective thanks; I would be poorer for not knowing you.

To Rex Black for giving me a chance to coauthor the Advanced Technical Test Analyst course and this book: Thank you for your generosity and the opportunity to learn at your feet. For my partner in crime, Judy McKay: Even though our first tool attempt did not fly, I have learned a lot from you and appreciate both your patience and kindness. Hoist a fruity drink from me. To Laurel, Dena, and Michelle: Your patience with me is noted and appreciated. Thanks for being there.

And finally, to my family, who have seen so much less of me over the last 25 years than they might have wanted, as I strove to become all that I could be in my chosen profession: words alone cannot convey my thanks. To Beano, who spent innumerable hours helping me steal the time needed to get through school and set me on the path to here, my undying love and gratitude. To my loving wife, Susan, who covered for me at many of the real-life tasks while I toiled, trying to climb the ladder, my love and appreciation. I might not always remember to say it, but I do think it. And to my kids, Christopher and Kimberly, who have always been smarter than me but allowed me to pretend that I was the boss of them, thanks. Your tolerance and enduring support have been much appreciated.

Last, and probably least, to "da boys," Baxter and Boomer, Bitzi and Buster. Whether sitting in my lap while I was trying to learn how to test or sitting at my feet while writing this book, you guys have been my sanity check. You never cared how successful I was, as long as the doggie-chow appeared in your bowls, morning and night. Thanks.
Contents

Rex Black's Acknowledgements
Jamie Mitchell's Acknowledgements
Introduction

1 Test Basics
  1.1 Introduction
  1.2 Testing in the Software Lifecycle
  1.3 Specific Systems
  1.4 Metrics and Measurement
  1.5 Ethics
  1.6 Sample Exam Questions

2 Testing Processes
  2.1 Introduction
  2.2 Test Process Models
  2.3 Test Planning and Control
  2.4 Test Analysis and Design
    2.4.1 Non-functional Test Objectives
    2.4.2 Identifying and Documenting Test Conditions
    2.4.3 Test Oracles
    2.4.4 Standards
    2.4.5 Static Tests
    2.4.6 Metrics
  2.5 Test Implementation and Execution
    2.5.1 Test Procedure Readiness
    2.5.2 Test Environment Readiness
    2.5.3 Blended Test Strategies
    2.5.4 Starting Test Execution
    2.5.5 Running a Single Test Procedure
    2.5.6 Logging Test Results
    2.5.7 Use of Amateur Testers
    2.5.8 Standards
    2.5.9 Metrics
  2.6 Evaluating Exit Criteria and Reporting
    2.6.1 Test Suite Summary
    2.6.2 Defect Breakdown
    2.6.3 Confirmation Test Failure Rate
    2.6.4 System Test Exit Review
    2.6.5 Standards
    2.6.6 Evaluating Exit Criteria and Reporting Exercise
    2.6.7 System Test Exit Review
    2.6.8 Evaluating Exit Criteria and Reporting Exercise Debrief
  2.7 Test Closure Activities
  2.8 Sample Exam Questions

3 Test Management
  3.1 Introduction
  3.2 Test Management Documentation
  3.3 Test Plan Documentation Templates
  3.4 Test Estimation
  3.5 Scheduling and Test Planning
  3.6 Test Progress Monitoring and Control
  3.7 Business Value of Testing
  3.8 Distributed, Outsourced, and Insourced Testing
  3.9 Risk-Based Testing
    3.9.1 Risk Management
    3.9.2 Risk Identification
    3.9.3 Risk Analysis or Risk Assessment
    3.9.4 Risk Mitigation or Risk Control
    3.9.5 An Example of Risk Identification and Assessment Results
    3.9.6 Risk-Based Testing throughout the Lifecycle
    3.9.7 Risk-Aware Testing Standards
    3.9.8 Risk-Based Testing Exercise 1
    3.9.9 Risk-Based Testing Exercise Debrief 1
    3.9.10 Project Risk By-Products
    3.9.11 Requirements Defect By-Products
    3.9.12 Risk-Based Testing Exercise 2
    3.9.13 Risk-Based Testing Exercise Debrief 2
    3.9.14 Test Case Sequencing Guidelines
  3.10 Failure Mode and Effects Analysis
  3.11 Test Management Issues
  3.12 Sample Exam Questions

4 Test Techniques
  4.1 Introduction
  4.2 Specification-Based
    4.2.1 Equivalence Partitioning
      4.2.1.1 Avoiding Equivalence Partitioning Errors
      4.2.1.2 Composing Test Cases with Equivalence Partitioning
      4.2.1.3 Equivalence Partitioning Exercise
      4.2.1.4 Equivalence Partitioning Exercise Debrief
    4.2.2 Boundary Value Analysis
      4.2.2.1 Examples of Equivalence Partitioning and Boundary Values
      4.2.2.2 Non-functional Boundaries
      4.2.2.3 A Closer Look at Functional Boundaries
      4.2.2.4 Integers
      4.2.2.5 Floating Point Numbers
      4.2.2.6 Testing Floating Point Numbers
      4.2.2.7 How Many Boundaries?
      4.2.2.8 Boundary Value Exercise
      4.2.2.9 Boundary Value Exercise Debrief
    4.2.3 Decision Tables
      4.2.3.1 Collapsing Columns in the Table
      4.2.3.2 Combining Decision Table Testing with Other Techniques
      4.2.3.3 Nonexclusive Rules in Decision Tables
      4.2.3.4 Decision Table Exercise
      4.2.3.5 Decision Table Exercise Debrief
    4.2.4 State-Based Testing and State Transition Diagrams
      4.2.4.1 Superstates and Substates
      4.2.4.2 State Transition Tables
      4.2.4.3 Switch Coverage
      4.2.4.4 State Testing with Other Techniques
      4.2.4.5 State Testing Exercise
      4.2.4.6 State Testing Exercise Debrief
    4.2.5 Requirements-Based Testing Exercise
    4.2.6 Requirements-Based Testing Exercise Debrief
  4.3 Structure-Based
    4.3.1 Control-Flow Testing
      4.3.1.1 Building Control-Flow Graphs
      4.3.1.2 Statement Coverage
      4.3.1.3 Decision Coverage
      4.3.1.4 Loop Coverage
      4.3.1.5 Hexadecimal Converter Exercise
      4.3.1.6 Hexadecimal Converter Exercise Debrief
      4.3.1.7 Condition Coverage
      4.3.1.8 Decision/Condition Coverage
      4.3.1.9 Modified Condition/Decision Coverage (MC/DC)
      4.3.1.10 Multiple Condition Coverage
      4.3.1.11 Control-Flow Exercise
      4.3.1.12 Control-Flow Exercise Debrief
    4.3.2 Path Testing
      4.3.2.1 LCSAJ
      4.3.2.2 Basis Path/Cyclomatic Complexity Testing
      4.3.2.3 Cyclomatic Complexity Exercise
      4.3.2.4 Cyclomatic Complexity Exercise Debrief
    4.3.3 A Final Word on Structural Testing
    4.3.4 Structure-Based Testing Exercise
    4.3.5 Structure-Based Testing Exercise Debrief
  4.4 Defect- and Experience-Based
    4.4.1 Defect Taxonomies
    4.4.2 Error Guessing
    4.4.3 Checklist Testing
    4.4.4 Exploratory Testing
      4.4.4.1 Test Charters
      4.4.4.2 Exploratory Testing Exercise
      4.4.4.3 Exploratory Testing Exercise Debrief
    4.4.5 Software Attacks
      4.4.5.1 An Example of Effective Attacks
      4.4.5.2 Other Attacks
      4.4.5.3 Software Attack Exercise
      4.4.5.4 Software Attack Exercise Debrief
    4.4.6 Specification-, Defect-, and Experience-Based Exercise
    4.4.7 Specification-, Defect-, and Experience-Based Exercise Debrief
    4.4.8 Common Themes
  4.5 Static Analysis
    4.5.1 Complexity Analysis
    4.5.2 Code Parsing Tools
    4.5.3 Standards and Guidelines
    4.5.4 Data-Flow Analysis
    4.5.5 Set-Use Pairs
    4.5.6 Set-Use Pair Example
    4.5.7 Data-Flow Exercise
    4.5.8 Data-Flow Exercise Debrief
    4.5.9 Data-Flow Strategies
    4.5.10 Static Analysis for Integration Testing
    4.5.11 Call-Graph Based Integration Testing
    4.5.12 McCabe Design Predicate Approach to Integration Testing
    4.5.13 Hex Converter Example
    4.5.14 McCabe Design Predicate Exercise
    4.5.15 McCabe Design Predicate Exercise Debrief
  4.6 Dynamic Analysis
    4.6.1 Memory Leak Detection
    4.6.2 Wild Pointer Detection
    4.6.3 API Misuse Detection
  4.7 Sample Exam Questions

5 Tests of Software Characteristics
  5.1 Introduction
  5.2 Quality Attributes for Domain Testing
    5.2.1 Accuracy
    5.2.2 Suitability
    5.2.3 Interoperability
    5.2.4 Usability
    5.2.5 Usability Test Exercise
    5.2.6 Usability Test Exercise Debrief
  5.3 Quality Attributes for Technical Testing
    5.3.1 Technical Security
    5.3.2 Security Issues
    5.3.3 Timely Information
    5.3.4 Reliability
    5.3.5 Efficiency
    5.3.6 Multiple Flavors of Efficiency Testing
    5.3.7 Modeling the System
    5.3.8 Efficiency Measurements
    5.3.9 Examples of Efficiency Bugs
    5.3.10 Exercise: Security, Reliability, and Efficiency
    5.3.11 Exercise: Security, Reliability, and Efficiency Debrief
    5.3.12 Maintainability
    5.3.13 Subcharacteristics of Maintainability
    5.3.14 Portability
    5.3.15 Maintainability and Portability Exercise
    5.3.16 Maintainability and Portability Exercise Debrief
  5.4 Sample Exam Questions

6 Reviews
  6.1 Introduction
  6.2 The Principles of Reviews
  6.3 Types of Reviews
  6.4 Introducing Reviews
  6.5 Success Factors for Reviews
    6.5.1 Deutsch's Design Review Checklist
    6.5.2 Marick's Code Review Checklist
    6.5.3 The OpenLaszlo Code Review Checklist
  6.6 Code Review Exercise
  6.7 Code Review Exercise Debrief
  6.8 Deutsch Checklist Review Exercise
  6.9 Deutsch Checklist Review Exercise Debrief
  6.10 Sample Exam Questions

7 Incident Management
  7.1 Introduction
  7.2 When Can a Defect Be Detected?
  7.3 Defect Lifecycle
  7.4 Defect Fields
  7.5 Metrics and Incident Management
  7.6 Communicating Incidents
  7.7 Incident Management Exercise
  7.8 Incident Management Exercise Debrief
  7.9 Sample Exam Questions

8 Standards and Test Process Improvement

9 Test Tools and Automation
  9.1 Introduction
  9.2 Test Tool Concepts
    9.2.1 The Business Case for Automation
    9.2.2 General Test Automation Strategies
    9.2.3 An Integrated Test System Example
  9.3 Test Tool Categories
    9.3.1 Test Management Tools
    9.3.2 Test Execution Tools
    9.3.3 Debugging, Troubleshooting, Fault Seeding, and Injection Tools
    9.3.4 Static and Dynamic Analysis Tools
    9.3.5 Performance Testing Tools
    9.3.6 Monitoring Tools
    9.3.7 Web Testing Tools
    9.3.8 Simulators and Emulators
  9.4 Keyword-Driven Test Automation
    9.4.1 Capture/Replay Exercise
    9.4.2 Capture/Replay Exercise Debrief
    9.4.3 Evolving from Capture/Replay
    9.4.4 The Simple Framework Architecture
    9.4.5 Data-Driven Architecture
    9.4.6 Keyword-Driven Architecture
    9.4.7 Keyword Exercise
    9.4.8 Keyword Exercise Debrief
  9.5 Performance Testing
    9.5.1 Performance Testing Exercise
    9.5.2 Performance Testing Exercise Debrief
  9.6 Sample Exam Questions

10 People Skills and Team Composition
  10.1 Introduction
  10.2 Individual Skills
  10.3 Test Team Dynamics
  10.4 Fitting Testing within an Organization
  10.5 Motivation
  10.6 Communication
  10.7 Sample Exam Questions

11 Preparing for the Exam
  11.1 Learning Objectives
    11.1.1 Level 1: Remember (K1)
    11.1.2 Level 2: Understand (K2)
    11.1.3 Level 3: Apply (K3)
    11.1.4 Level 4: Analyze (K4)
    11.1.5 Where Did These Levels of Learning Objectives Come From?
  11.2 ISTQB Advanced Exams
    11.2.1 Scenario-Based Questions
    11.2.2 On the Evolution of the Exams

Appendix A – Bibliography
  11.2.3 Advanced Syllabus Referenced Standards
  11.2.4 Advanced Syllabus Referenced Books
  11.2.5 Other Referenced Books
  11.2.6 Other References

Appendix B – HELLOCARMS: The Next Generation of Home Equity Lending
  System Requirements Document
  I Table of Contents
  II Versioning
  III Glossary
  000 Introduction
  001 Informal Use Case
  003 Scope
  004 System Business Benefits
  010 Functional System Requirements
  020 Reliability System Requirements
  030 Usability System Requirements
  040 Efficiency System Requirements
  050 Maintainability System Requirements
  060 Portability System Requirements
  A Acknowledgement

Appendix C – Answers to Sample Questions

Index
Introduction

This is a book on advanced software testing for technical test analysts. By that we mean that we address topics that a technical practitioner who has chosen software testing as a career should know. We focus on those skills and techniques related to test analysis, test design, test tools and automation, test execution, and test results evaluation. We take these topics in a more technical direction than in the earlier volume for test analysts by including details of test design using structural techniques and details about the use of dynamic analysis to monitor internal status. We assume that you know the basic concepts of test engineering, test design, test tools, testing in the software development lifecycle, and test management. You are ready to mature your level of understanding of these concepts and to apply these advanced concepts to your daily work as a test professional.

This book follows the International Software Testing Qualifications Board's (ISTQB) Advanced Level Syllabus, with a focus on the material and learning objectives for the advanced technical test analyst. As such, this book can help you prepare for the ISTQB Advanced Level Technical Test Analyst exam. You can use this book to self-study for this exam or as part of an e-learning or instructor-led course on the topics covered in those exams. If you are taking an ISTQB-accredited Advanced Level Technical Test Analyst training course, this book is an ideal companion text for that course.

However, even if you are not interested in the ISTQB exams, you will find this book useful to prepare yourself for advanced work in software testing. If you are a test manager, test director, test analyst, technical test analyst, automated test engineer, manual test engineer, programmer, or in any other field where a sophisticated understanding of software testing is needed—especially an understanding of the particularly technical aspects of testing such as white-box testing and test automation—then this book is for you.

This book focuses on technical test analysis. It consists of 11 chapters, addressing the following material:

1. Basic aspects of software testing
2. Testing processes
3. Test management
4. Test techniques
5. Testing of software characteristics
6. Reviews
7. Incident (defect) management
8. Standards and test process improvement
9. Test tools and automation
10. People skills (team composition)
11. Preparing for the exam

Since the structure follows the structure of the ISTQB Advanced syllabus, some of the chapters address the material in great detail because they are central to the technical test analyst role. Some of the chapters address the material in less detail because the technical test analyst need only be familiar with it. For example, we cover test techniques in detail—including highly technical techniques like structure-based testing, dynamic analysis, and test automation—because these are central to what a technical test analyst does, while we spend less time on test management and no time at all on test process improvement.

If you have already read Advanced Software Testing: Volume 1, you will notice that there is overlap in some chapters in the book, especially chapters 1, 2, 6, 7, and 10. (There is also some overlap in chapter 4, in the sections on black-box and experience-based testing.) This overlap is inherent in the structure of the ISTQB Advanced syllabus, where both learning objectives and content are common across the two analysis modules in some areas. We spent some time grappling with how to handle this commonality and decided to make this book completely free-standing. That meant that we had to include common material for those who have not read volume 1. If you have read volume 1, you may choose to skip chapters 1, 2, 6, 7, and 10, though people using this book to prepare for the Technical Test Analyst exam should read those chapters for review.

If you have also read Advanced Software Testing: Volume 2, which is for test managers, you'll find parallel chapters that address the material in detail but with different emphasis. For example, technical test analysts need to know quite a bit about incident management. Technical test analysts spend a lot of time creating incident reports, and you need to know how to do that well. Test managers also need to know a lot about incident management, but they focus on how to keep incidents moving through their reporting and resolution lifecycle and how to gather metrics from such reports.

What should a technical test analyst be able to do? Or, to ask the question another way, what should you have learned to do—or learned to do better—by the time you finish this book?

■ Structure the tasks defined in the test strategy in terms of technical requirements (including the coverage of technically related quality risks)
■ Analyze the internal structure of the system in sufficient detail to meet the expected quality level
■ Evaluate the system in terms of technical quality attributes such as performance, security, etc.
■ Prepare and execute adequate activities, and report on their progress
■ Conduct technical testing activities
■ Provide the necessary evidence to support evaluations
■ Implement the necessary tools and techniques to achieve the defined goals

In this book, we focus on these main concepts. We suggest that you keep these high-level objectives in mind as we proceed through the material in each of the following chapters.

In writing this book, we've kept foremost in our minds the question of how to make this material useful to you. If you are using this book to prepare for an ISTQB Advanced Level Technical Test Analyst exam, then we recommend that you read chapter 11 first, then read the other 10 chapters in order. If you are using this book to expand your overall understanding of testing to an advanced and highly technical level but do not intend to take the ISTQB Advanced Level Technical Test Analyst exam, then we recommend that you read chapters 1 through 10 only. If you are using this book as a reference, then feel free to read only those chapters that are of specific interest to you.

Each of the first 10 chapters is divided into sections. For the most part, we have followed the organization of the ISTQB Advanced syllabus to the point of section divisions, but subsections and sub-subsection divisions in the syllabus might not appear. You'll also notice that each section starts with a text box describing the learning objectives for this section. If you are curious about how to interpret those K2, K3, and K4 tags in front of each learning objective, and how learning objectives work within the ISTQB syllabus, read chapter 11.

Software testing is in many ways similar to playing the piano, cooking a meal, or driving a car. How so? In each case, you can read books about these activities, but until you have practiced, you know very little about how to do it. So we've included practical, real-world exercises for the key concepts. We encourage you to practice these concepts with the exercises in the book. Then, make sure you take these concepts and apply them on your projects. You can become an advanced testing professional only by applying advanced test techniques to actual software testing.

ISTQB Copyright

This book is based on the ISTQB Advanced Syllabus version 2007. It also references the ISTQB Foundation Syllabus version 2011. It uses terminology definitions from the ISTQB Glossary version 2.1. These three documents are copyrighted by the ISTQB and used by permission.
1 Test Basics

"Read the directions and directly you will be directed in the right direction."
A doorknob in Lewis Carroll's surreal fantasy, Alice in Wonderland.

The first chapter of the Advanced syllabus is concerned with contextual and background material that influences the remaining chapters. There are five sections:

1. Introduction
2. Testing in the Software Lifecycle
3. Specific Systems
4. Metrics and Measurement
5. Ethics

Let's look at each section and how it relates to technical test analysis.

1.1 Introduction

Learning objectives
Recall of content only

This chapter, as the name implies, introduces some basic aspects of software testing. These central testing themes have general relevance for testing professionals. There are four major areas:

■ Lifecycles and their effects on testing
■ Special types of systems and their effects on testing
■ Metrics and measures for testing and quality
■ Ethical issues
ISTQB Glossary

software lifecycle: The period of time that begins when a software product is conceived and ends when the software is no longer available for use. The software lifecycle typically includes a concept phase, requirements phase, design phase, implementation phase, test phase, installation and checkout phase, operation and maintenance phase, and sometimes, retirement phase. Note that these phases may overlap or be performed iteratively.

Many of these concepts are expanded upon in later chapters. This material expands on ideas introduced in the Foundation syllabus.

1.2 Testing in the Software Lifecycle

Learning objectives
Recall of content only

Chapter 2 in the Foundation syllabus discusses integrating testing into the software lifecycle. As with the Foundation syllabus, in the Advanced syllabus, you should understand that testing must be integrated into the software lifecycle to succeed. This is true whether the particular lifecycle chosen is sequential, incremental, iterative, or spiral.

Proper alignment between the testing process and other processes in the lifecycle is critical for success. This is especially true at key interfaces and handoffs between testing and lifecycle activities such as these:

■ Requirements engineering and management
■ Project management
■ Configuration and change management
■ Software development and maintenance
■ Technical support
■ Technical documentation
Let's look at two examples of alignment.

In a sequential lifecycle model, a key assumption is that the project team will define the requirements early in the project and then manage the (hopefully limited) changes to those requirements during the rest of the project. In such a situation, if the team follows a formal requirements process, an independent test team in charge of the system test level can follow an analytical requirements-based test strategy. Using a requirements-based strategy in a sequential model, the test team would start—early in the project—planning and designing tests following an analysis of the requirements specification to identify test conditions. This planning, analysis, and design work might identify defects in the requirements, making testing a preventive activity. Failure detection would start later in the lifecycle, once system test execution began.

However, suppose the project follows an incremental lifecycle model, adhering to one of the agile methodologies like Scrum. The test team won't receive a complete set of requirements early in the project, if ever. Instead, the test team will receive requirements at the beginning of each sprint, which typically lasts anywhere from two to four weeks. Rather than analyzing extensively documented requirements at the outset of the project, the test team can instead identify and prioritize key quality risk areas associated with the content of each sprint; i.e., they can follow an analytical risk-based test strategy (see the sketch below). Specific test designs and implementation will occur immediately before test execution, potentially reducing the preventive role of testing. Failure detection starts very early in the project, at the end of the first sprint, and continues in repetitive, short cycles throughout the project. In such a case, testing activities in the fundamental testing process overlap and are concurrent with each other as well as with major activities in the software lifecycle.

No matter what the lifecycle—and indeed, especially with the more fast-paced agile lifecycles—good change management and configuration management are critical for testing. A lack of proper change management results in an inability of the test team to keep up with what the system is and what it should do. As was discussed in the Foundation syllabus, a lack of proper configuration management may lead to loss of artifact changes, an inability to say what was tested at what point in time, and severe lack of clarity around the meaning of the test results.
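To make the risk-based prioritization mentioned above more concrete, here is a minimal sketch, not taken from the book's own examples, of the likelihood-times-impact scoring scheme that chapter 3 treats in depth. The risk items, class names, and scores are invented for illustration; a real team would derive them from a risk analysis session with stakeholders.

```python
# Illustrative sketch only: ranking quality risk items for an analytical
# risk-based test strategy. All names and scores here are hypothetical.

from dataclasses import dataclass

@dataclass
class QualityRisk:
    description: str
    likelihood: int   # 1 (very unlikely) .. 5 (very likely)
    impact: int       # 1 (negligible) .. 5 (critical)

    @property
    def risk_priority(self) -> int:
        # A simple risk priority number: higher means test earlier and deeper.
        return self.likelihood * self.impact

# Hypothetical risk items identified for one sprint's content.
sprint_risks = [
    QualityRisk("Payment gateway timeout handling", likelihood=4, impact=5),
    QualityRisk("Report layout on small screens", likelihood=3, impact=2),
    QualityRisk("Concurrent cart updates corrupt totals", likelihood=2, impact=5),
]

# Sequence test design and execution from highest to lowest risk.
for risk in sorted(sprint_risks, key=lambda r: r.risk_priority, reverse=True):
    print(f"{risk.risk_priority:>2}  {risk.description}")
```

The point is only the ordering: within each short sprint, test effort flows first to the items with the highest combined likelihood and impact, which is what lets failure detection start at the end of the first sprint rather than late in the project.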
ISTQB Glossary

system of systems: Multiple heterogeneous, distributed systems that are embedded in networks at multiple levels and in multiple interconnected domains, addressing large-scale interdisciplinary common problems and purposes, usually without a common management structure.

The Foundation syllabus cited four typical test levels:

■ Unit or component
■ Integration
■ System
■ Acceptance

The Foundation syllabus mentioned some reasons for variation in these levels, especially with integration and acceptance. Integration testing can mean component integration testing—integrating a set of components to form a system, testing the builds throughout that process. Or it can mean system integration testing—integrating a set of systems to form a system of systems, testing the system of systems as it emerges from the conglomeration of systems. As discussed in the Foundation syllabus, acceptance test variations include user acceptance tests and regulatory acceptance tests.

Along with these four levels and their variants, at the Advanced level you need to keep in mind additional test levels that you might need for your projects. These could include the following:

■ Hardware-software integration testing
■ Feature interaction testing
■ Customer product integration testing

You should expect to find most if not all of the following for each level:

■ Clearly defined test goals and scope
■ Traceability to the test basis (if available)
■ Entry and exit criteria, as appropriate both for the level and for the system lifecycle
■ Test deliverables, including results reporting
■ Test techniques that will be applied, as appropriate for the level, for the team, and for the risks inherent in the system
■ Measurements and metrics
■ Test tools, where applicable and as appropriate for the level
■ And, if applicable, compliance with organizational or other standards

When RBCS associates perform assessments of test teams, we often find organizations that use test levels but that perform them in isolation. Such isolation leads to inefficiencies and confusion. While these topics are discussed more in Advanced Software Testing: Volume 2, test analysts should keep in mind that using documents like test policies and frequent contact between test-related staff can coordinate the test levels to reduce gaps, overlap, and confusion about results.

Let's take a closer look at this concept of alignment. We'll use the V-model shown in figure 1-1 as an example. We'll further assume that we are talking about the system test level.

[Figure 1-1 is a V-model diagram. The left side of the V descends from Concept through stages including Capture Requirements, System Design, and Implement Component; a Develop Tests activity links each left-side stage to its corresponding test level on the right side of the V, which ascends through System Test, System Integration Test, and Acceptance Tests.]

Figure 1–1 V-model
In the V-model, with a well-aligned test process, test planning occurs concurrently with project planning. In other words, the moment that the testing team becomes involved is at the very start of the project. Once the test plan is approved, test control begins. Test control continues through to test closure. Analysis, design, implementation, execution, evaluation of exit criteria, and test results reporting are carried out according to the plan. Deviations from the plan are managed.

Test analysis starts immediately after or even concurrently with test planning. Test analysis and test design happen concurrently with requirements, high-level design, and low-level design. Test implementation, including test environment implementation, starts during system design and completes just before test execution begins.

Test execution begins when the test entry criteria are met. More realistically, test execution starts when most entry criteria are met and any outstanding entry criteria are waived (see the sketch at the end of this discussion). In V-model theory, the entry criteria would include successful completion of both component test and integration test levels. Test execution continues until the test exit criteria are met, though again some of these may often be waived. Evaluation of test exit criteria and reporting of test results occur throughout test execution. Test closure activities occur after test execution is declared complete.

This kind of precise alignment of test activities with each other and with the rest of the system lifecycle will not happen simply by accident. Nor can you expect to instill this alignment continuously throughout the process, without any forethought. Rather, for each test level, no matter what the selected software lifecycle and test process, the test manager must perform this alignment. Not only must this happen during test and project planning, but test control includes acting to ensure ongoing alignment.

No matter what test process and software lifecycle are chosen, each project has its own quirks. This is especially true for complex projects such as the systems of systems projects common in the military and among RBCS's larger clients. In such a case, the test manager must plan not only to align test processes, but also to modify them. Off-the-rack process models, whether for testing alone or for the entire software lifecycle, don't fit such complex projects well.
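To illustrate the entry-criteria-with-waivers idea from the discussion above, here is a minimal sketch. It is an assumption-laden illustration rather than anything prescribed by the syllabus; the criterion names and the waiver flag are invented for the example.

```python
# Illustrative sketch only: checking test-entry criteria with waivers.
# The criteria and waiver policy below are hypothetical, not from the book.

from dataclasses import dataclass

@dataclass
class EntryCriterion:
    name: str
    met: bool
    waived: bool = False  # an outstanding criterion can be explicitly waived

def ready_for_execution(criteria: list[EntryCriterion]) -> bool:
    """Test execution may start when every criterion is met or waived."""
    blocking = [c for c in criteria if not (c.met or c.waived)]
    for c in blocking:
        print(f"Blocked by unmet entry criterion: {c.name}")
    return not blocking

criteria = [
    EntryCriterion("Component test level completed", met=True),
    EntryCriterion("Integration test level completed", met=False, waived=True),
    EntryCriterion("Test environment configured and verified", met=True),
]

if ready_for_execution(criteria):
    print("Entry criteria satisfied (some possibly waived); execution may begin.")
```

In practice a waiver would be a documented management decision with a rationale, but even this toy check makes the "met or explicitly waived" rule visible and auditable.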
1.3 Specific Systems

Learning objectives
Recall of content only

In this section, we are going to talk about how testing affects—and is affected by—the need to test two particular types of systems. The first type is systems of systems. The second type is safety-critical systems.

Systems of systems are independent systems tied together to serve a common purpose. Since they are independent and tied together, they often lack a single, coherent user or operator interface, a unified data model, compatible external interfaces, and so forth.

Systems of systems projects include the following characteristics and risks:

■ The integration of commercial off-the-shelf (COTS) software along with some amount of custom development, often taking place over a long period.
■ Significant technical, lifecycle, and organizational complexity and heterogeneity. This organizational and lifecycle complexity can include issues of confidentiality, company secrets, and regulations.
■ Different development lifecycles and other processes among disparate teams, especially—as is frequently the case—when insourcing, outsourcing, and offshoring are involved.
■ Serious potential reliability issues due to intersystem coupling, where one inherently weaker system creates ripple-effect failures across the entire system of systems.
■ System integration testing, including interoperability testing, is essential. Well-defined interfaces for testing are needed.

At the risk of restating the obvious, systems of systems projects are more complex than single-system projects. The complexity increase applies organizationally, technically, processwise, and teamwise. Good project management, formal development lifecycles and processes, configuration management, and quality assurance become more important as size and complexity increase.

Let's focus on the lifecycle implications for a moment.
As mentioned earlier, with systems of systems projects, we are typically going to have multiple levels of integration. First, we will have component integration for each system, and then we'll have system integration as we build the system of systems.

We will also typically have multiple version management and version control systems and processes, unless all the systems happen to be built by the same (presumably large) organization and that organization follows the same approach throughout its software development teams. This kind of unified approach to such systems and processes is not something that we commonly see during assessments of large companies, by the way.

The duration of projects tends to be long. We have seen them planned for as long as five to seven years. A system of systems project with five or six systems might be considered relatively short and relatively small if it lasted "only" a year and involved "only" 40 or 50 people.

Across this project, there are multiple test levels, usually owned by different parties. Because of the size and complexity of the project, it's easy for handoffs and transfers of responsibility to break down. So, we need formal information transfer among project members (especially at milestones), transfers of responsibility within the team, and handoffs. (A handoff, for those of you unfamiliar with the term, is a situation in which some work product is delivered from one group to another and the receiving group must carry out some essential set of activities with that work product.)

Even when we're integrating purely off-the-shelf systems, these systems are evolving. That's all the more likely to be true with custom systems. So we have the management challenge of coordinating development of the individual systems and the test analyst challenge of proper regression testing at the system of systems level when things change.

Especially with off-the-shelf systems, maintenance testing can be triggered—sometimes without much warning—by external entities and events such as obsolescence, bankruptcy, or upgrade of an individual system.

If you think of the fundamental test process in a system of systems project, the progress of levels is not two-dimensional. Instead, imagine a sort of pyramidal structure, as shown in figure 1-2.
[Figure 1-2 shows a pyramid: for each individual system (System A, System B), component test, component integration test, and system test form the base layers; above them, spanning all systems, sit system integration test, systems test, and user acceptance test.]

Figure 1–2 Fundamental test process in a system of systems project

At the base, we have component testing. A separate component test level exists for each system.

Moving up the pyramid, you have component integration testing. A separate component integration test level exists for each system.

Next, we have system testing. A separate system test level exists for each system.

Note that, for each of these test levels, we have separate organizational ownership if the systems come from different vendors. We also probably have separate team ownership since multiple groups often handle component, integration, and system test.

Continuing to move up the pyramid, we come to system integration testing. Now, finally, we are talking about a single test level across all systems. Next above that is systems testing, focusing on end-to-end tests that span all the systems. Finally, we have user acceptance testing. For each of these test levels, while we have single organizational ownership, we probably have separate team ownership.

Let's move on to safety-critical systems. Simply put, safety-critical systems are those systems upon which lives depend. Failure of such a system—or even temporary performance or reliability degradation or undesirable side effects as support actions are carried out—can injure or kill people or, in the case of military systems, fail to injure or kill people at a critical juncture of a battle.
ISTQB Glossary

safety-critical system: A system whose failure or malfunction may result in death or serious injury to people, or loss or severe damage to equipment, or environmental harm.

Safety-critical systems, like systems of systems, have certain associated characteristics and risks:

■ Since defects can cause death, and deaths can cause civil and criminal penalties, proof of adequate testing can be and often is used to reduce liability.
■ For obvious reasons, various regulations and standards often apply to safety-critical systems. The regulations and standards can constrain the process, the organizational structure, and the product. Unlike the usual constraints on a project, though, these are constructed specifically to increase the level of quality rather than to enable trade-offs that enhance schedule, budget, or feature outcomes at the expense of quality. Overall, there is a focus on quality as a very important project priority.
■ There is typically a rigorous approach to both development and testing. Throughout the lifecycle, traceability extends all the way from regulatory requirements to test results. This provides a means of demonstrating compliance. It requires extensive, detailed documentation but provides high levels of auditability, even by non-test experts. Audits are common if regulations are imposed, and an outside party typically performs the audits. Demonstrating compliance can involve tracing from the regulatory requirement through development to the test results. Therefore, it pays to establish traceability up front and to maintain it throughout the lifecycle, from both a people and a process point of view.

During the lifecycle—often as early as design—the project team uses safety analysis techniques to identify potential problems. As with quality risk analysis, safety analysis will identify risk items that require testing. Single points of failure are often resolved through system redundancy, and the ability of that redundancy to alleviate the single point of failure must be tested.

In some cases, safety-critical systems are complex systems or even systems of systems.
ISTQB Glossary

metric: A measurement scale and the method used for measurement.

measurement scale: A scale that constrains the type of data analysis that can be performed on it.

measurement: The process of assigning a number or category to an entity to describe an attribute of that entity.

measure: The number or category assigned to an attribute of an entity by making a measurement.

In other cases, non-safety-critical components or systems are integrated into safety-critical systems or systems of systems. For example, networking or communication equipment is not inherently a safety-critical system, but if integrated into an emergency dispatch or military system, it becomes part of a safety-critical system.

Formal quality risk management is essential in these situations. Fortunately, a number of such techniques exist, such as failure mode and effect analysis; failure mode, effect, and criticality analysis; hazard analysis; and software common cause failure analysis. We'll look at a less formal approach to quality risk analysis and management in chapter 3.

1.4 Metrics and Measurement

Learning objectives
Recall of content only

Throughout this book, we use metrics and measurement to establish expectations and guide testing by those expectations. You can and should apply metrics and measurements throughout the software development lifecycle because well-established metrics and measures, aligned with project goals and objectives, will enable technical test analysts to track and report test and quality results to management in a consistent and coherent way.

A lack of metrics and measurements leads to purely subjective assessments of quality and testing. This results in disputes over the meaning of test results toward the end of the lifecycle. It also results in a lack of clearly perceived and communicated value, effectiveness, and efficiency for testing.
Not only must we have metrics and measurements, we also need goals. What is a "good" result for a given metric? An acceptable result? An unacceptable result? Without defined goals, successful testing is usually impossible. In fact, when we perform assessments for our clients, we more often than not find ill-defined metrics of test team effectiveness and efficiency, with no goals and thus bad and unrealistic expectations (which of course aren't met). We can establish realistic goals for any given metric by establishing a baseline measure for that metric to check current capability, comparing that baseline against industry averages, and, if appropriate, setting realistic targets for improvement to meet or exceed the industry average.

There's just about no end to what can be subjected to a metric and tracked through measurement. Consider the following:

■ Planned schedule and coverage
■ Requirements and their schedule, resource, and task implications for testing
■ Workload and resource usage
■ Milestones and scope of testing
■ Planned and actual costs
■ Risks, both quality and project risks
■ Defects, including total found, total fixed, current backlog, average closure periods, and configuration, subsystem, priority, or severity distribution

During test planning, we establish expectations in the form of goals for the various metrics. As part of test control, we can measure actual outcomes and trends against these goals. As part of test reporting, we can consistently explain to management various important aspects of the process, product, and project, using objective, agreed-upon metrics with realistic, achievable goals.

When thinking about a testing metrics and measurement program, there are three main areas to consider: definition, tracking, and reporting.

Let's start with definition. In a successful testing metrics program, you define a useful, pertinent, and concise set of quality and test metrics for a project. You avoid too large a set of metrics because this will prove difficult and perhaps expensive to measure while often confusing rather than enlightening the viewers and stakeholders.
You also want to ensure uniform, agreed-upon interpretations of these metrics to minimize disputes and divergent opinions about the meaning of certain measures of outcomes, analyses, and trends. There's no point in having a metrics program if everyone has an utterly divergent opinion about what particular measures mean.

Finally, define metrics in terms of objectives and goals for a process or task, for components or systems, and for individuals or teams.

Victor Basili's well-known Goal Question Metric technique is one way to evolve meaningful metrics. (We prefer to use the word objective where Basili uses goal.) Using this technique, we proceed from the objectives of the effort—in this case, testing—to the questions we'd have to answer to know if we were achieving those objectives and, ultimately, to the specific metrics.

For example, one typical objective of testing is to build confidence. One natural question that arises in this regard is, How much of the system has been tested? Metrics for coverage include percentage of requirements covered by tests, percentage of branches and statements covered by tests, percentage of interfaces covered by tests, percentage of risks covered by tests, and so forth.

Let's move on to tracking. Since tracking is a recurring activity in a metrics program, the use of automated tool support can reduce the time required to capture, track, analyze, report, and measure the metrics.

Be sure to apply objective and subjective analyses for specific metrics over time, especially when trends emerge that could allow for multiple interpretations of meaning. Try to avoid jumping to conclusions or delivering metrics that encourage others to do so.

Be aware of and manage the tendency for people's interests to affect the interpretation they place on a particular metric or measure. Everyone likes to think they are objective—and, of course, right as well as fair!—but usually people's interests affect their conclusions.

Finally, let's look at reporting. Most importantly, reporting of metrics and measures should enlighten management and other stakeholders, not confuse or misdirect them. In part, this is achieved through smart definition of metrics and careful tracking, but it is possible to take perfectly clear and meaningful metrics and confuse people with them through bad presentation.
Edward Tufte's series of books, starting with The Visual Display of Quantitative Information, is a treasure trove of ideas about how to develop good charts and graphs for reporting purposes.1

Good testing reports based on metrics should be easily understood, not overly complex and certainly not ambiguous. The reports should draw the viewer's attention toward what matters most, not toward trivialities. In that way, good testing reports based on metrics and measures will help management guide the project to success.

Not all types of graphical displays of metrics are equal—or equally useful. A snapshot of data at a moment in time, as shown in a table, might be the right way to present some information, such as the coverage planned and achieved against certain critical quality risk areas. A graph of a trend over time might be a useful way to present other information, such as the total number of defects reported and the total number of defects resolved since the start of testing. An analysis of causes or relationships might be a useful way to present still other information, such as a scatter plot showing the correlation (or lack thereof) between years of tester experience and percentage of bug reports rejected.

1.5 Ethics

Learning objectives
Recall of content only

ISTQB Glossary

ethics: No definition provided in the ISTQB Glossary.

Many professions have ethical standards. In the context of professionalism, ethics are "rules of conduct recognized in respect to a particular class of human actions or a particular group, culture, etc."2

1. The three books of Tufte's that Rex has read and can strongly recommend on this topic are The Visual Display of Quantitative Information, Visual Explanations, and Envisioning Information (all published by Graphics Press, Cheshire, CT).
2. Definition from dictionary.com.
Since, as a technical test analyst, you'll often have access to confidential and privileged information, ethical guidelines can help you to use that information appropriately. In addition, you should use ethical guidelines to choose the best possible behaviors and outcomes for a given situation, given your constraints. The phrase "best possible" means for everyone, not just you.

Here is an example of ethics in action. One of the authors, Rex Black, is president of three related international software testing consultancies: RBCS, RBCS AU/NZ, and Software TestWorx. He also serves on the ISTQB and ASTQB boards of directors. As such, he might have—and does have—insight into the direction of the ISTQB program that RBCS' competitors in the software testing consultancy business don't have.

In some cases, such as helping to develop syllabi, Rex has to make those business interests clear to people, but he is allowed to participate. Rex helped write both the Foundation and Advanced syllabi.

In other cases, such as developing exam questions, Rex agreed, along with his colleagues on the ASTQB, that he should not participate. Direct access to the exam questions would make it all too likely that, consciously or unconsciously, RBCS would warp its training materials to "teach the exam."

As you advance in your career as a tester, more and more opportunities to show your ethical nature—or to be betrayed by a lack of it—will come your way. It's never too early to inculcate a strong sense of ethics.

The ISTQB Advanced syllabus makes it clear that the ISTQB expects certificate holders to adhere to the following code of ethics.

PUBLIC – Certified software testers shall act consistently with the public interest. For example, if you are working on a safety-critical system and are asked to quietly cancel some defect reports, it's an ethical problem if you do so.

CLIENT AND EMPLOYER – Certified software testers shall act in a manner that is in the best interests of their client and employer and consistent with the public interest. For example, if you know that your employer's major project is in trouble and you short-sell the stock and then leak information about the project problems to the Internet, that's a real ethical lapse—and probably a criminal one too.
PRODUCT – Certified software testers shall ensure that the deliverables they provide (on the products and systems they test) meet the highest professional standards possible. For example, if you are working as a consultant and you leave out important details from a test plan so that the client has to hire you on the next project, that's an ethical lapse.

JUDGMENT – Certified software testers shall maintain integrity and independence in their professional judgment. For example, if a project manager asks you not to report defects in certain areas due to potential business sponsor reactions, that's a blow to your independence and an ethical failure on your part if you comply.

MANAGEMENT – Certified software test managers and leaders shall subscribe to and promote an ethical approach to the management of software testing. For example, favoring one tester over another because you would like to establish a romantic relationship with the favored tester's sister is a serious lapse of managerial ethics.

PROFESSION – Certified software testers shall advance the integrity and reputation of the profession consistent with the public interest. For example, if you have a chance to explain to your child's classmates or your spouse's colleagues what you do, be proud of it and explain the ways software testing benefits society.

COLLEAGUES – Certified software testers shall be fair to and supportive of their colleagues and promote cooperation with software developers. For example, it is unethical to manipulate test results to arrange the firing of a programmer whom you detest.

SELF – Certified software testers shall participate in lifelong learning regarding the practice of their profession and shall promote an ethical approach to the practice of the profession. For example, attending courses, reading books, and speaking at conferences about what you do help to advance yourself—and the profession. This is called doing well while doing good, and fortunately, it is very ethical!

1.6 Sample Exam Questions

To end each chapter, you can try one or more sample exam questions to reinforce your knowledge and understanding of the material and to prepare for the ISTQB Advanced Level Technical Test Analyst exam.
1 You are working as a test analyst at a bank. At the bank, technical test analysts work closely with users during user acceptance testing. The bank has bought two financial applications as commercial off-the-shelf (COTS) software from large software vendors. Previous history with these vendors has shown that they deliver quality applications that work on their own, but this is the first time the bank will attempt to integrate applications from these two vendors. Which of the following test levels would you expect to be involved in? [Note: There might be more than one right answer.]

A Component test
B Component integration test
C System integration test
D Acceptance test

2 Which of the following is necessarily true of safety-critical systems?

A They are composed of multiple COTS applications.
B They are complex systems of systems.
C They are systems upon which lives depend.
D They are military or intelligence systems.
2 Testing Processes

Do not enter. If the fall does not kill you, the crocodile will.

A sign blocking the entrance to a parapet above a pool in the Sydney Wildlife Centre, Australia, guiding people in a safe viewing process for one of many dangerous-fauna exhibits.

The second chapter of the Advanced syllabus is concerned with the process of testing and the activities that occur within that process. It establishes a framework for all the subsequent material in the syllabus and allows you to visualize organizing principles for the rest of the concepts. There are seven sections:

1. Introduction
2. Test Process Models
3. Test Planning and Control
4. Test Analysis and Design
5. Test Implementation and Execution
6. Evaluating Exit Criteria and Reporting
7. Test Closure Activities

Let's look at each section and how it relates to technical test analysis.

2.1 Introduction

Learning objectives
Recall of content only
The ISTQB Foundation syllabus describes the ISTQB fundamental test process. It provides a generic, customizable test process, shown in figure 2-1. That process consists of the following activities:

■ Planning and control
■ Analysis and design
■ Implementation and execution
■ Evaluating exit criteria and reporting
■ Test closure activities

For technical test analysts, we can focus on the middle three activities in the bullet list above.

[Figure 2-1 charts these activities against a project timeline; the visible labels include execution, evaluating exit criteria, reporting test results, and control.]

Figure 2–1 ISTQB fundamental test process

2.2 Test Process Models

Learning objectives
Recall of content only

The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 2 of the Advanced syllabus for general recall and familiarity only.
ISTQB Glossary

test planning: The activity of establishing or updating a test plan.

test plan: A document describing the scope, approach, resources, and schedule of intended test activities. It identifies, among others: test items, the features to be tested, the testing tasks, who will do each task, the degree of tester independence, the test environment, the test design techniques, the entry and exit criteria to be used and the rationale for their choice, and any risks requiring contingency planning. It is a record of the test planning process.

2.3 Test Planning and Control

Learning objectives
Recall of content only

The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 2 of the Advanced syllabus for general recall and familiarity only.

2.4 Test Analysis and Design

Learning objectives
(K2) Explain the stages in an application's lifecycle where non-functional tests and architecture-based tests may be applied. Explain the causes of non-functional testing taking place only in specific stages of an application's lifecycle.
(K2) Give examples of the criteria that influence the structure and level of test condition development.
(K2) Describe how test analysis and design are static testing techniques that can be used to discover defects.
(K2) Explain by giving examples the concept of test oracles and how a test oracle can be used in test specifications.
ISTQB Glossary

test case: A set of input values, execution preconditions, expected results, and execution postconditions developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement.

test condition: An item or event of a component or system that could be verified by one or more test cases, e.g., a function, transaction, feature, quality attribute, or structural element.

During the test planning activities in the test process, test leads and test managers work with project stakeholders to identify test objectives. In the IEEE 829 test plan template—which was introduced at the Foundation Level and which we'll review later in this book—the lead or manager can document these in the section "Features to be Tested."

The test objectives are a major deliverable for technical test analysts because without them, we wouldn't know what to test. During test analysis and design activities, we use these test objectives as our guide to carry out two main subactivities:

■ Identify and refine the test conditions for each test objective
■ Create test cases that exercise the identified test conditions

However, test objectives are not enough by themselves. We not only need to know what to test, but in what order and how much. Because of time constraints, the desire to test the most important areas first, and the need to expend our test effort in the most effective and efficient manner possible, we need to prioritize the test conditions.

When following a risk-based testing strategy—which we'll discuss in detail in chapter 3—the test conditions are quality risk items identified during quality risk analysis. The assignment of priority for each test condition usually involves determining the likelihood and impact associated with each quality risk item; i.e., we assess the level of risk for each risk item. The priority determines the allocation of test effort (throughout the test process) and the order of design, implementation, and execution of the related tests.
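To make that concrete, here is a minimal sketch of how risk levels could be computed and used to order test conditions. The risk items, the 1-to-5 scales, and the likelihood-times-impact convention are our own illustrative assumptions, not something the syllabus mandates:

# Illustrative sketch only: risk items and 1-5 scales are invented.
# Many teams compute risk level as likelihood multiplied by impact;
# other conventions (e.g., addition) are equally common.

risk_items = [
    # (quality risk item / test condition, likelihood 1-5, impact 1-5)
    ("Slow response time during login", 4, 5),
    ("Slow response time during queries", 3, 4),
    ("Slow response time during a transfer transaction", 2, 5),
]

def risk_level(likelihood, impact):
    return likelihood * impact

# Higher risk level means earlier and more thorough testing.
for name, likelihood, impact in sorted(
        risk_items, key=lambda r: risk_level(r[1], r[2]), reverse=True):
    print(f"{risk_level(likelihood, impact):>2}  {name}")

The point is not the arithmetic but the discipline: once each test condition carries an agreed risk level, the order of design, implementation, and execution follows from it rather than from guesswork.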
ISTQB Glossary

exit criteria: The set of generic and specific conditions, agreed upon with the stakeholders, for permitting a process to be officially completed. The purpose of exit criteria is to prevent a task from being considered complete when there are still outstanding parts of the task which have not been finished. Exit criteria are used to report against and to plan when to stop testing.

Throughout the process, the specific test conditions and their associated priorities can change as the needs—and our understanding of the needs—of the project and project stakeholders evolve.

This prioritization, use of prioritization, and reprioritization occurs regularly in the test process. It starts during risk analysis and test planning, of course. It continues throughout the process, from analysis and design to implementation and execution. It influences evaluation of exit criteria and reporting of test results.

2.4.1 Non-functional Test Objectives

Before we get deeper into this process, let's look at an example of non-functional test objectives.

First, it's important to remember that non-functional test objectives can apply to any test level and exist throughout the lifecycle. Too often major non-functional test objectives are not addressed until the very end of the project, resulting in much wailing and gnashing of teeth when show-stopping failures are found.

Consider a video game as an example. For a video game, the ability to interact with the screen in real time, with no perceptible delays, is a key non-functional test objective. Every subsystem of the game must interact and perform efficiently to achieve this goal.

To be smart about this testing, execution efficiency to enable timely processing should be tested at the unit, integration, system, and acceptance levels. Finding a serious bottleneck during system test would affect the schedule, and that's not good for a consumer product like a game—or any other kind of software or system, for that matter.
ISTQB Glossary

test execution: The process of running a test on the component or system under test, producing actual result(s).

Furthermore, why wait until test execution starts at any level, early or late? Instead, start with reviews of requirements specifications, design specifications, and code to assess this function as well.

Many times, non-functional quality characteristics can be quantified; in this case, we might have actual performance requirements that we can test throughout the various test levels. For example, suppose that key events must be processed within 3 milliseconds of input to a specific component to meet performance standards; we can test whether the component actually meets that measure. In other cases, the requirements might be implicit rather than explicit: the system must be "fast enough."

Some types of non-functional testing should clearly be performed as early as possible. As an example, for many projects we have worked on, performance testing was done only late in system testing. The thought was that it could not be done earlier because the functional testing had not yet been done, so end-to-end testing would not be possible. Then, when serious bottlenecks resulting in extremely slow performance were discovered in the system, the release schedule was severely impacted.

Other projects targeted performance testing as critical. At the unit and component testing level, measurements were made of the time required for processing through the objects. At integration testing, subsystems were benchmarked to make sure they could perform in an optimal way. As system testing started, performance testing was a crucial piece of the planning and started as soon as some functionality was available, even though the system was not yet feature complete.

All of this takes time and resources, planning and effort. Some performance tools are not really useful too early in the process, but measurements can still be taken using simpler tools.
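As a sketch of what such an early, component-level check might look like, consider the following. The 3-millisecond budget comes from the example above, while the process_event function and the single-shot measurement are our own illustrative assumptions:

import time

def process_event(event):
    # Hypothetical stand-in for the real component under test.
    return event.upper()

def test_event_processing_meets_budget():
    # Illustrative requirement: key events processed within 3 ms of input.
    budget_seconds = 0.003
    start = time.perf_counter()
    process_event("player-input")
    elapsed = time.perf_counter() - start
    assert elapsed <= budget_seconds, (
        f"Processing took {elapsed * 1000:.3f} ms; budget is 3 ms")

In practice you would repeat the measurement many times and look at a distribution rather than a single run, but even a crude check like this, run at the unit or component level, can expose a gross bottleneck months before system test would.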
Other non-functional testing may not make sense until late in the Software Development Life Cycle (SDLC). While error and exception handling can be tested in unit and integration testing, full-blown recoverability testing really makes sense only in the system, acceptance, and system integration phases, when the response of the entire system can be measured.

2.4.2 Identifying and Documenting Test Conditions

To identify test conditions, we can perform analysis of the test basis, the test objectives, the quality risks, and so forth, using any and all information inputs and sources we have available. For analytical risk-based testing strategies, we'll cover exactly how this works in chapter 3.

If you're not using analytical risk-based testing, then you'll need to select the specific inputs and techniques according to the test strategy or strategies you are following. Those strategies, inputs, and techniques should align with the test plan or plans, of course, as well as with any broader test policies or test handbooks.

Now, in this book, we're concerned primarily with the technical test analyst role. So we address both functional tests (especially from a technical perspective) and non-functional tests. The analysis activities can and should identify functional and non-functional test conditions. We should consider the level and structure of the test conditions for use in addressing functional and non-functional characteristics of the test items.

There are two important choices when identifying and documenting test conditions:

■ The structure of the documentation for the test conditions
■ The level of detail we need to describe the test conditions in the documentation

There are many common ways to determine the level of detail and structure of the test conditions.

One is to work in parallel with the test basis documents. For example, if you have a marketing requirements document and a system requirements document in your organization, the former is usually high level and the latter is low level. You can use the marketing requirements document to generate the high-level test conditions and then use the system requirements document to elaborate one or more low-level test conditions underneath each high-level test condition.
Another approach is often used with quality risk analysis (sometimes called product risk analysis). In this approach, we can outline the key features and quality characteristics at a high level. We can then identify one or more detailed quality risk items for each feature or characteristic. These quality risk items are thus the test conditions.

Another approach, if you have only detailed requirements, is to go directly to the low-level requirements. In this case, traceability from the detailed test conditions to the requirements (which impose the structure) is needed for management reporting and to document what the test is to establish.

Yet another approach is to identify high-level test conditions only, sometimes without any formal test bases. For example, in exploratory testing some advocate the documentation of test conditions in the form of test charters. At that point, there is little to no additional detail created for the unscripted or barely scripted tests.

Again, it's important to remember that the chosen level of detail and the structure must align with the test strategy or strategies, and those strategies should align with the test plan or plans, of course, as well as with any broader test policies or test handbooks.

Also, remember that it's easy to capture traceability information while you're deriving test conditions from test basis documents like requirements, designs, use cases, user manuals, and so forth. It's much harder to re-create that information later by inspection of test cases.

Let's look at an example of applying a risk-based testing strategy to this step of identifying test conditions. Suppose you are working on an online banking system project. During a risk analysis session, system response time, a key aspect of system performance, is identified as a high-risk area for the online banking system. Several different failures are possible, each with its own likelihood and impact. So, discussions with the stakeholders lead us to elaborate the system performance risk area, identifying three specific quality risk items:

■ Slow response time during login
■ Slow response time during queries
■ Slow response time during a transfer transaction
ISTQB Glossary

test design: (1) See test design specification. (2) The process of transforming general testing objectives into tangible test conditions and test cases.

test design specification: A document specifying the test conditions (coverage items) for a test item and the detailed test approach and identifying the associated high-level test cases.

high-level test case: A test case without concrete (implementation-level) values for input data and expected results. Logical operators are used; instances of the actual values are not yet defined and/or available.

low-level test case: A test case with concrete (implementation-level) values for input data and expected results. Logical operators from high-level test cases are replaced by actual values that correspond to the objectives of the logical operators.

At this point, the level of detail is specific enough that the risk analysis team can assign specific likelihood and impact ratings for each risk item.

Now that we have test conditions, the next step is usually to elaborate those into test cases. We say "usually" because some test strategies, like the reactive ones discussed in the Foundation syllabus and in this book in chapter 4, don't always use written test cases. For the moment, let's assume that we want to specify test cases that are repeatable, verifiable, and traceable back to requirements, quality risks, or whatever else our tests are based on.

If we are going to create test cases, then, for a given test condition—or two or more related test conditions—we can apply various test design techniques to create test cases. These techniques are covered in chapter 4. Keep in mind that you can and should blend techniques in a single test case.

We mentioned traceability to the requirements, quality risks, and other test bases. Some of those other test bases for technical test analysts can include designs (high or low level), architectural documents, class diagrams, object hierarchies, and even the code itself. We can capture traceability directly, by relating the test case to the test basis element or elements that gave rise to the test conditions from which we created the test case. Alternatively, we can relate the test case to the test conditions, which are in turn related to the test basis elements.
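As an illustration of capturing that traceability as you go, a couple of simple mappings are often enough. The identifiers and data layout here are invented for the sketch, not taken from any particular tool or standard:

# Illustrative sketch: direct traceability from test cases to test
# conditions to test basis elements. All identifiers are invented.

test_basis = {
    "RISK-PERF-01": "Slow response time during login",
    "RISK-PERF-02": "Slow response time during queries",
}

test_conditions = {
    # condition id -> (description, test basis element ids)
    "COND-07": ("Login responds within the agreed time", ["RISK-PERF-01"]),
    "COND-08": ("Account query responds within the agreed time", ["RISK-PERF-02"]),
}

test_cases = {
    # test case id -> test condition ids it exercises
    "TC-101": ["COND-07"],
    "TC-102": ["COND-07", "COND-08"],
}

def basis_for_case(case_id):
    # Walk test case -> conditions -> basis elements.
    basis_ids = set()
    for cond_id in test_cases[case_id]:
        basis_ids.update(test_conditions[cond_id][1])
    return sorted(basis_ids)

print(basis_for_case("TC-102"))  # ['RISK-PERF-01', 'RISK-PERF-02']

Captured this way at derivation time, the mapping costs almost nothing; reconstructing it later by inspecting finished test cases is the expensive path the text warns about.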
ISTQB Glossary

test implementation: The process of developing and prioritizing test procedures, creating test data, and, optionally, preparing test harnesses and writing automated test scripts.

As with test conditions, we'll need to select a level of detail and structure for our test cases. It's important to remember that the chosen level of detail and the structure must align with the test strategy or strategies. Those strategies should align with the test plan or plans, of course, as well as with any broader test policies or test handbooks.

So, can we say anything else about the test design process? Well, the specific process of test design depends on the technique. However, it typically involves defining the following:

■ Preconditions
■ Test environment requirements
■ Test inputs and other test data requirements
■ Expected results
■ Postconditions

Defining the expected result of a test can be tricky, especially as expected results are not only screen outputs, but also data and environmental postconditions. Solving this problem requires that we have what's called a test oracle, which we'll look at in a moment.

First, though, notice the mention of test environment requirements in the preceding bullet list. This is an area of fuzziness in the ISTQB fundamental test process. Where is the line between test design and test implementation, exactly? The Advanced syllabus says, "[D]uring test design the required detailed test infrastructure requirements may be defined, although in practice these may not be finalized until test implementation." Okay, but maybe we're doing some implementation as part of the design? Can't the two overlap? To us, trying to draw sharp distinctions results in many questions along the lines of, How many angels can dance on the head of a pin?
Whatever we call defining test environments and infrastructures—design, implementation, environment setup, or some other name—it is vital to remember that testing involves more than just the test objects and the testware. There is a test environment, and this isn't just hardware. It includes rooms, equipment, personnel, software, tools, peripherals, communications equipment, user authorizations, and all other items required to run the tests.

2.4.3 Test Oracles

Okay, let's look at what test oracles are and what oracle-related problems the technical test analyst faces.

A test oracle is a source we use to determine the expected results of a test. We can compare these expected results with the actual results when we run a test. Sometimes the oracle is the existing system. Sometimes it's a user manual. Sometimes it's an individual's specialized knowledge. Rex usually says that we should never use the code itself as an oracle, even for structural testing, because that's simply testing that the compiler, operating system, and hardware work. Jamie feels that the code can serve as a useful partial oracle, saying it doesn't hurt to consider it, though he agrees with Rex that it should not serve as the sole oracle.

So, what is the oracle problem? Well, if you haven't experienced this firsthand, ask yourself, in general, how do we know what the "correct results" are for a test? The difficulty of determining the correct result is the oracle problem.

If you've just entered the workforce from the ivory towers of academia, you might have learned about perfect software engineering projects. You may have heard stories about detailed, clear, and consistent test bases like requirements and design specifications that define all expected results. Those stories were myths.

In the real world, on real projects, test basis documents like requirements are vague. Two documents, such as a marketing requirements document and a system requirements document, will often contradict each other. These documents may have gaps, omitting any discussion of important characteristics of the product—especially non-functional characteristics, and especially usability and user interface characteristics.

Sometimes these documents are missing entirely. Sometimes they exist but are so superficial as to be useless.
One of our clients showed Rex a handwritten scrawl on a letter-size piece of paper, complete with crude illustrations, which was all the test team had received by way of requirements on a project that involved 100 or so person-months of effort. We have both worked on projects where even that would be an improvement over what we actually received!

When test basis documents are delivered, they are often delivered late, often too late to wait for them to be done before we begin test design (at least if we want to finish test design before we start test execution). Even with the best intentions on the part of business analysts, sales and marketing staff, and users, test basis documents won't be perfect. Real-world applications are complex and not entirely amenable to complete, unambiguous specification.

So we have to augment the written test basis documents we receive with tester expertise or access to expertise, along with judgment and professional pessimism. Using all available oracles—written and mental, provided and derived—the tester can define expected results before and during test execution.

Since we've been talking a lot about requirements, you might assume that the oracle problem applies only to high-level test levels like system test and acceptance test. Nope. The oracle problem—and its solutions—apply to all test levels. The test bases will vary from one level to another, though. Higher test levels like user acceptance test and system test rely more on requirements specifications, use cases, and defined business processes. Lower test levels like component test and integration test rely more on low-level design specifications.

While this is a hassle, remember that you must solve the oracle problem in your testing. If you run tests with no way to evaluate the results, you are wasting your time. You will provide low, zero, or negative value to the team. Such testing generates false positives and false negatives. It distracts the team with spurious results of all kinds. It creates false confidence in the system.

By the way, as for our sarcastic aside about the "ivory tower of academia" a moment ago, let us mention that, when Rex studied computer science at UCLA quite a few years ago, one of his software engineering professors told him about this problem right from the start. One of Jamie's professors at Lehigh said that complete requirements were more mythological than unicorns. Neither of us could say we weren't warned!

Let's look at an example of a test oracle, from the real world.
Rex and his associates worked on a project to develop a banking application to replace a legacy system. There were two test oracles. One was the requirements specification, such as it was. The other was the legacy system. They faced two challenges.

For one thing, the requirements were vague. The original concept of the project, from the vendor's side, was "Give the customer whatever the customer wants," which they then realized was a good way to go bankrupt given the indecisive and conflicting ideas about what the system should do among the customer's users. The requirements were the outcome of a belated effort to put more structure around the project.

For another thing, sometimes the new system differed from the legacy system in minor ways. In one infamous situation, there was a single bug report that they opened, then deferred, then reopened, then deferred again, at least four or five times. It described situations where the monthly payment varied by $0.01.

The absence of any reliable, authoritative, consistent set of oracles led to a lot of "bug report ping-pong." They also had bug report prioritization issues as people argued over whether some problems were problems at all. They had high rates of false positives and negatives. The entire team—including the test team—was frustrated. So, you can see that the oracle problem is not some abstract concept; it has real-world consequences.

2.4.4 Standards

At this point, let's review some standards from the Foundation that will be useful in test analysis and design. First, let's look at two documentation templates you can use to capture information as you analyze and design your tests, assuming you intend to document what you are doing, which is usually true. The first is the IEEE 829 test design specification.

Remember from the Foundation course that a test condition is an item or event of a component or system that could be verified by one or more test cases, e.g., a function, transaction, feature, quality attribute, identified risk, or structural element. The IEEE 829 test design specification describes a condition, feature, or small set of interrelated features to be tested and the set of tests that cover them at a very high or logical level.
The number of tests required should be commensurate with the risks we are trying to mitigate (as reflected in the pass/fail criteria). The design specification template includes the following sections:

■ Test design specification identifier (following whatever standard your company uses for document identification)
■ Features to be tested (in this test suite)
■ Approach refinements (specific techniques, tools, etc.)
■ Test identification (tracing to test cases in suites)
■ Feature pass/fail criteria (e.g., how we intend to determine whether a feature works, such as via a test oracle, a test basis document, or a legacy system)

The collection of test cases outlined in the test design specification is often called a test suite. The sequencing of test suites and cases within suites is often driven by risk and business priority. Of course, project constraints, resources, and progress must affect the sequencing of test suites.

Next comes the IEEE 829 test case specification. A test case specification describes the details of a test case. This template includes the following sections:

■ Test case specification identifier
■ Test items (what is to be delivered and tested)
■ Input specifications (user inputs, files, etc.)
■ Output specifications (expected results, including screens, files, timing, behaviors of various sorts, etc.)
■ Environmental needs (hardware, software, people, props, and so forth)
■ Special procedural requirements (operator intervention, permissions, etc.)
■ Intercase dependencies (if needed to set up preconditions)

While this template defines a standard for contents, many other attributes of a test case are left as open questions. In practice, test cases vary significantly in effort, duration, and number of test conditions covered.
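To make the template concrete, here is a minimal sketch of a test case specification filled in along these lines. The online banking content and the identifier scheme are invented for illustration, and a real specification would usually be a document rather than a data structure:

# Illustrative, IEEE 829-style test case specification expressed as a
# simple data structure. All content is invented for the sketch.

test_case_spec = {
    "test_case_specification_identifier": "OLB-SYS-TC-101",
    "test_items": ["Online banking login service, build 2.3"],
    "input_specifications": {
        "username": "valid registered user",
        "password": "matching password for that user",
    },
    "output_specifications": {
        "expected_result": "account summary screen is displayed",
        "timing": "response within the agreed performance target",
    },
    "environmental_needs": ["system test environment", "seeded account data"],
    "special_procedural_requirements": "none",
    "intercase_dependencies": ["OLB-SYS-TC-100 sets up the user account"],
}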
We'll return to the IEEE 829 standard again in the next section. However, let us also review another related topic from the Foundation syllabus: the matter of documentation.

In the real world, the extent of test documentation varies considerably. It would be hard to list all the different reasons for this variance, but they include the following:

■ Risks to the project created by documenting or not documenting.
■ How much value, if any, the test documentation creates—and is meant to create.
■ Any standards that are or should be followed, including the possibility of an audit to ensure compliance with those standards.
■ The software development lifecycle model used. Advocates of agile approaches try to minimize documentation by ensuring close and frequent team communication.
■ The extent to which we must provide traceability from the test basis to the test cases.

The key idea here is to remember to keep an open mind and a clear head when deciding how much to document.

Now, since we focus on both functional and non-functional characteristics as part of this technical test analyst volume, let's review the ISO 9126 standard.

The ISO 9126 quality standard for software defines six software quality characteristics: functionality, reliability, usability, efficiency, maintainability, and portability. Each characteristic has three or more subcharacteristics, as shown in figure 2-2.

Tests that address functionality and its subcharacteristics are functional tests. These were the main topics in the first volume of this series, for test analysts. We will revisit them here, but primarily from a technical perspective. Tests that address the other five characteristics and their subcharacteristics are non-functional tests. These are among the main topics for this book. Finally, keep in mind that, when you are testing hardware/software systems, additional quality characteristics can and will apply.
[Figure 2-2 lists the ISO 9126 characteristics and their subcharacteristics:]

■ Functionality (addressed by functional tests): suitability, accuracy, interoperability, security, compliance
■ Reliability: maturity (robustness), fault tolerance, recoverability, compliance
■ Usability: understandability, learnability, operability, attractiveness, compliance
■ Efficiency: time behavior, resource utilization, compliance
■ Maintainability: analyzability, changeability, stability, testability, compliance
■ Portability: adaptability, installability, coexistence, replaceability, compliance

Reliability through portability are addressed by non-functional tests.

Figure 2–2 ISO 9126 quality standard

2.4.5 Static Tests

Now, let's review three important ideas from the Foundation syllabus. One is the value of static testing early in the lifecycle to catch defects when they are cheap and easy to fix. The next is the preventive role testing can play when involved early in the lifecycle. The last is that testing should be involved early in the project.

These three ideas are related because technical test analysis and design is a form of static testing; it is synergistic with other forms of static testing, and we can exploit that synergy only if we are involved at the right time.

Notice that, depending on when the analysis and design work is done, you could possibly define test conditions and test cases in parallel with reviews and static analyses of the test basis. In fact, you could prepare for a requirements review meeting by doing test analysis and design on the requirements. Test analysis and design can serve as a structured, failure-focused static test of a requirements specification that generates useful inputs to a requirements review meeting.
Of course, we should also take advantage of the ideas of static testing, and early involvement if we can, to have test and non-test stakeholders participate in reviews of various test work products, including risk analyses, test designs, test cases, and test plans. We should also use appropriate static analysis techniques on these work products.

Let's look at an example of how test analysis can serve as a static test. Suppose you are following an analytical risk-based testing strategy. If so, then in addition to quality risk items—which are the test conditions—a typical quality risk analysis session can provide other useful deliverables.

We refer to these additional useful deliverables as by-products, along the lines of industrial by-products, in that they are generated along the way as you create the target work product, which in this case is a quality risk analysis document. These by-products are generated when you and the other participants in the quality risk analysis process notice aspects of the project you haven't considered before.

These by-products include the following:

■ Project risks—things that could happen and endanger the success of the project
■ Identification of defects in the requirements specification, design specification, or other documents used as inputs into the quality risk analysis
■ A list of implementation assumptions and simplifications, which can improve the design as well as set up checkpoints you can use to ensure that your risk analysis is aligned with the actual implementation later

By directing these by-products to the appropriate members of the project team, you can prevent defects from escaping to later stages of the software lifecycle. That's always a good thing.

2.4.6 Metrics

To close this section, let's look at metrics and measurements for test analysis and design. To measure completeness of this portion of the test process, we can measure the following:

■ Percentage of requirements or quality (product) risks covered by test conditions
■ Percentage of test conditions covered by test cases
■ Number of defects found during test analysis and design
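As a minimal sketch of how such completeness figures might be computed from traceability data (the counts below are invented for illustration):

# Illustrative sketch only; the counts are invented.
risks_total = 40
risks_with_conditions = 34     # quality risk items covered by test conditions
conditions_total = 120
conditions_with_cases = 96     # test conditions covered by test cases
defects_found = 7              # defects logged during test analysis and design

def percent(part, whole):
    return 100.0 * part / whole

print(f"Risk coverage by test conditions:  {percent(risks_with_conditions, risks_total):.1f}%")
print(f"Condition coverage by test cases:  {percent(conditions_with_cases, conditions_total):.1f}%")
print(f"Defects found during analysis and design: {defects_found}")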
We can track test analysis and design tasks against a work breakdown structure, which is useful in determining whether we are proceeding according to the estimate and schedule.

2.5 Test Implementation and Execution

Learning objectives
(K2) Describe the preconditions for test execution, including testware, test environment, configuration management, and defect management.

Test implementation includes all the remaining tasks necessary to enable test case execution to begin. At this point, remember, we have done our analysis and design work, so what remains?

For one thing, if we intend to use explicitly specified test procedures—rather than relying on the tester's knowledge of the system—we'll need to organize the test cases into test procedures (or, if using automation, test scripts). When we say "organize the test cases," we mean, at the very least, document the steps to carry out the test. How much detail do we put in these procedures? Well, the same considerations that lead to more (or less) detail at the test condition and test case level would apply here. For example, if a regulatory standard like the United States Federal Aviation Administration's DO-178B applies, that's going to require a high level of detail.

Since testing frequently requires test data for both inputs and the test environment itself, we need to make sure that data is available now. In addition, we must set up the test environments. Are both the test data and the test environments in a state such that we can use them for testing now? If not, we must resolve that problem before test execution starts. In some cases test data require the use of data generation tools or production data. Ensuring proper test environment configuration can require the use of configuration management tools.

With the test procedures in hand, we need to put together a test execution schedule. Who is to run the tests? In what order should they run them? What environments are needed for which tests?
ISTQB Glossary

test procedure: See test procedure specification.

test procedure specification: A document specifying a sequence of actions for the execution of a test. Also known as test script or manual test script.

test script: Commonly used to refer to a test procedure specification, especially an automated one.

When should we run the automated tests? If automated tests run in the same environment as manual tests, how do we schedule the tests to prevent undesirable interactions between the automated and manual tests? We need to answer these questions.

Finally, since we're about to start test execution, we need to check whether all explicit and implicit entry criteria are met. If not, we need to work with project stakeholders to make sure they are met before the scheduled test execution start date.

Now, keep in mind that you should prioritize and schedule the test procedures to ensure that you achieve the objectives in the test strategy in the most efficient way. For example, in risk-based testing, we usually try to run tests in risk priority order. Of course, real-world constraints like availability of test configurations can change that order. Efficiency considerations, like the amount of data or environment restoration that must happen after a test is over, can change that order too.

Let's look more closely at two key areas, readiness of test procedures and readiness of test environments.

2.5.1 Test Procedure Readiness

Are the test procedures ready to run? Let's examine some of the issues we need to address before we know the answer.

As mentioned earlier, we must have established clear sequencing for the test procedures. This includes identifying who is to run the test procedure, when, in what test environment, and with what data. We have to evaluate constraints that might require tests to run in a particular order. Suppose we have a sequence of test procedures that together make up an end-to-end workflow; there are probably business rules that govern the order in which those test procedures must run.
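As a sketch of one way to reconcile risk-priority order with such sequencing constraints, consider the following. The procedures, priorities, and dependencies are invented, and a real schedule must also account for environments, data, and staff:

# Illustrative sketch: order test procedures by risk priority while
# honoring dependencies (e.g., business rules in an end-to-end workflow).
import heapq

# procedure -> risk priority (higher = run earlier, all else being equal)
priority = {"open_account": 20, "deposit": 15, "transfer": 25, "close_account": 10}

# procedure -> procedures that must run first
depends_on = {
    "open_account": [],
    "deposit": ["open_account"],
    "transfer": ["deposit"],
    "close_account": ["open_account"],
}

def schedule(priority, depends_on):
    # Among procedures whose prerequisites have run, always pick the
    # highest-priority one next (a priority-driven topological sort).
    remaining = {p: set(d) for p, d in depends_on.items()}
    ready = [(-priority[p], p) for p, d in remaining.items() if not d]
    heapq.heapify(ready)
    order = []
    while ready:
        _, proc = heapq.heappop(ready)
        order.append(proc)
        for other, deps in remaining.items():
            if proc in deps:
                deps.remove(proc)
                if not deps:
                    heapq.heappush(ready, (-priority[other], other))
    return order

print(schedule(priority, depends_on))
# ['open_account', 'deposit', 'transfer', 'close_account']

Note how the business rule wins where it must (transfer carries the highest risk but still waits for deposit), while risk priority decides the order wherever the rules leave room.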
So, based on all the practical considerations as well as the theoretical ideal of test procedure order—from most important to least important—we need to finalize the order of the test procedures. That includes confirming that order with the test team and other stakeholders. In the process of confirming the order of test procedures, you might find that the order you think you should follow is in fact impossible or perhaps unacceptably less efficient than some other possible sequencing.

We also might have to take steps to enable test automation. Of course, we say "might have to take steps" rather than "must take steps" because not all test efforts involve automation. However, as a technical test analyst, implementing automated testing is a key responsibility, one which we'll discuss in detail later in this book.

If some tests are automated, we'll have to determine how those fit into the test sequence. It's very easy for automated tests, if run in the same environment as manual tests, to damage or corrupt test data, sometimes in a way that causes both the manual and automated tests to generate huge numbers of false positives and false negatives. Guess what? That means you get to run the tests all over again. We don't want that!

Now, the Advanced syllabus says that we will create the test harness and test scripts during test implementation. Well, that's theoretically true, but as a practical matter we really need the test harness ready weeks, if not months, before we start to use it to automate test scripts.

We definitely need to know all the test procedure dependencies. If we find that there are reasons—due to these dependencies—we can't run the test procedures in the sequence we established earlier, we have two choices: One, we can change the sequence to fit the various obstacles we have discovered. Or, two, we can remove the obstacles.

Let's look more closely at two very common categories of test procedure dependencies—and thus obstacles. The first is the test environment. You need to know what is required for each test procedure. Now, check to see if that environment will be available during the time you have that test procedure scheduled to run. Notice that "available" means not only is the test environment configured, but also no other test procedure—or any other test activity for that matter—that would interfere with the test procedure under consideration is scheduled to use that test environment during the same period of time.
The interference question is usually where the obstacles emerge. However, for complex and evolving test environments, the mere configuration of the test environment can become a problem. Rex worked on a project a while back that was so complex that he had to construct a special database to track, report, and manage the relationships between test procedures and the test environments they required.

The second category of test procedure dependencies is the test data. You need to know what data each test procedure requires. Now, similar to the process before, check to see if that data will be available during the time you have that test procedure scheduled to run. As before, "available" means not only is the test data created, but also no other test procedure—or any other test activity for that matter—that would interfere with the viability and accessibility of the data is scheduled to use that test data during the same period of time.

With test data, interference is again often a large issue. We had a client who tried to run manual tests during the day and automated tests overnight. This resulted in lots of problems until a process was evolved to properly restore the data at the handover points between manual testing and automated testing (at the end of the day) and between automated testing and manual testing (at the start of the day).
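Checks like these can be partially automated. The following sketch, with invented procedure names and time slots, flags pairs of test procedures that are scheduled to use the same test environment or the same data set in overlapping time windows; a real scheduler would also enforce ordering constraints.

```python
from itertools import combinations

# Each entry: (procedure name, environment, data set, start hour, end hour).
# The names and time slots are invented for illustration.
schedule = [
    ("TP-01 manual workflow", "env-A", "customers", 9, 17),
    ("TP-02 automated regression", "env-A", "customers", 16, 22),
    ("TP-03 performance run", "env-B", "accounts", 9, 12),
]

def overlaps(a, b):
    """True if two procedures share an environment or a data set
    during overlapping time windows."""
    shares_resource = a[1] == b[1] or a[2] == b[2]
    overlapping_time = a[3] < b[4] and b[3] < a[4]
    return shares_resource and overlapping_time

for first, second in combinations(schedule, 2):
    if overlaps(first, second):
        print(f"Potential interference: {first[0]} and {second[0]}")
```

Run against this sample schedule, the check flags the manual workflow and the automated regression run, which is exactly the manual/automated handover problem described above.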
2.5.2 Test Environment Readiness

Are the test environments ready to use? Let's examine some of the issues we need to address before we know the answer.

First, let's make clear the importance of a properly configured test environment. If we run the test procedures perfectly but use an improperly configured test environment, we obtain useless test results. Specifically, we get many false positives. A false positive in software testing is analogous to one in medicine—a test that should have passed instead fails, leading to wasted time analyzing "defects" that turn out to be test environment problems. Often the false positives are so large in number that we also get false negatives. This happens when a test that should have failed instead passes, often in this case because we didn't see it hiding among the false positives. The overall outcomes are low defect detection effectiveness; high field or production failure rates; high defect report rejection rates; a lot of wasted time for testers, managers, and developers; and a severe loss of credibility for the test team. Obviously those are all very bad outcomes.

ISTQB Glossary
false negative: See false-pass result.
false-pass result: A test result which fails to identify the presence of a defect that is actually present in the test object.
false positive: See false-fail result.
false-fail result: A test result in which a defect is reported although no such defect actually exists in the test object.

So, we have to ensure properly configured test environments. Now, in the ISTQB fundamental test process, implementation is the point where this happens. As with automation, though, we feel this is probably too late, at least if implementation is an activity that starts after analysis and design. If, instead, implementation of the test environment is a subset of the overall implementation activity and can start as soon as the test plan is done, then we are in better shape.

What is a properly configured test environment and what does it do for us? For one thing, a properly configured test environment enables finding defects under the test conditions we intend to run. For example, if we want to test for performance, it allows us to find unexpected bottlenecks that would slow down the system. For another thing, a properly configured test environment operates normally when failures are not occurring. In other words, it doesn't generate many false positives. Additionally, at higher levels of testing such as system test and system integration test, a properly configured test environment replicates the production or end-user environment. Many defects, especially non-functional defects like performance and reliability problems, are hard if not impossible to find in scaled-down environments.

There are some other things we need for a properly configured test environment. We'll need someone to set up and support the environment. For complex environments, this person is usually someone outside the test team. (Jamie and Rex both would prefer the situation in which the person is part of the test team, due to the problems of environment support availability that seem to arise with reliance on external resources, but the preferences of the test manager don't always result in the reallocation of the individual.)
ISTQB Glossary
test log: A chronological record of relevant details about the execution of tests.
test logging: The process of recording information about tests executed into a test log.

We also need to make sure someone—perhaps a tester, perhaps someone else—has loaded the testware, test support tools, and associated processes on the test environment. Test support tools include, at the least, configuration management, incident management, test logging, and test management. Also, you'll need procedures to gather data for exit criteria evaluation and test results reporting. Ideally, your test management system will handle some of that for you.
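Some of this readiness checking can be automated as a pre-execution smoke check. The following is a minimal sketch; the host names, port numbers, and file paths are placeholders, and a real check would read them from the environment specification kept under configuration management.

```python
import os
import socket

# Placeholder values; a real project would read these from the
# environment specification under configuration management.
REQUIRED_SERVICES = [("app-server.test.local", 8080), ("db.test.local", 5432)]
REQUIRED_FILES = ["test_customers.csv"]

def environment_ready():
    """Return (ready, problems): a crude gate before test execution."""
    problems = []
    for host, port in REQUIRED_SERVICES:
        try:
            # Verify each required service accepts connections.
            with socket.create_connection((host, port), timeout=5):
                pass
        except OSError:
            problems.append(f"cannot reach {host}:{port}")
    for path in REQUIRED_FILES:
        if not os.path.exists(path):
            problems.append(f"missing test data file {path}")
    return (not problems), problems

if __name__ == "__main__":
    ready, problems = environment_ready()
    print("Ready to execute" if ready else f"Not ready: {problems}")
```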
2.5.3 Blended Test Strategies

It is often a good idea to use a blend of test strategies, leading to a balanced test approach throughout testing, including during test implementation. For example, when RBCS associates run test projects, we typically blend analytical risk-based test strategies with analytical requirements-based test strategies and dynamic test strategies (also referred to as reactive test strategies). We reserve some percentage (often 10 to 20 percent) of the test execution effort for testing that does not follow predetermined scripts.

Analytical strategies follow the ISTQB fundamental test process nicely, with work products produced along the way. However, the risk with blended strategies is that the reactive portion can get out of control. Testing without scripts should not be ad hoc or aimless. Such tests are unpredictable in duration and coverage.

Some techniques like session-based test management, which is covered in the companion volume on test management, can help deal with that inherent control problem in reactive strategies. In addition, we can use experience-based test techniques such as attacks, error guessing, and exploratory testing to structure reactive test strategies. We'll discuss these topics further in chapter 4.

The common trait of a reactive test strategy is that most of the testing involves reacting to the actual system presented to us. This means that test analysis, test design, and test implementation occur primarily during test execution. In other words, reactive test strategies allow—indeed, require—that the results of each test influence the analysis, design, and implementation of the subsequent tests. As discussed in the Foundation syllabus, these reactive strategies are lightweight in terms of total effort both before and during test execution.

Experience-based test techniques are often efficient bug finders, sometimes 5 or 10 times more efficient than scripted techniques. However, being experience based, naturally enough, they require expert testers. As mentioned earlier, reactive test strategies result in test execution periods that are sometimes unpredictable in duration. Their lightweight nature means they don't provide much coverage information and are difficult to repeat for regression testing. Some claim that tools can address this coverage and repeatability problem, but we've never seen that work in actual practice.

That said, when reactive test strategies are blended with analytical test strategies, they tend to balance each other's weak spots. This is analogous to blended scotch whiskey. Blended scotch whiskey consists of malt whiskey—either a single malt or more frequently a combination of various malt whiskeys—blended with grain alcohol (basically, vodka). It's hard to imagine two liquors more different than vodka and single malt scotch, but together they produce a liquor that many people find much easier to drink and enjoy than the more assertive, sometimes almost medicinal single malts.

2.5.4 Starting Test Execution

Before we start test execution, it's a best practice to measure readiness based on some predefined preconditions, often called entry criteria. Entry criteria were discussed in the Foundation syllabus, in the chapter on test management. Often we will have formal entry criteria for test execution. Consider the following entry criteria, taken from an actual project:

1. Bug tracking and test tracking systems are in place.
2. All components are under formal, automated configuration management and release management control.
3. The operations team has configured the system test server environment, including all target hardware components and subsystems. The test team has been provided with appropriate access to these systems.

These were extracted from the entry criteria section of the test plan for that project. They focus on three typical areas that can adversely affect the value of testing if unready: the test environment (as discussed earlier), configuration management, and defect management.

On the other hand, sometimes projects have less-formal entry criteria for execution. For example, test teams often simply assume that the test cases and test data will be ready, and there is no explicit measurement of them.

Whether the entry criteria are formal or informal, we need to determine if we're ready to start test execution. To do so, we need the delivery of a testable test object or objects and the satisfaction (or waiver) of entry criteria. Of course, this presumes that the entry criteria alone are enough to ensure that the various necessary implementation tasks discussed earlier in this section are complete. If not, then we have to go back and check those issues of test data, test environments, test dependencies, and so forth.

Now, during test execution, people will run the manual test cases via the test procedures. To execute a test procedure to completion, we'd expect that at least two things had happened. First, we covered all of the test conditions or quality risk items traceable to the test procedure. Second, we carried out all of the steps of the test procedure.

You might ask, "How could I carry out the test procedure without covering all the risks or conditions?" In some cases, the test procedure is written at a high level. In that case, you would need to understand what the test was about and augment the written test procedure with on-the-fly details that ensure that you cover the right areas.

You might also ask, "How could I cover all the risks and conditions without carrying out the entire test procedure?" Some steps of a test procedure serve to enable testing rather than covering conditions or risks. For example, some steps set up data or other preconditions, some steps capture logging information, and some steps restore the system to a known good state at the end.
A third kind of activity can apply during manual test execution. We can incorporate some degree of reactive testing into the procedures. One way to accomplish this is to leave the procedures somewhat vague and to tell the tester to select their favorite way of carrying out a certain task. Another way is to tell the testers, as Rex often does, that a test script is a road map to interesting places and, when they get somewhere interesting, they should stop and look around. This has the effect of giving them permission to transcend, to go beyond, the scripts. We have found it very effective.

Finally, during execution, tools will run any existing automated tests. These tools follow the defined scripts without deviation. That can seem like an unalloyed "good thing" at first. However, if we did not design the scripts properly, that can mean that the script can get out of sync with the system under test and generate a bunch of false positives. We'll talk more about that problem—and how to solve it—in chapter 9.

2.5.5 Running a Single Test Procedure

Let's zoom in on the act of a tester running a single test procedure. After the logistical issues of initial setup are handled, the tester starts running the specific steps of the test. These yield the actual results.

Now we have come to the heart of test execution. We compare actual results with expected results. This is indeed the moment when testing either adds value or removes value from the project. Everything up to this point—all of the work designing and implementing our tests—was about getting us to this point. Everything after this point is about using the value this comparison has delivered. Since running a test procedure is so critical, so central to good testing, attention and focus on your part is essential at this moment.

But what if we notice a mismatch between the expected results and the actual results? The ISTQB glossary refers to each difference between the expected results and the actual results as an anomaly. There can be multiple differences, and thus multiple anomalies, in a mismatch. When we observe an anomaly, we have an incident. Some incidents are failures. A failure occurs when the system misbehaves due to one or more defects. This is the ideal situation when an incident has occurred. If we are looking at a failure—a symptom of a true defect—we should start to gather data to help the developer resolve the defect. We'll talk more about incident reporting and management in detail in chapter 7.
ISTQB Glossary
anomaly: Any condition that deviates from expectation based on requirements specifications, design documents, user documents, standards, etc. or from someone's perception or experience. Anomalies may be found during, but not limited to, reviewing, testing, analysis, compilation, or use of software products or applicable documentation.
incident: Any event occurring that requires investigation.
failure: Deviation of the component or system from its expected delivery, service, or result.

Some incidents are not failures but rather are false positives. False positives occur when the expected and actual results don't match due to bad test specifications, invalid test data, incorrectly configured test environments, a simple mistake on the part of the person running the test, and so forth.

If we can catch a false positive right away, the moment it happens, the damage is limited. The tester should fix the test, which might involve some configuration management work if the tests are checked into a repository. The tester should then rerun the test. Thus, the damage done was limited to the tester's wasted time along with the possible impact of that lost time on the schedule plus the time needed to fix the test plus the time needed to rerun the test.

All of those activities, all of that lost time, and the impact on the schedule would have happened even if the tester had simply assumed the failure was valid and reported it as such. It just would have happened later, after an additional loss of time on the part of managers, other testers, developers, and so forth.

Here's a cautionary note on these false positives too. Just because a test has never yielded a false positive before, in all the times it's been run before, doesn't mean you're not looking at one this time. Changes in the test basis, the proper expected results, the test object, and so forth can obsolete or invalidate a test specification.
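In automated form, the comparison at the heart of test execution can be as simple as the sketch below. The step structure and messages are our own invention; the point is that a mismatch is recorded as an anomaly to be investigated, not automatically treated as a defect.

```python
def execute_step(step, actual_result):
    """Compare an actual result against the step's expected result.

    Returns a result record. Any mismatch is recorded as an anomaly
    to be investigated: it may be a failure of the system under test,
    or a false positive caused by a bad test, bad data, or a
    misconfigured environment.
    """
    if actual_result == step["expected"]:
        return {"step": step["id"], "status": "pass"}
    return {
        "step": step["id"],
        "status": "anomaly",
        "expected": step["expected"],
        "actual": actual_result,
        "note": "investigate: system failure or false positive?",
    }

step = {"id": "TP-01/step-3", "expected": "balance = 1250.00"}
print(execute_step(step, "balance = 1249.99"))
```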
2.5.6 Logging Test Results

Most testers like to run tests—at least the first few times they run them—but they don't always like to log results. "Paperwork!" they snort. "Bureaucracy and red tape!" they protest.

If you are one of those testers, get over it. We said previously that all the planning, analysis, design, and implementation was about getting to the point of running a test procedure and comparing actual and expected results. We then said that everything after that point is about using the value the comparison delivered. Well, you can't use the value if you don't capture it, and the test logs are about capturing the value.

So remember that, as testers run tests, testers log results. Failure to log results means either doing the test over (most likely) or losing the value of running the tests. When you do the test over, that is pure waste, a loss of your time running the test. Since test execution is usually on the critical path for project completion, that waste puts the planned project end date at risk. People don't like that much.

A side note here, before we move on. We mentioned reactive test strategies and the problems they have with coverage earlier. Note that, with adequate logging, while you can't ascertain reactive test coverage in advance, at least you can capture it afterward. So again, log your results, both for scripted and unscripted tests.

During test execution, there are many moving parts. The test cases might be changing. The test object and each constituent test item are often changing. The test environment might be changing. The test basis might be changing. Logging should identify the versions tested.

The military strategist Clausewitz referred famously to the "fog of war." What he meant was not a literal fog—though black-powder cannons and firearms of his day created plenty of that!—but rather a metaphorical fog whereby no one observing a battle, be they an infantryman or a general, could truly grasp the whole picture. Clausewitz would recognize his famous fog if he were to come back to life and work as a tester. Test execution periods tend to have a lot of fog.

Good test logs are the fog-cutter. Test logs should provide a detailed, rich chronology of test execution. To do so, test logs need to be test by test and event by event. Each test, uniquely identified, should have status information logged against it as it goes through the test execution period. This information should support not only understanding the overall test status but also the overall test coverage.
ISTQB Glossary
test control: A test management task that deals with developing and applying a set of corrective actions to get a test project on track when monitoring shows a deviation from what was planned.

You should also log events that occur during test execution and affect the test execution process, whether directly or indirectly. You should document anything that delays, interrupts, or blocks testing.

Test analysts are not always also test managers, but they should work closely with the test managers. Test managers need logging information for test control, test progress reporting, and test process improvement. Test analysts need logging information too, along with the test managers, for measurement of exit criteria, which we'll cover later in this chapter.

Finally, note that the extent, type, and details of test logs will vary based on the test level, the test strategy, the test tools, and various standards and regulations. In automated component testing, the automated tests themselves gather the logging information. Manual acceptance testing usually involves the test manager compiling the test logs or at least collating the information coming from the testers. If we're testing regulated, safety-critical systems like pharmaceutical systems, we might have to log certain information for audit purposes.
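A lightweight, tool-independent way to keep such a chronological log is to append one timestamped record per test and per event, identifying the versions involved. The record fields in this sketch are our own choice, not a requirement of any standard:

```python
import json
import datetime

LOG_FILE = "test_execution.log"

def log_entry(kind, **details):
    """Append one timestamped record, test by test and event by event."""
    record = {
        "timestamp": datetime.datetime.now().isoformat(),
        "kind": kind,  # e.g., "test" or "event"
    }
    record.update(details)
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(record) + "\n")

# Identify versions so results can be traced to what was tested.
log_entry("event", message="build 1.4.2 deployed to env-A")
log_entry("test", test_id="TP-01", build="1.4.2", status="fail",
          defect="DR-0042")
log_entry("event", message="env-A database restore; testing blocked 40 min")
```

Because each record carries its own timestamp and build identifier, the log can later be replayed to reconstruct the chronology of a test pass, which is exactly the fog-cutting role described above.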
2.5.7 Use of Amateur Testers

Amateur testers. This term is rather provocative, so here's what we mean. A person who primarily works as a tester to earn a living is a professional tester. Anyone else engaged in testing is an amateur tester.

Rex is a professional tester now—and has been since 1987. Before that, he was a professional programmer. Rex still writes programs from time to time, but now he's an amateur programmer. He makes many typical amateur-programmer mistakes when he does it. Before Rex was a professional tester, he unit-tested his code as a programmer. He made many typical amateur-tester mistakes when he did that. Since one of the companies he worked for as a programmer relied entirely on programmer unit testing, that sometimes resulted in embarrassing outcomes for their customers—and for Rex.

Jamie is also a professional tester, and he has been one since 1991. Jamie also does some programming from time to time, but he considers himself an amateur programmer. He makes many typical amateur-programmer mistakes when he does it, mistakes that full-time professional programmers often do not make. There are two completely differing mind-sets for testing and programming. Companies that rely entirely on programmer testing sometimes find that it results in embarrassing outcomes for their customers. Professional testers could have solved some of those problems.

There's nothing wrong with involving amateur testers. Sometimes, we want to use amateur testers such as users or customers during test execution. It's important to understand what we're trying to accomplish with this and why it will (or won't) work. For example, often the objective is to build user confidence in the system, but that can backfire! Suppose we involve them too early, when the system is still full of bugs. Oops!

2.5.8 Standards

Let's look at some standards that relate to implementation and execution as well as to other parts of the test process. Let's start with the IEEE 829 standard. Most of this material about IEEE 829 should be a review of the Foundation syllabus for you, but it might be a while since you've looked at it.

The first IEEE 829 template, which we'd use during test implementation, is the IEEE 829 test procedure specification. A test procedure specification describes how to run one or more test cases. This template includes the following sections:

■ Test procedure specification identifier
■ Purpose (e.g., which tests are run)
■ Special requirements (skills, permissions, environment, etc.)
■ Procedure steps (log, set up, start, proceed [the steps themselves], measure results, shut down/suspend, restart [if needed], stop, wrap up/tear down, contingencies)

While the IEEE 829 standard distinguishes between test procedures and test cases, in practice test procedures are often embedded in test cases.
A test procedure is sometimes referred to as a test script. A test script can be manual or automated.

The IEEE 829 standard for test documentation also includes ideas on what to include in a test log. According to the standard, a test log should record the relevant details about test execution. This template includes the following sections:

■ Test log identifier.
■ Description of the testing, including the items under test (with version numbers), the test environments being used, and the like.
■ Activity and event entries. These should be test by test and event by event. Events include things like test environments becoming unavailable, people being out sick, and so forth. You should capture information on the test execution process; the results of the tests; environmental changes or issues; bugs, incidents, or anomalies observed; the testers involved; any suspension or blockage of testing; changes to the plan and the impact of change; and so forth.

For those who find the IEEE 829 templates too daunting, simple spreadsheets could also be used to capture execution logging information.

The British Standards Institute produces the BS 7925/2 standard. It has two main sections: test design techniques and test measurement techniques. For test design, it reviews a wide range of techniques, including black box, white box, and others. It covers the following black-box techniques that were also covered in the Foundation syllabus:

■ Equivalence partitioning
■ Boundary value analysis
■ State transition testing

It also covers a black-box technique called cause-effect graphing, which is a graphical version of a decision table, and a black-box technique called syntax testing.

It covers the following white-box techniques that were also covered in the Foundation syllabus:

■ Statement testing
■ Branch and decision testing
It also covers some additional white-box techniques that were covered only briefly or not at all in the Foundation syllabus:

■ Data flow testing
■ Branch condition testing
■ Branch condition combination testing
■ Modified condition decision testing
■ Linear Code Sequence and Jump (LCSAJ) testing

Rounding out the list are two sections, "Random Testing" and "Other Testing Techniques." Random testing was not covered in the Foundation syllabus, but we'll talk about the use of randomness in relation to reliability testing in chapters 5 and 9. The section on other testing techniques doesn't provide any examples but merely talks about rules on how to select them.

You might be thinking, "Hey, wait a minute, that was too fast. Which of those do I need to know for the Advanced Level Technical Test Analyst (ATTA) exam?" The answer is in two parts. First, you need to know any test design technique that was on the Foundation syllabus. Such techniques may be covered on the Advanced Level Technical Test Analyst exam. Second, we'll cover the new test design techniques that might be on the Advanced Level Test Analyst (ATA) exam in detail in chapter 4.

BS 7925/2 provides one or more coverage metrics for each of the test measurement techniques. These are covered in the measurement part of the standard. The choice of organization for this standard is curious indeed because there is no clear reason why the coverage metrics weren't covered at the same time as the design techniques. However, from the point of view of the ISTQB fundamental test process, perhaps it is easier that way. For example, our entry criteria might require some particular level of test coverage, as it would if we were testing safety-critical avionics software subject to the United States Federal Aviation Administration's standard DO-178B. (We'll cover that standard in a moment.) So during test design, we'd employ the test design techniques. During test implementation, we'd use the test measurement techniques to ensure adequate coverage.

In addition to these two major sections, this document also includes two annexes. Annex B brings the dry material in the first two major sections to life by showing an example of applying them to realistic situations. Annex A covers process considerations, which is perhaps closest to our area of interest here.
It discusses the application of the standard to a test project, following a test process given in the document. To map that process to the ISTQB fundamental test process, we can say the following:

■ Test analysis and design along with test implementation in the ISTQB process is equivalent to test specification in the BS 7925/2 process.
■ BS 7925/2 test execution, logically enough, corresponds to test execution in the ISTQB process. Note, though, that the ISTQB process includes that as part of a larger activity, test implementation and execution. Note also that the ISTQB process includes test logging as part of test execution, while BS 7925/2 has a separate test recording process.
■ Finally, BS 7925/2 has checking for test completion as the final step in its process. That corresponds roughly to the ISTQB's evaluating test criteria and reporting.

Finally, as promised, let's talk about the DO-178B standard. This standard is promulgated by the United States Federal Aviation Administration. As you might guess, it's for avionics systems. In Europe, it's called ED-12B. The standard assigns a criticality level, based on the potential impact of a failure. Based on the criticality level, a certain level of white-box test coverage is required, as shown in table 2-1.

Table 2–1 FAA DO/178B mandated coverage

Criticality               | Potential Failure Impact                                                           | Required Coverage
Level A: Catastrophic     | Software failure can result in a catastrophic failure of the system.              | Modified Condition/Decision, Decision, and Statement
Level B: Hazardous/Severe | Software failure can result in a hazardous or severe/major failure of the system. | Decision and Statement
Level C: Major            | Software failure can result in a major failure of the system.                     | Statement
Level D: Minor            | Software failure can result in a minor failure of the system.                     | None
Level E: No Effect        | Software failure cannot have an effect on the system.                             | None
Let us explain table 2-1 a bit more thoroughly:

Criticality level A, or Catastrophic, applies when a software failure can result in a catastrophic failure of the system. For software with such criticality, the standard requires Modified Condition/Decision, Decision, and Statement coverage.

Criticality level B, or Hazardous and Severe, applies when a software failure can result in a hazardous, severe, or major failure of the system. For software with such criticality, the standard requires Decision and Statement coverage.

Criticality level C, or Major, applies when a software failure can result in a major failure of the system. For software with such criticality, the standard requires only Statement coverage.

Criticality level D, or Minor, applies when a software failure can only result in a minor failure of the system. For software with such criticality, the standard does not require any level of coverage.

Finally, criticality level E, or No Effect, applies when a software failure cannot have an effect on the system. For software with such criticality, the standard does not require any level of coverage.

This makes a certain amount of sense. You should be more concerned about software that affects flight safety, such as rudder and aileron control modules, than about software that doesn't, such as video entertainment systems. However, there is a risk of using a one-dimensional white-box measuring stick to determine how much confidence we should have in a system. Coverage metrics are a measure of confidence, it's true, but we should use multiple coverage metrics, both white box and black box.

By the way, if you found this a bit confusing, note that two of the white-box coverage metrics we mentioned, statement and decision coverage, were discussed in the Foundation syllabus, in chapter 4. Modified condition/decision coverage was mentioned briefly but not described in any detail in the Foundation syllabus; it will be covered in detail in this book. If you don't remember what statement and decision coverage mean, you should go back and review the material in that chapter on white-box coverage metrics. We'll return to black-box and white-box testing and test coverage in chapter 4 of this book, and we'll assume that you understand the relevant Foundation-level material. Chapter 4 will be hard to follow if you are not fully conversant with the Foundation-level test design techniques.
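Statement, decision, and modified condition/decision coverage form a strict hierarchy, and a small example makes the differences concrete. The function and test values below are our own illustration, not drawn from the standard: a single test executes every statement; adding a test where the decision is false achieves decision coverage; MC/DC additionally requires showing that each atomic condition independently changes the outcome.

```python
def approve(amount_ok, override):
    """Toy decision with two atomic conditions: amount_ok or override."""
    result = "rejected"
    if amount_ok or override:
        result = "approved"
    return result

# Statement coverage: one test suffices; every line executes.
tests_statement = [(True, False)]

# Decision coverage: the if must evaluate both True and False.
tests_decision = [(True, False), (False, False)]

# MC/DC: each condition must independently change the outcome.
#   (True, False) vs. (False, False) isolates amount_ok.
#   (False, True) vs. (False, False) isolates override.
tests_mcdc = [(True, False), (False, True), (False, False)]

for args in tests_mcdc:
    print(args, "->", approve(*args))
```

Note how the test count grows with the required rigor, which is exactly why DO-178B reserves MC/DC for level A software.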
2.5.9 Metrics

Finally, what metrics and measurements can we use for the test implementation and execution of the ISTQB fundamental test process? Different people use different metrics, of course.

Typical metrics during test implementation are the percentage of test environments configured, the percentage of test data records loaded, and the percentage of test cases automated. During test execution, typical metrics look at the percentage of test conditions covered, test cases executed, and so forth.

We can track test implementation and execution tasks against a work breakdown structure, which is useful in determining whether we are proceeding according to the estimate and schedule.

Note that here we are discussing metrics to measure the completeness of these processes; i.e., the progress we have made. You should use a different set of metrics for test reporting.

2.6 Evaluating Exit Criteria and Reporting

Learning objectives
(K3) Determine from a given set of measures if a test completion criterion has been fulfilled.

In one sense, the evaluation of exit criteria and reporting of results is a test management activity. And, in the volume on advanced test management, we examine ways to analyze, graph, present, and report test results as part of the test progress monitoring and control activities. However, in another sense, there is a key piece of this process that belongs to the technical test analyst. As technical test analysts, in this book we are more interested in two main areas. First, we need to collect the information necessary to support test management reporting of results. Second, we need to measure progress toward completion, and by extension, we need to detect deviation from the plan.
(Of course, if we do detect deviation, it is the test manager's role, as part of the control activities, to get us back on track.)

To measure completeness of the testing with respect to exit criteria, and to generate information needed for reporting, we can measure properties of the test execution process such as the following:

■ Number of test conditions, cases, or test procedures planned, executed, passed, and failed
■ Total defects, classified by severity, priority, status, or some other factor
■ Change requests proposed, accepted, and tested
■ Planned versus actual costs, schedule, effort
■ Quality risks, both mitigated and residual
■ Lost test time due to blocking events
■ Confirmation and regression test results

We can track evaluation and reporting milestones and tasks against a work breakdown structure, which is useful in determining whether we are proceeding according to the estimate and schedule.

We'll now look at examples of test metrics and measures that you can use to evaluate where you stand with respect to exit criteria. Most of these are drawn from actual case studies, projects where RBCS helped a client with their testing.
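Before we turn to the examples, note that once exit criteria are quantified, the K3 task for this section, determining from a set of measures whether a completion criterion has been fulfilled, can be mechanized. A minimal sketch, with made-up measures and thresholds:

```python
# Measures collected from the test management and defect tracking
# systems; the numbers are made up for illustration.
measures = {
    "test_cases_planned": 200,
    "test_cases_executed": 192,
    "test_cases_passed": 183,
    "open_priority_1_defects": 1,
}

# Quantified exit criteria; the thresholds are examples, not a standard.
criteria = [
    ("95% of planned tests executed",
     measures["test_cases_executed"] / measures["test_cases_planned"] >= 0.95),
    ("90% of executed tests passed",
     measures["test_cases_passed"] / measures["test_cases_executed"] >= 0.90),
    ("no open priority 1 defects",
     measures["open_priority_1_defects"] == 0),
]

for description, fulfilled in criteria:
    print(f"{'MET    ' if fulfilled else 'NOT MET'}  {description}")
```

With these sample numbers, the first two criteria are met and the third is not, so test execution would continue (or the criterion would have to be explicitly waived).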
2.6.1 Test Suite Summary

Figure 2-3 shows a test suite summary worksheet. Such a worksheet summarizes the test-by-test logging information described in a previous section. As a test analyst, you can use this worksheet to track a number of important properties of the test execution so far:

■ Test case progress
■ Test case pass/fail rates
■ Test suite defect priority/severity (weighted failure)
■ Earned value

As such, it's a useful chart for the test analyst to understand the status of the entire test effort.

Test Suite Summary (Test Pass Two)
(Planned Tests Fulfilled: Count/Skip/Pass/Fail; Planned Tests Unfulfilled: Count/Queued/IP/Block; Earned Value: Plan Hrs/Actual Hrs/%Effort/%Exec)

Suite          Total  Count  Skip  Pass  Fail  Wtd.   Count  Queued  IP  Block   Plan   Actual  %Effort  %Exec
               Cases                           Fail.                              Hrs    Hrs
Functionality   14     14     0     4    10    10.6     0      0      0    0     36.0   49.0    163%     100%
Performance      5      4     0     1     3     8.7     1      1      0    0      7.0   11.5    164%      80%
Reliability      2      2     2     0     0     0.0     0      0      0    0      0.0    0.0      0%       0%
Robustness       3      1     0     0     1     0.5     2      2      0    0     12.5    8.0     64%      33%
Installation     4      0     0     0     0     0.0     4      4      0    0     72.0    0.0      0%       0%
Localization     8      1     0     0     1     0.5     7      0      7    0    128.0   22.0     17%      13%
Security         4      2     0     2     0     1.1     2      2      0    0     17.0    8.5     50%      50%
Documentation    3      1     0     0     1     4.0     2      2      0    0     28.0   15.0     54%      33%
Integration      4      4     0     3     1     1.0     0      0      0    0      8.0   12.5    156%     100%
Usability        2      0     0     0     0     0.0     2      0      2    0     16.0    0.0      0%       0%
Exploratory      6      0     0     0     0     0.0     6      6      0    0     12.0    0.0      0%       0%
Total           55     29     2    10    17    26.4    26     17      9    0    336.5  126.5     38%      51%
Percent        100%    53%    4%   18%   31%   N/A     47%    31%    16%   0%

Figure 2–3 Test suite summary worksheet

Down the left side of figure 2-3, you see two columns, the test suite name and the number of test cases it contains. Again, in some test log somewhere, we have detailed information for each test case.

On the middle-left side, you see four columns under the general heading of "Planned Tests Fulfilled." These are the tests for which no more work remains, at least during this pass of testing.

The weighted failure numbers for each test suite, found in the column near the middle of the table, give a metric of historical bug finding effectiveness. Each bug is weighted by priority and severity—only the severity one, priority one bugs count for a full weighted failure point, while lower severity and priority can reduce the weighted failure point count for a bug to as little as 0.04. So this is a metric of the technical risk, the likelihood of finding problems, associated with each test suite based on historical bug finding effectiveness.

On the middle-right side, you see four columns under the general heading of "Planned Tests Unfulfilled." These are the tests for which more work remains during this pass of testing. IP, by the way, stands for "in progress."

Finally, on the right side you see four columns under the general heading of "Earned Value." Earned value is a simple project management concept.
It says that, in a project, we accomplish tasks by expending resources. So if the percentage of tasks accomplished is about equal to the percentage of resources expended, we're on track. If the percentage of tasks accomplished is greater than the percentage of resources expended, we're on our way to being under budget. If the percentage of tasks accomplished is less than the percentage of resources expended, we're on our way to being over budget.

Similarly, from a schedule point of view, the percentage of tasks accomplished should be approximately equal to the percentage of project time elapsed. As with effort, if the tasks percentage is over the schedule percentage, we're happy. If the tasks percentage is below the schedule percentage, we're sad and worried.

In test execution, we can consider the test case or test procedure to be our basic task. The resources—for manual testing, anyway—are usually the person-hours required to run the tests. That's the way this chart shows earned value, test suite by test suite.

Take a moment to study figure 2-3. Think about how you might be able to use it on your current (or future) projects.
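In this form, earned value reduces to two percentages per suite. The sketch below recomputes the Performance row of figure 2-3; the function is our own restatement of the chart's arithmetic, not a standard formula.

```python
def earned_value(planned_hours, actual_hours, cases_total, cases_fulfilled):
    """Percent of effort spent versus percent of test cases completed.

    If %exec runs well ahead of %effort, we are heading under budget;
    if it lags, we are heading over budget or behind schedule.
    """
    pct_effort = actual_hours / planned_hours * 100
    pct_exec = cases_fulfilled / cases_total * 100
    return pct_effort, pct_exec

# Performance suite from figure 2-3: 7.0 planned hours, 11.5 actual,
# 4 of 5 test cases fulfilled.
pct_effort, pct_exec = earned_value(7.0, 11.5, 5, 4)
print(f"%effort = {pct_effort:.0f}%, %exec = {pct_exec:.0f}%")  # 164%, 80%
```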
2.6.2 Defect Breakdown

We can analyze defect (or bug) reports in a number of ways. There's just about no end to the analysis that we have done as technical test analysts, test managers, and test consultants. As a test professional, you should think of good, proper analysis of bug reports much the same way as a doctor thinks of good, proper analysis of blood samples.

Just one of these many ways to examine bug reports is by looking at the breakdown of the defects by severity, priority, or some combination of the two. If you use a numerical scale for severity and priority, it's easy to multiply them together to get a weighted metric of overall bug importance, as you just saw. Figure 2-4 shows an example of such a chart. What can we do with it? Well, for one thing, we can compare this chart with previous projects to get a sense of whether we're in better or worse shape. Remember, though, that the distribution will change over the life of the project. Ideally, the chart will start skewed toward high-priority bugs—at least it will if we're doing proper risk-based testing, which we'll discuss in chapter 3.

Figure 2–4 Breakdown of defects by severity [a bar chart of number of reports by defect severity: 622 reports of severity 1, 136 of severity 2, 82 of severity 3, 19 of severity 4, and none of severity 5]

Figure 2-4 shows lots of severity ones and twos. Usually, severity one is loss of data. Severity two is loss of functionality without a workaround. Either way, bad news. This chart tells us, as test professionals, to take a random sample of 10 or 20 of these severity one and two reports and see if we have severity inflation going on here. Are our severity classifications accurate? If not, the poor test manager's reporting will be biased and alarmist, which will get her in trouble.

As we mentioned previously, though, if we're doing risk-based testing, this is probably about how this chart should look during the first quarter or so of the project. Find the scary stuff first, we always tell testers. If these severity classifications are right, the test team is doing just that.

2.6.3 Confirmation Test Failure Rate

As technical test analysts, we can count on finding some bugs. Hopefully many of those bugs will be sent back to us, allegedly fixed. At that point, we confirmation-test the fix. How many of those fixes fail the confirmation test? It can feel like it's quite a few, but is it really? Figure 2-5 shows the answer, graphically, for an actual project.
Figure 2–5 Confirmation test failure analysis [a case study chart of banking application bugs: a log-scale bar chart of the number of bug reports by how many times each was opened (710 opened once, 112 twice, 26 three times, and a handful more often), with a cumulative percentage line that starts at 83% for reports opened only once and climbs to 100%]

On this banking project, we can see that quite a few bug fixes failed the confirmation test. We had to reopen fully one in six defect reports at least once. That's a lot of wasted time. It's also a lot of potential schedule delay. Study figure 2-5, thinking about ways you could use this information during test execution.
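The computation behind a chart like figure 2-5 is straightforward if your defect tracking tool can export how many times each report was opened. A minimal sketch with invented counts:

```python
# Times each defect report was opened (1 = never reopened); invented data.
open_counts = [1, 1, 2, 1, 3, 1, 1, 2, 1, 1, 1, 2]

# A report opened more than once failed at least one confirmation test.
reopened = sum(1 for count in open_counts if count > 1)
rate = reopened / len(open_counts)
print(f"{reopened} of {len(open_counts)} reports reopened "
      f"at least once ({rate:.0%})")
```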
2.6.4 System Test Exit Review

Finally, let's look at another case study. Figure 2-6 shows an excerpt of the exit criteria for an Internet appliance project that RBCS provided testing for. You'll see that we have graded the criteria as part of a system test exit review. Each of the three criteria here is graded on a three-point scale:

■ Green: Totally fulfilled, with little remaining risk
■ Yellow: Not totally fulfilled, but perhaps an acceptable risk
■ Red: Not in any sense fulfilled, and poses a substantial risk

Of course, you'd want to provide additional information and data for the yellows and the reds.

System Test Exit Review

Per the Test Plan, System Test was planned to end when the following criteria were met:

1. All design, implementation, and feature completion, code completion, and unit test completion commitments made in the System Test Entry meeting were either met or slipped to no later than four (4), three (3), and three (3) weeks, respectively, prior to the proposed System Test Exit date.
STATUS: RED. Audio and demo functionality have entered System Test in the last three weeks. The modems entered System Test in the last three weeks. On the margins of a violation, off-hook detection was changed significantly.

2. No panic, crash, halt, wedge, unexpected process termination, or other stoppage of processing has occurred on any server software or hardware for the previous three (3) weeks.
STATUS: YELLOW. The servers have not crashed, but we did not complete all the tip-over and fail-over testing we planned, and so we are not satisfied that the servers are stable under peak load or other inclement conditions.

3. Production Devices have been used for all System Test execution for at least three (3) weeks.
STATUS: GREEN. Except for the modem situation discussed above, the hardware has been stable.

Figure 2–6 Case study of system test exit review

2.6.5 Standards

Finally, let's look at one more IEEE 829 template, one that applies to this part of the test process. The IEEE 829 standard for test documentation includes a template for test summary reports.

A test summary report describes the results of a given level or phase of testing. The IEEE 829 template includes the following sections:

■ Test summary report identifier
■ Summary (e.g., what was tested, what the conclusions are, etc.)
■ Variances (from plan, cases, procedures)
■ Comprehensive assessment
■ Summary of results (e.g., final metrics, counts)
■ Evaluation (of each test item vis-à-vis pass/fail criteria)
■ Summary of activities (resource use, efficiency, etc.)
■ Approvals

The summaries can be delivered during test execution as part of a project status report or meeting. They can also be used at the end of a test level as part of test closure activities.
ISTQB Glossary
test closure: During the test closure phase of a test process, data is collected from completed activities to consolidate experience, testware, facts, and numbers. The test closure phase consists of finalizing and archiving the testware and evaluating the test process, including preparation of a test evaluation report.
test summary report: A document summarizing testing activities and results. It also contains an evaluation of the corresponding test items against exit criteria.

2.6.6 Evaluating Exit Criteria and Reporting Exercise

Consider the complete set of actual exit criteria from the Internet appliance project, which is shown in the following subsection. Toward the end of the project, the test team rated each criterion on the following scale:

Green: Totally fulfilled, with little remaining risk
Yellow: Not totally fulfilled, but perhaps an acceptable risk
Red: Not in any sense fulfilled, and poses a substantial risk

You can see the ratings we gave each criterion in the STATUS block below the criterion itself.

We used this evaluation of the criteria as an agenda for a System Test Exit Review meeting. Rex led the meeting and walked the team through each criterion. As you can imagine, the RED ones required more explanation than the YELLOW and GREEN ones.

While narrative explanation is provided for each evaluation, perhaps more information and data are needed. So, for each criterion, determine what kind of data and other information you'd want to collect to support the conclusions shown in the status evaluations for each.

2.6.7 System Test Exit Review

Per the Test Plan, System Test was planned to end when the following criteria were met:
1. All design, implementation, and feature completion, code completion, and unit test completion commitments made in the System Test Entry meeting were either met or slipped to no later than four (4), three (3), and three (3) weeks, respectively, prior to the proposed System Test Exit date.
STATUS: RED. Audio and demo functionality have entered System Test in the last three weeks. The modems entered System Test in the last three weeks. On the margins of a violation, off-hook detection was changed significantly.

2. No panic, crash, halt, wedge, unexpected process termination, or other stoppage of processing has occurred on any server software or hardware for the previous three (3) weeks.
STATUS: YELLOW. The servers have not crashed, but we did not complete all the tip-over and fail-over testing we planned, and so we are not satisfied that the servers are stable under peak load or other inclement conditions.

3. Production devices have been used for all System Test execution for at least three (3) weeks.
STATUS: GREEN. Except for the modem situation discussed above, the hardware has been stable.

4. No client systems have become inoperable due to a failed update for at least three (3) weeks.
STATUS: YELLOW. No system has become permanently inoperable during update, but we have seen systems crash during update and these systems required a reboot to clear the error.

5. Server processes have been running without installation of bug fixes, manual intervention, or tuning of configuration files for two (2) weeks.
STATUS: RED. Server configurations have been altered by Change Committee–approved changes multiple times over the last two weeks.

6. The Test Team has executed all the planned tests against the release-candidate hardware and software releases of the Device, Server, and Client.
STATUS: RED. We had planned to test Procurement and Fulfillment, but disengaged from this effort because the systems were not ready. Also, we have just received the release-candidate build; complete testing would take two weeks. In addition, the servers are undergoing Change Committee–approved changes every few days and a new load balancer has been added to the server farm. These server changes have prevented volume, tip-over, and fail-over testing for the last week and a half. Finally, we have never had a chance to test the server installation and boot processes because we never received documentation on how to perform these tasks.
7. The Test Team has retested all fixes for priority one and two bug reports over the life of the project against the release-candidate hardware and software releases of the Device, Server, and Client.
STATUS: RED. Testing of the release-candidate software and hardware has been schedule-limited to one week, which does not allow for retesting of all bug fixes.

8. The Development Teams have resolved all "must-fix" bugs. "Must-fix" will be defined by the Project Management Team.
STATUS: RED. Referring to the attached open/closed charts and the "Bugs Found Since 11/9" report, we continue to find new bugs in the product, though there is good news in that the find rate for priority one bugs has leveled off. Per the closure period charts, it takes on average about two weeks—three weeks for priority one bugs—to close a problem report. In addition, both open/close charts show a significant quality gap between cumulative open and cumulative closed, and it's hard to believe that taken all together, a quantity of bugs that significant doesn't indicate a pervasive fit-and-finish issue with the product. Finally, note that WebGuide and E-commerce problems are design issues—the selected browser is basically incompatible with much of the Internet—which makes these problems much more worrisome.

9. The Test Team has checked that all issues in the bug tracking system are either closed or deferred and, where appropriate, verified by regression and confirmation testing.
STATUS: RED. A large quality gap exists and has existed for months. Because of the limited test time against the release-candidate build, the risk of regression is significant.

10. The open/close curve indicates that we have achieved product stability and reliability.
STATUS: RED. The priority-one curve has stabilized, but not the overall bug-find curve. In addition, the run chart of errors requiring a reboot shows that we are still showing about one crash per eight hours of system operation, which is no more stable than a typical Windows 95/Windows 98 laptop. (One of the ad claims is improved stability over a PC.)
11. The Project Management Team agrees that the product, as defined during the final cycle of System Test, will satisfy the customer's reasonable expectations of quality.
STATUS: YELLOW. We have not really run enough of the test suite at this time to give a good assessment of overall product quality.

12. The Project Management Team holds a System Test Exit Meeting and agrees that we have completed System Test.
STATUS: In progress.

2.6.8 Evaluating Exit Criteria and Reporting Exercise Debrief

We have added a section called "ADDITIONAL DATA AND INFORMATION" below each criterion. In that section, you'll find Rex's solution to this exercise, based both on what kind of additional data and information he actually had during this meeting and what he would have brought if he knew then what he knows now.

1. All design, implementation, and feature completion, code completion, and unit test completion commitments made in the System Test Entry meeting were either met or slipped to no later than four (4), three (3), and three (3) weeks, respectively, prior to the proposed System Test Exit date.
STATUS: RED. Audio and demo functionality have entered System Test in the last three weeks. The modems entered System Test in the last three weeks. On the margins of a violation, off-hook detection was changed significantly.
ADDITIONAL DATA AND INFORMATION: The specific commitments made in the System Test Entry meeting. The delivery dates for the audio functionality, the demo functionality, the modem, and the off-hook detection functionality.

2. No panic, crash, halt, wedge, unexpected process termination, or other stoppage of processing has occurred on any server software or hardware for the previous three (3) weeks.
STATUS: YELLOW. The servers have not crashed, but we did not complete all the tip-over and fail-over testing we planned, and so we are not satisfied that the servers are stable under peak load or other inclement conditions.
ADDITIONAL DATA AND INFORMATION: Metrics indicating the percentage completion of tip-over and fail-over tests. Details on which specific quality risks remain uncovered due to the tip-over and fail-over tests not yet run.
3. Production devices have been used for all System Test execution for at least three (3) weeks.
STATUS: GREEN. Except for the modem situation discussed above, the hardware has been stable.
ADDITIONAL DATA AND INFORMATION: None. Good news requires no explanation.

4. No client systems have become inoperable due to a failed update for at least three (3) weeks.
STATUS: YELLOW. No system has become permanently inoperable during update, but we have seen systems crash during update and these systems required a reboot to clear the error.
ADDITIONAL DATA AND INFORMATION: Details, from bug reports, on the system crashes described.

5. Server processes have been running without installation of bug fixes, manual intervention, or tuning of configuration files for two (2) weeks.
STATUS: RED. Server configurations have been altered by Change Committee–approved changes multiple times over the last two weeks.
ADDITIONAL DATA AND INFORMATION: List of tests that have been run prior to the last change, along with an assessment of the risk posed to each test by the change. (Note: Generating this list and the assessment could be a lot of work unless you have good traceability information.)

6. The Test Team has executed all the planned tests against the release-candidate hardware and software releases of the Device, Server, and Client.
STATUS: RED. We had planned to test Procurement and Fulfillment, but disengaged from this effort because the systems were not ready. Also, we have just received the release-candidate build; complete testing would take two weeks. In addition, the servers are undergoing Change Committee–approved changes every few days and a new load balancer has been added to the server farm. These server changes have prevented volume, tip-over, and fail-over testing for the last week and a half. Finally, we have never had a chance to test the server installation and boot processes because we never received documentation on how to perform these tasks.
ADDITIONAL DATA AND INFORMATION: List of Procurement and Fulfillment tests skipped, along with the risks associated with those tests. List of tests that will be skipped due to time compression of the last pass of testing against the release-candidate, along with the risks associated with those tests. List of changes to the server since the last volume, tip-over, and fail-over tests along with an assessment of reliability risks posed by the change. (Again, this could be a big job.) List of server install and boot process tests skipped, along with the risks associated with those tests.
    • 2.6 Evaluating Exit Criteria and Reporting 65 Fulfillment tests skipped, along with the risks associated with those tests. List of tests that will be skipped due to time compression of the last pass of testing against the release-candidate, along with the risks associated with those tests. List of changes to the server since the last volume, tip-over, and fail-over tests along with an assessment of reliability risks posed by the change. (Again, this could be a big job.) List of server install and boot pro- cess tests skipped, along with the risks associated with those tests.7. The Test Team has retested all priority one and two bug reports over the life of the project against the release-candidate hardware and software releases of the Device, Server, and Client. STATUS: RED. Testing of the release-candidate software and hardware has been schedule-limited to one week, which does not allow for retesting of all bugs. ADDITIONAL DATA AND INFORMATION: The list of all the priority one and two bug reports filed during the project, along with an assessment of the risk that those bugs might have re-entered the system in a change- related regression not otherwise caught by testing. (Again, potentially a huge job.)8. The Development Teams have resolved all “must-fix” bugs. “Must-fix” will be defined by the Project Management Team. STATUS: RED. Referring to the attached open/closed charts and the “Bugs Found Since 11/ 9” report, we continue to find new bugs in the product, though there is good news in that the find rate for priority one bugs has lev- eled off. Per the closure period charts, it takes on average about two weeks—three weeks for priority one bugs—to close a problem report. In addition, both open/close charts show a significant quality gap between cumulative open and cumulative closed, and it’s hard to believe that taken all together, a quantity of bugs that significant doesn’t indicate a pervasive fit-and-finish issue with the product. Finally, note that Web and E-com- merce problems are design issues—the selected browser is basically incom- patible with much of the Internet—which makes these problems much more worrisome. ADDITIONAL DATA AND INFORMATION: Open/closed charts, list of bugs since November 9, closure period charts, and a list of selected impor-
    • 66 2 Testing Processes tant sites that won’t work with the browser. (Note: The two charts men- tioned are covered in the Advanced Test Manager course, if you’re curious.) 9. The Test Team has checked that all issues in the bug tracking system are either closed or deferred and, where appropriate, verified by regression and confirmation testing. STATUS: RED. A large quality gap exists and has existed for months. Because of the limited test time against the release-candidate build, the risk of regression is significant. ADDITIONAL DATA AND INFORMATION: List of bug reports that are neither closed nor deferred, sorted by priority. Risk of tests that will not be run against the release-candidate software, along with the associated risks for each test. 10. The open/close curve indicates that we have achieved product stability and reliability. STATUS: RED. The priority-one curve has stabilized, but not the overall bug-find curve. In addition, the run chart of errors requiring a reboot shows that we are still showing about one crash per eight hours of system operation, which is no more stable than a typical Windows 95/Windows 98 laptop. (One of the ad claims is improved stability over a PC.) ADDITIONAL DATA AND INFORMATION: Open/closed chart (run for priority one defects only and again for all defects). Run chart of errors requiring a reboot; i.e., a trend chart that shows how many reboot-requiring crashes occurred each day. 11. The Project Management Team agrees that the product, as defined during the final cycle of System Test, will satisfy the customer’s reasonable expecta- tions of quality. STATUS: YELLOW. We have not really run enough of the test suite at this time to give a good assessment of overall product quality. ADDITIONAL DATA AND INFORMATION: List of all the tests not yet run against the release-candidate build, along with their associated risks. 12. The Project Management Team holds a System Test Exit Meeting and agrees that we have completed System Test. STATUS: In progress. ADDITIONAL DATA AND INFORMATION: None.
2.7 Test Closure Activities

Learning objectives
Recall of content only

The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 2 of the Advanced syllabus for general recall and familiarity only.

2.8 Sample Exam Questions

To end each chapter, you can try one or more sample exam questions to reinforce your knowledge and understanding of the material and to prepare for the ISTQB Advanced Level Technical Test Analyst exam.

1 Identify all of the following that can be useful as a test oracle the first time a test case is run.
A Incident report
B Requirements specification
C Test summary report
D Legacy system

2 Assume you are a technical test analyst working on a banking project to upgrade an existing automated teller machine system to allow customers to obtain cash advances from supported credit cards. During test design, you identify a discrepancy between the list of supported credit cards in the requirements specification and the design specification. This is an example of what?
A Test design as a static test technique
B A defect in the requirements specification
C A defect in the design specification
D Starting test design too early in the project
3 Which of the following is not always a precondition for test execution?
A A properly configured test environment
B A thoroughly specified test procedure
C A process for managing identified defects
D A test oracle

4 Assume you are a technical test analyst working on a banking project to upgrade an existing automated teller machine system to allow customers to obtain cash advances from supported credit cards. One of the exit criteria in the test plan requires documentation of successful cash advances of at least 500 euros for all supported credit cards. The correct list of supported credit cards is American Express, Visa, Japan Credit Bank, Eurocard, and MasterCard. After test execution, a complete list of cash advance test results shows the following:
■ American Express allowed advances of up to 1,000 euros.
■ Visa allowed advances of up to 500 euros.
■ Eurocard allowed advances of up to 1,000 euros.
■ MasterCard allowed advances of up to 500 euros.
Which of the following statements is true?
A. The exit criterion fails due to excessive advances for American Express and Eurocard.
B. The exit criterion fails due to a discrepancy between American Express and Eurocard on the one hand and Visa and MasterCard on the other hand.
C. The exit criterion passes because all supported cards allow cash advances of at least the minimum required amount.
D. The exit criterion fails because we cannot document Japan Credit Bank results.
3 Test Management

Wild Dog...said, "I will go up and see and look, and say; for I think it is good. Cat, come with me."
"Nenni!" said the Cat. "I am the Cat who walks by himself, and all places are alike to me. I will not come."
Rudyard Kipling, from his story "The Cat That Walked by Himself," a colorful illustration of the reasons for the phrase "herding cats," which is often applied to describe the task of managing software engineering projects.

The third chapter of the Advanced syllabus is concerned with test management. It discusses test management activities from the start to the end of the test process and introduces the consideration of risk for testing. There are 11 sections.
1. Introduction
2. Test Management Documentation
3. Test Plan Documentation Templates
4. Test Estimation
5. Scheduling and Test Planning
6. Test Progress Monitoring and Control
7. Business Value of Testing
8. Distributed, Outsourced, and Insourced Testing
9. Risk-Based Testing
10. Failure Mode and Effects Analysis
11. Test Management Issues
Let's look at the sections that pertain to technical test analysis.
    • 70 3 Test Management ISTQB Glossary test management: The planning, estimating, monitoring and control of test activities, typically carried out by a test manager. 3.1 Introduction Learning objectives Recall of content only This chapter, as the name indicates, is focused primarily on test management topics. Thus, it is mainly the purview of Advanced Test Manager exam candi- dates. Since this book is for technical test analysts, most of our coverage in this chapter is to support simple recall. However, there is one key area that, as a technical test analyst, you need to understand very well: risk-based testing. In this chapter, you’ll learn how to per- form risk analysis, to allocate test effort based on risk, and to sequence tests according to risk. These are the key tasks for a technical test analyst doing risk- based testing. This chapter in the Advanced syllabus also covers test documentation tem- plates for test managers. It focuses on the IEEE 829 standard. As a technical test analyst, you’ll need to know that standard as well. If you plan to take the Advanced Level Technical Test Analyst exam, remember that all material from the Foundation syllabus, including that related to the use of the IEEE 829 tem- plates and the test management material in chapter 5 of the Foundation sylla- bus, is examinable. 3.2 Test Management Documentation Learning objectives Recall of content only The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section.
    • 3.3 Test Plan Documentation Templates 71 ISTQB Glossary test policy: A high-level document describing the principles, approach, and major objectives of the organization regarding testing. test strategy: A high-level description of the test levels to be performed and the testing within those levels for an organization or program (one or more projects). test level: A group of test activities that are organized and managed together. A test level is linked to the responsibilities in a project. Examples of test levels are component test, integration test, system test, and acceptance test. level test plan: A test plan that typically addresses one test level. See also test plan. master test plan: A test plan that typically addresses multiple test levels. See also test plan. test plan: A document describing the scope, approach, resources, and schedule of intended test activities. It identifies amongst others test items, the features to be tested, the testing tasks, who will do each task, degree of tester independence, the test environment, the test design techniques and entry and exit criteria to be used (and the rationale for their choice), and any risks requiring contingency planning. It is a record of the test planning process.However, the Foundation syllabus covered test management, including thetopic of test management documentation. Anything covered in the Foundationsyllabus is examinable. In addition, all sections of the Advanced syllabus areexaminable in terms of general recall. So, if you are studying for the exam, you’ll want to read this section in chap-ter 3 of the Advanced syllabus for general recall and familiarity and review thetest management documentation material from the Foundation syllabus.3.3 Test Plan Documentation Templates Learning objectives Recall of content onlyThe concepts in this section apply primarily for test managers. There are nolearning objectives defined for technical test analysts in this section.
    • 72 3 Test Management ISTQB Glossary test estimation: The calculated approximation of a result related to various aspects of testing (e.g., effort spent, completion date, costs involved, number of test cases, etc.) which is usable even if input data may be incomplete, uncer- tain, or noisy. test schedule: A list of activities, tasks, or events of the test process, identify- ing their intended start and finish dates and/or times and interdependencies. test point analysis (TPA): A formula-based test estimation method based on function point analysis. Wideband Delphi: An expert-based test estimation technique that aims at making an accurate estimation using the collective wisdom of the team mem- bers. [Note: The glossary spells this as Wide Band Delphi, though the spelling given here is more commonly used.] Earlier, in chapter 2, we reviewed portions of the IEEE 829 test documentation standard. Specifically, we looked at the test design specification template, the test case specification template, the test procedure specification template, the test summary report template, and the test log template. However, the Founda- tion covered the entire IEEE 829 standard, including the test plan template, test item transmittal report template, and the incident report template. Anything covered in the Foundation syllabus is examinable. All sections of the Advanced syllabus are examinable in terms of general recall. If you are studying for the exam, you’ll want to read this section in chapter 3 of the Advanced syllabus for general recall and familiarity and review the test manage- ment documentation material from the Foundation syllabus. 3.4 Test Estimation Learning objectives Recall of content only The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section.
    • 3.5 Scheduling and Test Planning 73 However, chapter 5 of the Foundation syllabus covered test estimation aspart of the material on test management. Anything covered in the Foundationsyllabus is examinable. All sections of the Advanced syllabus are examinable interms of general recall. If you are studying for the exam, you’ll want to read thissection in chapter 3 of the Advanced syllabus for general recall and familiarityand review the test estimation material from the Foundation syllabus.3.5 Scheduling and Test Planning Learning objectives Recall of content onlyThe concepts in this section apply primarily for test managers. There are nolearning objectives defined for technical test analysts in this section. In thecourse of studying for the exam, read this section in chapter 3 of the Advancedsyllabus for general recall and familiarity only.3.6 Test Progress Monitoring and Control Learning objectives Recall of content onlyThe concepts in this section apply primarily for test managers. There are nolearning objectives defined for technical test analysts in this section. However, chapter 5 of the Foundation syllabus covers test progress moni-toring and control as part of the material on test management. Anything cov-ered in the Foundation syllabus is examinable. All sections of the Advancedsyllabus are examinable in terms of general recall. If you are studying for theexam, you’ll want to read this section in chapter 3 of the Advanced syllabus forgeneral recall and familiarity and review the test progress monitoring and con-trol material from the Foundation syllabus.
    • 74 3 Test Management ISTQB Glossary test monitoring: A test management task that deals with the activities related to periodically checking the status of a test project. Reports are prepared that compare the actuals to that which was planned. See also test management. 3.7 Business Value of Testing Learning objectives Recall of content only The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 3 of the Advanced syllabus for general recall and familiarity only. 3.8 Distributed, Outsourced, and Insourced Testing Learning objectives Recall of content only The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 3 of the Advanced syllabus for general recall and familiarity only.
3.9 Risk-Based Testing

Learning objectives
(K2) Outline the activities of a risk-based approach for planning and executing technical testing.
[Note: While the Advanced syllabus does not include K3 or K4 learning objectives for technical test analysts (unlike for test analysts) in this section, we feel that technical test analysts have an equal need for the ability to perform a risk analysis. Therefore, we will include exercises on this topic from a technical perspective.]

Risk is the possibility of a negative or undesirable outcome or event. A specific risk is any problem that may occur that would decrease customer, user, participant, or stakeholder perceptions of product quality or project success.

In testing, we're concerned with two main types of risks. The first type of risk is product or quality risk. When the primary impact of a potential problem is on product quality, such potential problems are called product risks. A synonym for product risks, which we use most frequently ourselves, is quality risks. An example of a quality risk is a possible reliability defect that could cause a system to crash during normal operation.

The second type of risk is project or planning risks. When the primary impact of a potential problem is on project success, such potential problems are called project risks. Some people also refer to project risks as planning risks. An example of a project risk is a possible staffing shortage that could delay completion of a project.

Of course, you can consider a quality risk as a special type of project risk. While the ISTQB definition of project risk is given here, Jamie likes the informal definition of a project risk as anything that might prevent the project from delivering the right product, on time and on budget. However, the difference is that you can run a test against the system or software to determine whether a quality risk has become an actual outcome. You can test for system crashes, for example. Other project risks are usually not testable. You can't test for a staffing shortage.
ISTQB Glossary
risk: A factor that could result in future negative consequences; usually expressed as impact and likelihood.
risk type: A set of risks grouped by one or more common factors such as a quality attribute, cause, location, or potential effect of risk. A specific set of product risk types is related to the type of testing that can mitigate (control) that risk type. For example, the risk of user interactions being misunderstood can be mitigated by usability testing.
risk category: see risk type. [Note: Included because this is also a common term, and because the process of risk analysis as described in the Advanced syllabus includes risk categorization.]
product risk: A risk directly related to the test object. See also risk. [Note: We will also use the commonly used synonym quality risk in this book.]
project risk: A risk related to management and control of the (test) project, e.g., lack of staffing, strict deadlines, changing requirements, etc. See also risk.
risk-based testing: An approach to testing to reduce the level of product risks and inform stakeholders of their status, starting in the initial stages of a project. It involves the identification of product risks and the use of risk levels to guide the test process.

Not all risks are equal in importance. There are a number of ways to classify the level of risk. The simplest is to look at two factors:
■ The likelihood of the problem occurring; i.e., being present in the product when it is delivered for testing
■ The impact of the problem should it occur; i.e., being present in the product when it is delivered to customers or users after testing

Note the distinction made here in terms of project timeline. Likelihood is assessed based on the likelihood of a problem existing in the software, not on the likelihood of it being encountered by the user. The likelihood of a user encountering the problem influences impact.

Likelihood of a problem arises primarily from technical considerations, such as the programming languages used, the bandwidth of connections, and so forth. The impact of a problem arises from business considerations, such as the financial loss the business will suffer from a problem, the number of users or customers affected by a problem, and so forth.

In risk-based testing, we use the risk items identified during risk analysis together with the level of risk associated with each risk item to guide us. In fact, under a true analytical risk-based testing strategy, risk is the primary basis of testing.

Risk can guide testing in various ways, but there are three very common ones:
■ First, during all test activities, test managers, technical test analysts, and test analysts allocate effort for each quality risk item proportionally to the level of risk (a small sketch of such an allocation follows below). Technical test analysts and test analysts select test techniques in a way that matches the rigor and extensiveness of the technique with the level of risk. Test managers, technical test analysts, and test analysts carry out test activities in risk order, addressing the most important quality risks first and only at the very end spending any time at all on less-important ones. Finally, test managers, technical test analysts, and test analysts work with the project team to ensure that the repair of defects is appropriate to the level of risk.
■ Second, during test planning and test control, test managers provide both mitigation and contingency responses for all significant, identified project risks. The higher the level of risk, the more thoroughly that project risk is managed.
■ Third, test managers, technical test analysts, and test analysts report test results and project status in terms of residual risks. For example, which tests have not yet been run or have been skipped? Which tests have been run? Which have passed? Which have failed? Which defects have not yet been fixed or retested? How do the tests and defects relate back to the risks?

When following a true analytical risk-based testing strategy, it's important that risk management not happen only once in a project. The three responses to risk we just covered—along with any others that might be needed—should occur throughout the lifecycle. Specifically, we should try to reduce quality risk by running tests and finding defects and reduce project risks through mitigation and, if necessary, contingency actions. Periodically in the project we should reevaluate risk and risk levels based on new information. This might result in our reprioritizing tests and defects, reallocating test effort, and taking other test control activities.
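To make the first of these responses concrete, here is a minimal sketch, in Python, of allocating a fixed test-effort budget across quality risk items in proportion to their level of risk. The risk items, weight scale, and hours are invented for illustration; they are not from the syllabus or any particular project.

```python
# A minimal sketch of allocating test effort in proportion to risk level.
# The risk items, weights, and total budget below are hypothetical.

RISK_WEIGHTS = {"very high": 5, "high": 4, "medium": 3, "low": 2, "very low": 1}

quality_risks = [
    ("Server crashes under peak load", "very high"),
    ("Slow screen-to-screen response", "high"),
    ("Report layout defects", "low"),
]

def allocate_effort(risks, total_hours):
    """Split total_hours across risk items, weighted by qualitative risk level."""
    total_weight = sum(RISK_WEIGHTS[level] for _, level in risks)
    return {
        item: round(total_hours * RISK_WEIGHTS[level] / total_weight, 1)
        for item, level in risks
    }

print(allocate_effort(quality_risks, total_hours=100))
# Roughly: crash risk gets 45.5 hours, response time 36.4, report layout 18.2
```

The same weights can drive sequencing: sorting the risk items by weight, highest first, gives the risk order in which to run the associated tests.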
    • 78 3 Test Management ISTQB Glossary test control: A test management task that deals with developing and apply- ing a set of corrective actions to get a test project on track when monitoring shows a deviation from what was planned. See also test management. risk level: The importance of a risk as defined by its characteristics, impact and likelihood. The level of risk can be used to determine the intensity of testing to be performed. A risk level can be expressed either qualitatively (e.g., high, medium, low) or quantitatively. One metaphor sometimes used to help people understand risk-based testing is that testing is a form of insurance. In your daily life, you buy insurance when you are worried about some potential risk. You don’t buy insurance for risks that you are not worried about. So, we should test the areas that are worrisome, test for bugs that are worrisome, and ignore the areas and bugs that aren’t worri- some. One potentially misleading aspect of this metaphor is that insurance profes- sionals and actuaries can use statistically valid data for quantitative risk analysis. Typically, risk-based testing relies on qualitative analyses because we don’t have the same kind of data insurance companies have. During risk-based testing, you have to remain aware of many possible sources of risks. There are safety risks for some systems. There are business and economic risks for most systems. There are privacy and data security risks for many systems. There are technical, organizational, and political risks too. 3.9.1 Risk Management Risk management includes three primary activities: ■ Risk identification, figuring out what the different project and quality risks are for the project ■ Risk analysis, assessing the level of risk—typically based on likelihood and impact—for each identified risk item ■ Risk mitigation, which is really more properly called “risk control” because it consists of mitigation, contingency, transference, and acceptance actions for various risks
    • 3.9 Risk-Based Testing 79 ISTQB Glossary risk management: Systematic application of procedures and practices to the tasks of identifying, analyzing, prioritizing, and controlling risk. risk identification: The process of identifying risks using techniques such as brainstorming, checklists, and failure history. risk analysis: The process of assessing identified risks to estimate their impact and probability of occurrence (likelihood). risk mitigation or risk control: The process through which decisions are reached and protective measures are implemented for reducing risks to, or maintaining risks within, specified levels.In some sense, these activities are sequential, at least in terms of when they start.They are staged such that risk identification starts first. Risk analysis comes next.Risk control starts once risk analysis has determined the level of risk. However,since risk management should be continuous in a project, the reality is that riskidentification, risk analysis, and risk control are all recurring activities. Everyone has their own perspective on how to manage risks on a project,including what the risks are, the level of risk, and the appropriate controls to putin place for risks. Therefore, risk management should include all project stake-holders. In many cases, though, not all stakeholders can participate or are willing todo so. In such cases, some stakeholders may act as surrogates for other stake-holders. For example, in mass-market software development, the marketingteam might ask a small sample of potential customers to help identify potentialdefects that would affect their use of the software most heavily. In this case, thesample of potential customers serves as a surrogate for the entire eventual cus-tomer base. As another example, business analysts on IT projects can some-times represent the users rather than involving users in potentially distressingrisk analysis sessions that include conversations about what could go wrong andhow bad it would be. Technical test analysts bring particular expertise to risk management due totheir defect-focused outlook, especially as relates to technically based sources ofrisk and likelihood. So they should participate whenever possible. In fact, in
    • 80 3 Test Management many cases, the test manager will lead the quality risk analysis effort, with tech- nical test analysts providing key support in the process. With that overview of risk management in place, let’s look at the three risk management activities more closely. 3.9.2 Risk Identification For proper risk-based testing, we need to identify both product and project risks. We can identify both kinds of risks using techniques like these: ■ Expert interviews ■ Independent assessments ■ Use of risk templates ■ Project retrospectives ■ Risk workshops and brainstorming ■ Checklists ■ Calling on past experience Conceivably, you can use a single integrated process to identify both project and product risks. We usually separate them into two separate processes since they have two separate deliverables. We include the project risk identification pro- cess in the test planning process and thus hand the bulk of the responsibility for these kinds of risks to managers, including test managers. In parallel, the quality risk identification process occurs early in the project. That said, project risks—and not just for testing but also for the project as a whole—are often identified as by-products of quality risk analysis. In addition, if you use a requirements specification, design specification, use cases, or other documentation as inputs into your quality risk analysis process, you should expect to find defects in those documents as another set of by-products. These are valuable by-products, which you should plan to capture and escalate to the proper person. Previously, we encouraged you to include representatives of all possible stakeholder groups in the risk management process. For the risk identification activities, the broadest range of stakeholders will yield the most complete, accu- rate, and precise risk identification. The more stakeholder group representatives you omit from the process, the more you will miss risk items and even whole risk categories.
ISTQB Glossary
Failure Mode and Effect Analysis (FMEA): A systematic approach to risk identification and analysis of possible modes of failure and attempting to prevent their occurrence. See also Failure Mode, Effect and Criticality Analysis (FMECA).
Failure Mode, Effect and Criticality Analysis (FMECA): An extension of FMEA. In addition to the basic FMEA, it includes a criticality analysis, which is used to chart the probability of failure modes against the severity of their consequences. The result highlights failure modes with relatively high probability and severity of consequences, allowing remedial effort to be directed where it will produce the greatest value. See also Failure Mode and Effect Analysis (FMEA).

How far should you take this process? Well, it depends on the technique. In the relatively informal Pragmatic Risk Analysis and Management technique that we frequently use, risk identification stops at the risk items. You have to be specific enough about the risk items to allow for analysis and assessment of each risk item to yield an unambiguous likelihood rating and an unambiguous impact rating.

Techniques that are more formal often look downstream to identify potential effects of the risk item if it were to become an actual negative outcome. These effects include effects on the system—or the system of systems if applicable—as well as effects on the potential users, customers, stakeholders, and even society in general. Failure Mode and Effect Analysis is an example of such a formal risk management technique, and it is commonly used on safety-critical and embedded systems.

Other formal techniques look upstream to identify the source of the risk. Hazard Analysis is an example of such a formal risk management technique. We've never used it ourselves, but Rex has talked to clients who have used it for safety-critical medical systems.1

1. For more information on risk identification and analysis techniques, you can see either of Rex Black's two books, Managing the Testing Process, 3e (for both formal and informal techniques) or Pragmatic Software Testing (for informal techniques). If you want to use Failure Mode and Effect Analysis, then we recommend reading D.H. Stamatis's Failure Mode and Effect Analysis for a thorough discussion of the technique, followed by Managing the Testing Process, 3e for a discussion of how the technique applies to software testing.
    • 82 3 Test Management 3.9.3 Risk Analysis or Risk Assessment The next step in the risk management process is referred to in the Advanced syl- labus as risk analysis. We prefer to call it risk assessment because analysis would seem to us to include both identification and assessment of risk. For example, the process of identifying risk items often includes analysis of work products such as requirements and metrics such as defects found in past projects. Regardless of what we call it, risk analysis or risk assessment involves the study of the identified risks. We typically want to categorize each risk item appropri- ately and assign each risk item an appropriate level of risk. We can use ISO 9126 or other quality categories to organize the risk items. In our opinion—and in the Pragmatic Risk Analysis and Management process described here—it doesn’t matter so much what category a risk item goes into, usually, so long as we don’t forget it. However, in complex projects and for large organizations, the category of risk can determine who has to deal with the risk. A practical implication of categorization like this will make the categorization important. The other part of risk assessment is determining the level of risk. This often involves likelihood and impact as the two key factors. Likelihood arises from technical considerations, typically, while impact arises from business consider- ations. So what technical factors should we consider when assessing likelihood? Here’s a list to get you started: ■ Complexity of technology and teams ■ Personnel and training issues ■ Intrateam and interteam conflict ■ Supplier and vendor contractual problems ■ Geographical distribution of the development organization, as with outsourcing ■ Legacy or established designs and technologies versus new technologies and designs ■ The quality—or lack of quality—in the tools and technology used ■ Bad managerial or technical leadership ■ Time, resource, and management pressure, especially when financial penalties apply
■ Lack of earlier testing and quality assurance tasks in the lifecycle
■ High rates of requirements, design, and code changes in the project
■ High defect rates
■ Complex interfacing and integration issues

What business factors should we consider when assessing impact? Here's a list to get you started:
■ The frequency of use of the affected feature
■ Potential damage to image
■ Loss of customers and business
■ Potential financial, ecological, or social losses or liability
■ Civil or criminal legal sanctions
■ Loss of licenses, permits, and the like
■ The lack of reasonable workarounds

When determining the level of risk, we can try to work quantitatively or qualitatively. In quantitative risk analysis, we have numerical ratings for both likelihood and impact. Likelihood is a percentage, and impact is often a monetary quantity. If we multiply the two values together, we can calculate the cost of exposure, which is called—in the insurance business—the expected payout or expected loss (a toy example follows at the end of this section).

While perhaps some day in the future of software engineering we can do this routinely, typically we find that we have to determine the level of risk qualitatively. Why? Because we don't have statistically valid data on which to perform quantitative quality risk analysis. So we can speak of likelihood being very high, high, medium, low, or very low, but we can't say—at least, not in any meaningful way—whether the likelihood is 90 percent, 75 percent, 50 percent, 25 percent, or 10 percent.

This is not to say—by any means—that a qualitative approach should be seen as inferior or useless. In fact, given the data most of us have to work with, use of a quantitative approach is almost certainly inappropriate on most projects. The illusory precision of such techniques misleads the stakeholders about the extent to which you actually understand and can manage risk. What we've found is that if we accept the limits of our data and apply appropriate qualitative quality risk management approaches, the results are not only perfectly useful but indeed essential to a well-managed test process.
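To see why the quantitative approach demands better data than most of us have, consider a toy cost-of-exposure calculation. The probabilities and dollar figures below are invented; unless numbers like these rest on statistically valid historical data, the precision they imply is exactly the illusion warned about above.

```python
# A toy illustration of quantitative risk analysis: cost of exposure
# (the insurer's "expected payout") = likelihood x monetary impact.
# All figures are invented for illustration.

risk_items = [
    # (risk item, likelihood as a probability, impact in dollars)
    ("Data corruption during nightly batch", 0.10, 500_000),
    ("Slow response at peak load",           0.50,  40_000),
    ("Cosmetic defects in reports",          0.75,   2_000),
]

for item, likelihood, impact in risk_items:
    exposure = likelihood * impact  # expected loss for this risk item
    print(f"{item}: expected loss ${exposure:,.0f}")
```

The arithmetic is trivial; the hard part is defending the 0.10 or the $500,000. Lacking that defense, the qualitative scales (very high through very low) described above are usually the more honest choice.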
In any case, unless your risk analysis is based on extensive and statistically valid risk data, it will reflect perceived likelihood and impact. In other words, personal perceptions and opinions held by the stakeholders will determine the level of risk. Again, there's absolutely nothing wrong with this, and we don't bring this up to condemn the technique at all. The key point is that project managers, programmers, users, business analysts, architects, and testers typically have different perceptions and thus possibly different opinions on the level of risk for each risk item. By including all these perceptions, we distill the collective wisdom of the team.

However, we do have a strong possibility of disagreements between stakeholders. The risk analysis process should include some way of reaching consensus. In the worst case, if we cannot obtain consensus, we should be able to escalate the disagreement to some level of management to resolve. Otherwise, risk levels will be ambiguous and conflicted and thus not useful as a guide for risk mitigation activities—including testing.

3.9.4 Risk Mitigation or Risk Control

Having identified and assessed risks, we now must control them. As we mentioned earlier, the Advanced syllabus refers to this as risk mitigation, but that's not right. Risk control is a better term. We actually have four main options for risk control:
■ Mitigation, where we take preventive measures to reduce the likelihood of the risk occurring and/or the impact of a risk should it occur
■ Contingency, where we have a plan or perhaps multiple plans to reduce the impact of a risk should it occur
■ Transference, where we get another party to accept the consequences of a risk should it occur
■ Finally, ignoring or accepting the risk and its consequences should it occur

For any given risk item, selecting one or more of these options creates its own set of benefits and opportunities as well as costs and, potentially, additional associated risks.

Analytical risk-based testing is focused on creating risk mitigation opportunities for the test team, including for technical test analysts, especially for quality risks. Risk-based testing mitigates quality risks via testing throughout the entire lifecycle.

Let us mention that, in some cases, there are standards that can apply. We've already looked at one such standard, the United States Federal Aviation Administration's DO-178B. We'll look at another one of those standards shortly.

It's important too that project risks be controlled. For technical test analysts, we're particularly concerned with test-affecting project risks like the following:
■ Test environment and tools readiness
■ Test staff availability and qualification
■ Low quality of inputs to testing
■ Overly high rates of change for work products delivered to testing
■ Lack of standards, rules, and techniques for the testing effort

While it's usually the test manager's job to make sure these risks are controlled, the lack of adequate controls in these areas will affect the technical test analyst.

One idea discussed in the Foundation syllabus, a basic principle of testing, is the principle of early testing and quality assurance (QA). This principle stresses the preventive potential of testing. Preventive testing is part of analytical risk-based testing. We should try to mitigate risk before test execution starts. This can entail early preparation of testware, pretesting test environments, pretesting early versions of the product well before a test level starts, insisting on tougher entry criteria to testing, ensuring requirements for and designing for testability, participating in reviews (including retrospectives for earlier project activities), participating in problem and change management, and monitoring of the project progress and quality.

In preventive testing, we take quality risk control actions throughout the lifecycle. Technical test analysts should look for opportunities to control risk using various techniques:
■ Choosing an appropriate test design technique
■ Reviews and inspections
■ Reviews of test design
■ An appropriate level of independence for the various levels of testing
■ The use of the most experienced person on test tasks
■ The strategies chosen for confirmation testing (retesting) and regression testing
    • 86 3 Test Management Preventive test strategies acknowledge that quality risks can and should be miti- gated by a broad range of activities, many of them not what we traditionally think of as testing. For example, if the requirements are not well written, per- haps we should institute reviews to improve their quality rather than relying on tests that we will run once the badly written requirements become a bad design and ultimately bad, buggy code. Of course, testing is not effective against all kinds of quality risks. In some cases, you can estimate the risk reduction effectiveness of testing in general and for specific test techniques for given risk items. There’s not much point in using testing to reduce risk where there is a low level of test effectiveness. For exam- ple, code maintainability issues related to poor commenting or use of unstruc- tured programming techniques will not tend to show up—at least, not initially—during testing. Once we get to test execution, we run tests to mitigate quality risks. Where testing finds defects, testers reduce risk by providing the awareness of defects and opportunities to deal with them before release. Where testing does not find defects, testing reduces risk by ensuring that under certain conditions the sys- tem operates correctly. Of course, running a test only demonstrates operation under certain conditions and does not constitute a proof of correctness under all possible conditions. We mentioned earlier that we use level of risk to prioritize tests in a risk- based strategy. This can work in a variety of ways, with two extremes, referred to as depth-first and breadth-first. In a depth-first approach, all of the highest- risk tests are run before any lower-risk tests, and tests are run in strict risk order. In a breadth-first approach, we select a sample of tests across all the identified risks using the level of risk to weight the selection while at the same time ensur- ing coverage of every risk at least once. As we run tests, we should measure and report our results in terms of resid- ual risk. The higher the test coverage in an area, the lower the residual risk. The fewer bugs we’ve found in an area, the lower the residual risk.2 Of course, in doing risk-based testing, if we only test based on our risk analysis, this can leave blind spots, so we need to use testing outside the predefined test procedures to 2. You can find examples of how to carry out risk-based test reporting in Rex Black’s book Critical Testing Processes and in the companion volume to this book, Advanced Software Testing: Volume 2, which addresses test management.
see if we have missed anything. We'll talk about how to accomplish such testing, using blended testing strategies, in chapter 4.

If, during test execution, we need to reduce the time or effort spent on testing, we can use risk as a guide. If the residual risk is acceptable, we can curtail our tests. Notice that, in general, those tests not yet run are less important than those tests already run. If we do curtail further testing, that property of risk-based test execution serves to transfer the remaining risk onto the users, customers, help desk and technical support personnel, or operational staff.

Suppose we do have time to continue test execution? In this case, we can adjust our risk analysis—and thus our testing—for further test cycles based on what we've learned from our current testing. First, we revise our risk analysis. Then, we reprioritize existing tests and possibly add new tests. What should we look for to decide whether to adjust our risk analysis? We can start with the following main factors:
■ Totally new or very much changed product risks
■ Unstable or defect-prone areas discovered during the testing
■ Risks, especially regression risk, associated with fixed defects
■ Discovery of unexpected bug clusters
■ Discovery of business-critical areas that were missed

So, if you have time for new additional test cycles, consider revising your quality risk analysis first. You should also update the quality risk analysis at each project milestone.

3.9.5 An Example of Risk Identification and Assessment Results

In figure 3-1, you see an example of a quality risk analysis document. It is a case study from an actual project. This document—and the approach we used—followed the Failure Mode and Effect Analysis (FMEA) approach.
Figure 3–1 An example of a quality risk analysis document using FMEA

As you can see, we start—at the left side of the table—with a specific function and then identify failure modes and their possible effects. Criticality is determined based on the effects, as is the severity and priority. Possible causes are listed to enable bug prevention work during requirements, design, and implementation.

Next, we look at detection methods—those methods we expect to be applied for this project. The more likely the failure mode is to escape detection, the worse the detection number. We calculate a risk priority number based on the severity, priority, and detection numbers. Smaller numbers are worse. Severity, priority, and detection each range from 1 to 5, so the risk priority number ranges from 1 to 125.

This particular table shows the highest-level risk items only because Rex sorted it by risk priority number. For these risk items, we'd expect a lot of additional detection and other recommended risk control actions. You can see that we have assigned some additional actions at this point but have not yet assigned the owners.

During testing actions associated with a risk item, we'd expect that the number of test cases, the amount of test data, and the degree of test coverage would all increase as the risk increases. Notice that we can allow any test procedures that cover a risk item to inherit the level of risk from the risk item. That documents the priority of the test procedure, based on the level of risk.

3.9.6 Risk-Based Testing throughout the Lifecycle

We've mentioned that, properly done, risk-based testing manages risk throughout the lifecycle. Let's look at how that happens, based on our usual approach to a test project.

During test planning, risk management comes first. We perform a quality risk analysis early in the project, ideally once the first draft of a requirements specification is available. From that quality risk analysis, we build an estimate for negotiation with and, we hope, approval by project stakeholders and management.

Once the project stakeholders and management agree on the estimate, we create a plan for the testing. The plan assigns testing effort and sequences tests based on the level of quality risk. It also plans for project risks that could affect testing.

During test control, we will periodically adjust the risk analysis throughout the project. That can lead to adding, increasing, or decreasing test coverage; removing, delaying, or accelerating the priority of tests; and other such activities.

During test analysis and design, we work with the test team to allocate test development and execution effort based on risk. Because we want to report test results in terms of risk, we maintain traceability to the quality risks.

During implementation and execution, we sequence the procedures based on the level of risk. We ensure that the test team, including the test analysts and technical test analysts, uses exploratory testing and other reactive techniques to detect unanticipated high-risk areas.

During the evaluation of exit criteria and reporting, we work with our test team to measure test coverage against risk. When reporting test results (and thus release readiness), we talk not only in terms of test cases run and bugs found, but also in terms of residual risk.
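As a rough illustration of the traceability just described, the following sketch shows test procedures inheriting priority from the risk items they cover and a simple residual-risk view for reporting. It uses the FMEA-style convention from section 3.9.5, where smaller risk priority numbers are worse; all identifiers and data are hypothetical.

```python
# A minimal sketch of risk-based traceability and residual-risk reporting.
# Each test procedure inherits the level of the risk item(s) it covers.
# Smaller RPN = worse risk, following the FMEA convention in section 3.9.5.

risks = {
    "R1": {"name": "Server crash under load", "rpn": 2},
    "R2": {"name": "Slow loan approval", "rpn": 8},
    "R3": {"name": "Archive ties up workstation", "rpn": 12},
}

tests = [
    {"id": "T1", "covers": ["R1"], "status": "failed"},
    {"id": "T2", "covers": ["R2"], "status": "passed"},
    {"id": "T3", "covers": ["R3"], "status": "not run"},
]

def inherited_priority(test):
    """A test inherits the worst (here: lowest) RPN among the risks it covers."""
    return min(risks[r]["rpn"] for r in test["covers"])

# Sequence execution in risk order: worst inherited RPN first.
for test in sorted(tests, key=inherited_priority):
    print(test["id"], "inherited RPN:", inherited_priority(test))

# Residual-risk view for reporting: a risk with no passed test remains open.
for rid, risk in risks.items():
    covered = any(t["status"] == "passed" and rid in t["covers"] for t in tests)
    print(f"{risk['name']}: {'mitigated' if covered else 'residual risk'}")
```

With traceability captured this way, the exit-criteria report can state which risks remain open and why, rather than only counting tests run and bugs found.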
3.9.7 Risk-Aware Testing Standards

As you saw in chapter 2, the United States Federal Aviation Administration's DO-178B standard bases the extent of testing—measured in terms of white-box code coverage—on the potential impact of a failure. That makes DO-178B a risk-aware testing standard.

Another interesting example of how risk management, including quality risk management, plays into the engineering of complex and/or safety-critical systems is found in the ISO/IEC standard 61508, which is mentioned in the Advanced syllabus. This standard applies to embedded software that controls systems with safety-related implications, as you can tell from its title, "Functional safety of electrical/electronic/programmable electronic safety-related systems."

The standard is very much focused on risks. Risk analysis is required. It considers two primary factors for determining the level of risk: likelihood and impact. During a project, we are to reduce the residual level of risk to a tolerable level, specifically through the application of electrical, electronic, or software improvements to the system.

The standard has an inherent philosophy about risk. It acknowledges that we can't attain a level of zero risk—whether for an entire system or even for a single risk item. It says that we have to build quality, especially safety, in from the beginning, not try to add it at the end. Thus we must take defect-preventing actions like requirements, design, and code reviews.

The standard also insists that we know what constitutes tolerable and intolerable risks and that we take steps to reduce intolerable risks. When those steps are testing steps, we must document them, including a software safety validation plan, a software test specification, software test results, software safety validation, a verification report, and a software functional safety report.

The standard addresses the author-bias problem. As discussed in the Foundation syllabus, this is the problem with self-testing, the fact that you bring the same blind spots and bad assumptions to testing your own work that you brought to creating that work. So the standard calls for tester independence, indeed insisting on it for those performing any safety-related tests. And since testing is most effective when the system is written to be testable, that's also a requirement.
The standard has a concept of a safety integrity level (or SIL), which is based on the likelihood of failure for a particular component or subsystem. The safety integrity level influences a number of risk-related decisions, including the choice of testing and QA techniques.

Some of the techniques are ones covered in the companion volume for test analysis (volume I), which addresses various functional and black-box testing design techniques. Many of the techniques are ones that we'll cover in this book, including probabilistic testing, dynamic analysis, data recording and analysis, performance testing, interface testing, static analysis, and complexity metrics. Additionally, since thorough coverage, including during regression testing, is important in reducing the likelihood of missed bugs, the standard mandates the use of applicable automated test tools, which we'll also cover here in this book.

Again, depending on the safety integrity level, the standard might require various levels of testing. These levels include module testing, integration testing, hardware-software integration testing, safety requirements testing, and system testing. If a level of testing is required, the standard states that it should be documented and independently verified. In other words, the standard can require auditing or outside reviews of testing activities. In addition, continuing on with the theme of "guarding the guards," the standard also requires reviews for test cases, test procedures, and test results along with verification of data integrity under test conditions.

The standard requires the use of structural test design techniques. Structural coverage requirements are implied, again based on the safety integrity level. (This is similar to DO-178B.) Because the desire is to have high confidence in the safety-critical aspects of the system, the standard requires complete requirements coverage not once, but multiple times, at multiple levels of testing. Again, the level of test coverage required depends on the safety integrity level.

Now, this might seem a bit excessive, especially if you come from a very informal world. However, the next time you step between two pieces of metal that can move—e.g., elevator doors—ask yourself how much risk you want to remain in the software that controls that movement.
3.9.8 Risk-Based Testing Exercise 1

Read the HELLOCARMS system requirements document in Appendix B, a hypothetical project derived from a real project that RBCS helped to test. If you're working in a classroom, break into groups of three to five. If you are working alone, you'll need to do this yourself.

Perform a quality risk analysis for this project. Since this is a book focused on non-functional testing, identify and assess risks for efficiency quality characteristics only. Use the templates shown in figure 3-2 and table 3-1 on the following page.

To keep the time spent reasonable, I suggest 30 to 45 minutes identifying quality risks, then 15 to 30 minutes altogether assessing the level of each risk. If you are working in a classroom, once each group has finished its analysis, discuss the results.

[Figure 3–2 shows the informal quality risk analysis template: a table with the columns Quality Risk, Likelihood, Impact, Risk Pri. #, Extent of Testing, and Tracing, with rows grouped under risk categories (Risk Category 1; Risk 1, Risk 2, ... Risk n). Its annotations read:
■ Quality risks are potential system problems that could reduce user satisfaction. A hierarchy of risk categories can help organize the list and jog your memory.
■ Likelihood of the problem arises from implementation and technical considerations; impact arises from business, operational, and regulatory considerations. Each is rated from 1 (very high) to 5 (very low).
■ The risk priority number is an aggregate measure of problem risk: the product of technical and business risk (likelihood times impact), ranging from 1 to 25.
■ The risk priority number determines the extent of testing: 1-5 = Extensive, 6-10 = Broad, 11-15 = Cursory, 16-20 = Opportunity, 21-25 = Report bugs.
■ The Tracing column holds tracing information back to requirements, design, or other risk bases.]

Figure 3–2 Annotated template for informal quality risk analysis
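If it helps to see the annotations in figure 3-2 as executable logic before attempting the exercise, here is a minimal Python sketch of the scheme: the risk priority number is likelihood times impact, and the resulting bands select the extent of testing. The function name is ours, not the book's.

```python
# A sketch of the informal scheme annotated in figure 3-2: likelihood and
# impact are each rated 1 (very high) to 5 (very low); their product is the
# risk priority number (1-25), which selects the extent of testing.

def extent_of_testing(likelihood: int, impact: int) -> str:
    rpn = likelihood * impact
    if rpn <= 5:
        return "Extensive"
    if rpn <= 10:
        return "Broad"
    if rpn <= 15:
        return "Cursory"
    if rpn <= 20:
        return "Opportunity"
    return "Report bugs"

# Example matching row 1.1.001 of the solution that follows:
# likelihood 4, impact 2 -> RPN 8 -> Broad.
print(extent_of_testing(4, 2))  # Broad
```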
Table 3–1 Functional quality risk analysis template showing quality risk categories to address

No. | Quality Risk | Likelihood | Impact | Risk Pri. # | Extent of Testing | Tracing
1.1.000 | Efficiency: Time behavior | | | | |
1.1.001 | [Non-functional risks related to time behavior go in this section.] | | | | |
1.1.002 | | | | | |
1.1.003 | | | | | |
1.1.004 | | | | | |
1.1.005 | | | | | |
1.2.000 | Efficiency: Resource utilization | | | | |
1.2.001 | [Non-functional risks related to resource utilization go in this section.] | | | | |
1.2.002 | | | | | |
1.2.003 | | | | | |

3.9.9 Risk-Based Testing Exercise Debrief 1

You can see our solution to the exercise starting in table 3-2. Immediately after that table are two lists of by-products. One is the list of project risks discovered during the analysis. The other is the list of requirements document defects discovered during the analysis.

As a first pass for this quality risk analysis, Jamie went through each efficiency requirement and identified one or more risk items for it. Jamie assumed that the priority of the requirement was a good surrogate for the impact of the risk, so he used that. Even using all of these shortcuts, it took Jamie about an hour to get through this.

Table 3–2 Functional quality risk analysis for HELLOCARMS

No. | Quality Risk | Likelihood | Impact | Risk Pri. # | Extent of Testing | Tracing
1.1.000 | Efficiency: Time behavior | | | | |
1.1.001 | Screen-to-screen response is continually slow for home equity loans | 4 | 2 | 8 | Broad | 040-010-010
1.1.002 | Screen-to-screen response is continually slow for home equity lines of credit | 4 | 3 | 12 | Cursory | 040-010-010
1.1.003 | Screen-to-screen response is continually slow for reverse mortgages | 4 | 4 | 16 | Opportunity | 040-010-010
(Table 3–2 continued)
No. | Quality Risk | Likelihood | Impact | Risk Pri. # | Extent of Testing | Tracing
1.1.004 | Approval or decline of loan continually slower than required for all applications | 4 | 2 | 8 | Broad | 040-010-020
1.1.005 | Approval or decline of high-value loans slower than required | 3 | 2 | 6 | Broad | 040-010-020
1.1.006 | Loans regularly fail to be processed within one hour | 3 | 3 | 9 | Broad | 040-010-030
1.1.007 | Time overhead on scoring mainframe does not meet requirements when running at 2,000 applications per hour | 3 | 4 | 12 | Cursory | 040-010-040
1.1.008 | Time overhead on scoring mainframe does not meet requirements when running at 4,000 applications per hour | 2 | 4 | 8 | Extensive | 040-010-040
1.1.009 | Credit-worthiness of customers is not determined in a timely manner, especially when working at high rates of throughput | 3 | 2 | 6 | Broad | 040-010-050
1.1.010 | Fewer than rated Credit Bureau Mainframe requests are completed within the required timeframe at high rates of throughput | 2 | 2 | 4 | Extensive | 040-010-050
1.1.011 | At high rates of throughput, archiving application is slow, tying up Telephone Banker workstation appreciably | 4 | 2 | 8 | Broad | 040-010-060
1.1.012 | Escalation to Senior Banker or return to standard Telephone Banker takes longer than rated time when working at high throughput | 4 | 3 | 12 | Cursory | 040-010-070, 040-010-080
1.1.013 | Conditional confirmation of acceptance takes appreciably more time at high throughput (2,000 per hour) | 3 | 2 | 6 | Broad | 040-010-090
1.1.014 | Conditional confirmation of acceptance takes appreciably more time at high throughput (4,000 per hour) | 2 | 2 | 4 | Extensive | 040-010-090
1.1.015 | Abort function does not reset the system to the starting point on the banker's workstation in preparation for the next customer in a timely manner | 5 | 4 | 20 | Report bugs | 040-010-100
1.1.016 | Unable to support Internet operations in a timely manner | 5 | 4 | 20 | Opportunity | 010-010-170
1.1.017 | Unable to sustain a 2,000 application per hour throughput | 2 | 2 | 4 | Extensive | 040-010-110
1.1.018 | Unable to sustain a 4,000 application per hour throughput | 3 | 3 | 9 | Broad | 040-010-120
1.1.019 | Unable to support 4,000 concurrent applications | 3 | 4 | 12 | Cursory | 040-010-130
1.1.020 | Unable to sustain a rate of throughput that would facilitate 1.2 million applications in the first year | 3 | 2 | 6 | Broad | 040-010-140
1.1.021 | Turnaround time is prohibitive on transactions (including network transmission time) | 4 | 4 | 16 | Opportunity | 040-010-150
1.2.000 | Efficiency: Resource utilization
1.2.001 | Database server unable to handle the rated load | 4 | 3 | 12 | Cursory | 040-020-010
1.2.002 | Web server unable to handle the rated load | 3 | 3 | 9 | Broad | 040-020-020
1.2.003 | App server unable to handle the rated load | 2 | 3 | 6 | Broad | 040-020-030

3.9.10 Project Risk By-Products

In the course of preparing the quality risk analysis document, Jamie observed the following project risk inherent in the requirements:

What is the service level agreement (SLA) contract with the Credit Bureau Mainframe? Many efficiency measurements will be dependent on fast turnaround there.

3.9.11 Requirements Defect By-Products

In the course of preparing the quality risk analysis document, Jamie observed the following defect in the requirements.
1. Assumption appears to be that a certain percentage of applications will be approved or conditionally approved (see 040-010-140, 040-010-150, and 040-010-160). I see no proof that this is a logical necessity.

3.9.12 Risk-Based Testing Exercise 2

Using the quality risk analysis for HELLOCARMS, outline a set of non-functional test suites to run for HELLOCARMS system integration and system testing.

Specifically, the system integration test level should have, as an overall objective, looking for defects in and building confidence around the ability of the HELLOCARMS application to work with the other applications in the datacenter efficiently. The system test level should have, as an overall objective, looking for defects in and building confidence around the ability of the HELLOCARMS application to provide the necessary end-to-end capabilities efficiently.

Again, if you are working in a classroom, once each group has finished its work on the test suites and guidelines, discuss the results.

3.9.13 Risk-Based Testing Exercise Debrief 2

Based on the quality risk analysis Jamie performed, he created the following lists of system integration test suites and system test suites.

System Integration Test Suites
■ HELLOCARMS/Scoring Mainframe Interfaces
■ HELLOCARMS/LoDoPS Interfaces
■ HELLOCARMS/GLADS Interfaces
■ HELLOCARMS/GloboRainBQW Interfaces

System Test Suites
■ Time behavior / Home equity loans
■ Time behavior / Home equity lines of credit
■ Time behavior / Reverse mortgages
■ Resource utilization / Mix of all three
ISTQB Glossary

session-based test management: A method for measuring and managing session-based testing, e.g., exploratory testing.

3.9.14 Test Case Sequencing Guidelines

For system integration testing, we are going to create a suite for each of the different systems that interact with HELLOCARMS. The suite will consist of all the transactions that are possible between the systems.

For system testing, the three separate time behavior suites may prove to be overkill. It is unclear to us at this time in the software development lifecycle how much commonality there may be between the three different products. By the time we get ready to test, we may merge those suites to test all loan products together. For resource utilization, we definitely will mix the different products together to get a realistic test set.

The question of sequencing the tests should be addressed. In performance testing, prioritization must include not only the risk analysis, but also the availability of data, functionality, tools, and resources. Often the resources are problematic because we will be including network, database, and application experts, and perhaps others.

3.10 Failure Mode and Effects Analysis

Learning objectives
Recall of content only

The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 3 of the Advanced syllabus for general recall and familiarity only.
3.11 Test Management Issues

Learning objectives
Recall of content only

The concepts in this section apply primarily for test managers. There are no learning objectives defined for technical test analysts in this section. In the course of studying for the exam, read this section in chapter 3 of the Advanced syllabus for general recall and familiarity only.

3.12 Sample Exam Questions

To end each chapter, you can try one or more sample exam questions to reinforce your knowledge and understanding of the material and to prepare for the ISTQB Advanced Level Technical Test Analyst exam. The questions in this section illustrate what is called a scenario question.

Scenario

Assume you are testing a computer-controlled braking system for an automobile. This system includes the possibility of remote activation to initiate a gradual braking followed by disabling of the motor upon a full stop if the owner or the police report that the automobile is stolen or otherwise being illegally operated. Project documents and the product marketing collateral refer to this feature as emergency vehicle control override. The marketing team is heavily promoting this feature in advertisements as a major innovation for an automobile at this price.

Consider the following two statements:

I. Testing will cover the possibility of the failure of the emergency vehicle control override feature to engage properly and also the possibility of the emergency vehicle control override engaging without proper authorization.

II. The reliability tests will include sending a large volume of invalid commands to the emergency vehicle control override system. Ten percent of these invalid commands will consist of deliberately engineered
invalid commands that cover all invalid equivalence partitions and/or boundary values that apply to command fields; 10 percent of these invalid commands will consist of deliberately engineered invalid commands that cover all pairs of command sequences, both valid and invalid; the remaining invalid commands will be random corruptions of valid commands rendered invalid by the failure to match the checksum.

1. If the project follows the IEEE 829 documentation standard, which of the following statements about the IEEE 829 templates could be correct?

A I belongs in the Test Design Specification, while II belongs in the Test Plan
B I belongs in the Test Plan, while II belongs in the Test Design Specification
C I belongs in the Test Case Specification, while II belongs in the Test Plan
D I belongs in the Test Design Specification, while II belongs in the Test Item Transmittal Report

2. If the project is following a risk-based testing strategy, which of the following is a quality risk item that would result in the kind of testing specified in statements I and II above?

A The emergency vehicle control override system fails to accept valid commands.
B The emergency vehicle control override system is too difficult to install.
C The emergency vehicle control override system accepts invalid commands.
D The emergency vehicle control override system alerts unauthorized drivers.

3. Assume that each quality risk item is assessed for likelihood and impact to determine the extent of testing to be performed. Consider only the information in the scenario, in questions 1 and 2 above, and in your answers to
those questions. Which of the following statements is supported by this information?

A The likelihood and impact are both high.
B The likelihood is high.
C The impact is high.
D No conclusion can be reached about likelihood or impact.
4 Test Techniques

"Let's just say I was testing the bounds of reality. I was curious to see what would happen. That's all it was: curiosity."

Jim Morrison of the Doors, an early boundary analyst

The fourth chapter of the Advanced syllabus is concerned with test techniques. To make the material manageable, the syllabus uses a taxonomy—a hierarchical classification scheme—to divide the material into sections and subsections. Conveniently for us, it uses the same taxonomy of test techniques given in the Foundation syllabus, namely specification-based, structure-based, and experience-based, adding additional techniques and further explanation in each category. In addition, the Advanced syllabus also discusses both static and dynamic analysis.1

This chapter contains six sections.

1. Introduction
2. Specification-Based
3. Structure-Based
4. Defect- and Experience-Based
5. Static Analysis
6. Dynamic Analysis

We'll look at each section and how it relates to test analysis.

1. You might notice a slight change from the organization of the Foundation syllabus here. The Foundation syllabus covers static analysis in the material on static techniques in chapter 3. The Advanced syllabus covers reviews (one form of static techniques) in chapter 6. The Advanced syllabus covers static analysis (another form of static techniques) and dynamic analysis in chapter 4 and, in terms of tools, in chapter 9.
4.1 Introduction

Learning objectives
Recall of content only

In this chapter and the next two chapters, we cover a number of test design techniques. Let's start by putting some structure around these techniques, and then we'll tell you which ones are in scope for the test analyst and which are in scope for the technical test analyst.

[Figure 4–1 shows testing divided into static and dynamic techniques. Static techniques break down into review and static analysis; dynamic techniques break down into black-box (with functional and non-functional subtypes), white-box, experience-based, defect-based, and dynamic analysis. Boxes labeled ATA and ATTA below each technique indicate whether the technique is covered by the test analyst volume, the technical test analyst volume, or both.]

Figure 4–1 A taxonomy for Advanced syllabus test techniques

Figure 4-1 shows the highest-level breakdown of testing techniques as the distinction between static and dynamic tests. Static tests, you'll remember from the Foundation syllabus, are those that do not involve running (or executing) the test object itself, while dynamic tests are those that do involve running the test object.

Static tests are broken down into reviews and static analysis. A review is any method where the human being is the primary defect finder and scrutinizer of the item under test, while static analysis relies on a tool as the primary defect finder and scrutinizer.

Dynamic tests are broken down into five main types in the Advanced level:

■ Black-box (also referred to as specification-based or behavioral), where we test based on the way the system is supposed to work. Black-box tests can
further be broken down into two main subtypes, functional and non-functional, following the ISO 9126 standard. The easy way to distinguish between functional and non-functional is that functional is testing what the system does and non-functional is testing how or how well the system does what it does.

■ White-box (also referred to as structural), where we test based on the way the system is built.

■ Experience-based, where we test based on our skills and intuition, along with our experience with similar applications or technologies.

■ Defect-based, where we use our understanding of the type of defect targeted by a test as the basis for test design, with tests derived systematically from what is known about the defect.

■ Dynamic analysis, where we analyze an application while it is running, usually via some kind of instrumentation in the code.

There's always a temptation, when presented with a taxonomy—a hierarchical classification scheme—like this one, to try to put things into neat, exclusive categories. If you're familiar with biology, you'll remember that every living being, from herpes virus to trees to chimpanzees, has a unique Latin name based on where in the big taxonomy of life it fits. However, when dealing with testing, remember that these techniques are complementary. You should use whichever and as many as are appropriate for any given test activity, whatever level of testing you are doing.

Now, in figure 4-1, below each bubble that shows the lowest-level breakdown of test techniques, you see one or two boxes. Boxes with ATA in them represent test techniques covered by the book Advanced Software Testing Vol. 1. Boxes with ATTA in them represent test techniques covered by this book. The fact that most techniques are covered by both books does not, however, mean that the coverage is the same.

You should also keep in mind that this assignment of types of techniques into particular roles is based on common usage, which might not correspond to your own organization. You might have the title of test analyst and be responsible for dynamic analysis. You might have the title of technical test analyst and not be responsible for static analysis at all.
Okay, so that gives you an idea of where we're headed, basically for the rest of this chapter. Figure 4-1 gives an overview of the heart of this book, probably 65 percent of the material covered. In this chapter, we cover dynamic testing, except for non-functional tests, which are covered in the next chapter. A subsequent chapter covers static testing. You might want to make a copy of figure 4-1 and have it available as you go through this chapter and the next two, as a way of orienting yourself to where we are in the book.

4.2 Specification-Based

Learning objectives

(K2) List examples of typical defects to be identified by each specific specification-based technique.

(K3) Write test cases from a given software model in real life using the following test design techniques (the tests shall achieve a given model coverage):
– Equivalence partitioning
– Boundary value analysis
– Decision tables
– State transition testing

(K4) Analyze a system, or its requirement specification, in order to determine which specification-based techniques to apply for specific objectives, and outline a test specification based on IEEE 829, focusing on component and non-functional test cases and test procedures.

Let's start with a broad overview of specification-based tests before we dive into the details of each technique.

In specification-based testing, we are going to derive and select tests by analyzing the test basis. Remember that the test basis is—or, more accurately, the test bases are—the documents that tell us, directly or indirectly, how the component or system under test should and shouldn't behave, what it is required to do, and how it is required to do it. These are the documents we can base the tests on. Hence the name, test basis. Note also that, while the ISTQB glossary
ISTQB Glossary

black-box testing: Testing, either functional or non-functional, without reference to the internal structure of the component or system.

specification: A document that specifies, ideally in a complete, precise and verifiable manner, the requirements, design, behavior, or other characteristics of a component or system and, often, the procedures for determining whether these provisions have been satisfied.

specification-based technique (or black-box test design technique): Procedure to derive and/or select test cases based on an analysis of the specification, either functional or non-functional, of a component or system without reference to its internal structure.

test basis: All documents from which the requirements of a component or system can be inferred. The documentation on which the test cases are based. If a document can be amended only by way of formal amendment procedure, then the test basis is called a frozen test basis.

defines the test basis to consist of documents, it is not uncommon to base tests on conversations with users, business analysts, architects, and other stakeholders or on reviews of competitors' systems.

It's probably worth a quick compare-and-contrast with the term test oracle, which is similar and related but not the same as the test basis. The test oracle is anything we can use to determine expected results that we can compare with the actual results of the component or system under test. Anything that can serve as a test basis can also be a test oracle, of course. However, an oracle can also be an existing system, either a legacy system being replaced or a competitor's system. An oracle can also be someone's specialized knowledge. An oracle should not be the code because otherwise we are only testing whether the compiler works.

For structural tests—which are covered in the next section—the internal structure of the system is the test basis but not the test oracle. However, for specification-based tests, we do not consider the internal structure at all—at least theoretically.

Beyond being focused on behavior rather than structure, what's common in specification-based test techniques? Well, for one thing, there is some model, whether formal or informal. The model can be a graph or a table. For each
ISTQB Glossary

test oracle: A source to determine expected results to compare with the actual result of the software under test. An oracle may be the existing system (for a benchmark), other software, a user manual, or an individual's specialized knowledge, but it should not be the code.

model, there is some systematic way to derive or create tests using it. And, typically, each technique has a set of coverage criteria that tell you, in effect, when the model has run out of interesting and useful test ideas. It's important to remember that fulfilling coverage criteria for a particular test design technique does not mean that your tests are in any way complete or exhaustive. Instead, it means that the model has run out of useful tests to suggest based on that technique.

There is also typically some family of defects that the technique is particularly good at finding. Boris Beizer, in his books on test design, referred to this as the "bug hypothesis". He meant that, if you hypothesize that a particular kind of bug is likely to exist, you could then select the technique based on that hypothesis. This provides an interesting linkage with the concept of defect-based testing, which we'll cover in a later section of this chapter.2

2. Boris Beizer's books on this topic would include Black-Box Testing (perhaps most pertinent to test analysts), Software System Testing and Quality Assurance (good for test analysts, technical test analysts, and test managers), and Software Test Techniques (perhaps most pertinent to technical test analysts).

Often, specification-based tests are requirements based. Requirements specifications often describe behavior, especially functional behavior. (The tendency to underspecify non-functional requirements creates a set of quality-related problems that we'll not address at this point.) So, we can use the description of the system behavior in the requirements to create the models. We can then derive tests from the models.
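Before moving on, let's make the test oracle idea concrete. The following C sketch is our own illustration; both function names are hypothetical. It uses a trusted legacy calculation as the oracle for the implementation under test:

    #include <math.h>
    #include <stdio.h>

    /* Hypothetical functions for illustration: legacy_tax() stands in for
       the trusted legacy system serving as the oracle; new_tax() is the
       implementation under test. */
    static double legacy_tax(double amount) { return amount * 0.08; }
    static double new_tax(double amount)    { return amount * 0.08; }

    int main(void)
    {
        double inputs[] = { 0.0, 0.01, 100.0, 99999.99 };
        int failures = 0;
        for (int i = 0; i < 4; i++) {
            double expected = legacy_tax(inputs[i]);  /* ask the oracle      */
            double actual   = new_tax(inputs[i]);     /* run the test object */
            if (fabs(expected - actual) > 1e-9) {     /* tolerance compare   */
                printf("FAIL: input %.2f expected %f got %f\n",
                       inputs[i], expected, actual);
                failures++;
            }
        }
        printf("%d failure(s)\n", failures);
        return failures;
    }

Note that the oracle here is a separate, trusted implementation, not the code under test itself; comparing the code with itself would, as noted above, only test the compiler.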
ISTQB Glossary

requirements-based testing: An approach to testing in which test cases are designed based on test objectives and test conditions derived from requirements, e.g., tests that exercise specific functions or probe non-functional attributes such as reliability or usability.

4.2.1 Equivalence Partitioning

We start with the most basic of specification-based test design techniques, equivalence partitioning. Conceptually, equivalence partitioning involves testing various groups that are expected to be handled the same way by the system and to exhibit similar behavior.

The underlying model is a graphical or mathematical one that identifies equivalent classes—which are also called equivalence partitions—of inputs, outputs, internal values, time relationships, calculations, or just about anything else of interest. These classes or partitions are called equivalent because they are likely to be handled the same way by the system. Some of the classes can be called valid equivalence classes because they describe valid situations that the system should handle normally. Other classes can be called invalid equivalence classes because they describe invalid situations that the system should reject or at least escalate to the user for correction or exception handling.

Once we've identified the equivalence classes, we can derive tests from them. Usually, we are working with more than one set of equivalence classes at one time; for example, each input field on a screen has its own set of valid and invalid equivalence classes. So, we can create one set of valid tests by selecting one valid member from each equivalence partition. We continue this process until each valid class for each equivalence partition is represented in at least one valid test.

Next, we can create negative tests (using invalid data). In most applications, for each equivalence partition, we select one invalid class for one equivalence partition and a valid class for every other equivalence partition. This rule—don't combine multiple invalid equivalent classes in a single test—prevents us from running into a situation where the presence of one invalid value might mask the incorrect handling of another invalid value. We continue this process until each invalid class for each equivalence partition is represented in at least one invalid test.
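As a minimal sketch of how a partition model drives test selection (our own illustration, not from the syllabus), consider an integer field whose valid range is 1 to 100. The classification function below encodes the three partitions, and one representative per partition gives minimal equivalence partitioning coverage:

    #include <stdio.h>

    /* Partition model for an integer field with valid range 1..100:
       EP1 (invalid): value < 1
       EP2 (valid):   1 <= value <= 100
       EP3 (invalid): value > 100 */
    static const char *partition(int value)
    {
        if (value < 1)   return "EP1 (invalid: below range)";
        if (value > 100) return "EP3 (invalid: above range)";
        return "EP2 (valid: in range)";
    }

    int main(void)
    {
        /* One representative per partition covers each class once. */
        int representatives[] = { -5, 50, 250 };
        for (int i = 0; i < 3; i++)
            printf("%4d -> %s\n", representatives[i],
                   partition(representatives[i]));
        return 0;
    }

Which representative you pick from each partition is, in pure equivalence partitioning, up to you; we will refine that choice when we get to boundary value analysis.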
ISTQB Glossary

equivalence partition: A portion of an input or output domain for which the behavior of a component or system is assumed to be the same, based on the specification.

equivalence partitioning: A black-box test design technique in which test cases are designed to execute representatives from equivalence partitions. In principle, test cases are designed to cover each partition at least once.

However, some applications (particularly web-based applications) are now designed so that, when processing an input screen, they evaluate all of the inputs before responding to the user. With these screens, multiple invalid values are bundled up and returned to the user all at the same time, often by highlighting erroneous values or the controls that contain incorrect values. Careful selection of data is still required, even though multiple invalid values may be tested concurrently if we can check that the input validation code acts independently on each field.

Notice the coverage criterion implicit in the preceding discussion. Every class member, both valid and invalid, is represented in at least one test case.

What is our bug hypothesis with this technique? For the most part, we are looking for a situation where some equivalence class is handled improperly. That could mean the value is accepted when it should have been rejected or vice versa, or that a value is properly accepted or rejected but handled in a way appropriate to another equivalence class, not the class to which it actually belongs.

You might find this concept a bit confusing verbally, so let's try some figures. Figure 4-2 shows a way that we can visualize equivalence partitioning.

As you can see in the top half of the figure 4-2, we start with some set of interest. This set of interest can be an input field, an output field, a test precondition or postcondition, a configuration, or just about anything we're interested in testing. The key is that we can apply the operation of equivalence partitioning to split the set into two or more disjoint subsets, where all the members of each subset share some trait in common that was not shared with the members of the other subset.
[Figure 4–2 shows, in its top half, a set being split by equivalence partitioning into subset A and subset B, and, in its bottom half, one test case value being selected from each subset.]

Figure 4–2 Visualizing equivalence partitioning

For example, if you have a simple drawing program that can fill figures with red, green, or blue, you can split the set of fill colors into three disjoint sets: red, green, and blue.

In the bottom half of the figure 4-2, we see the selection of test case values from the subsets. The dots in the subsets represent the value chosen from each subset to be represented in the test case. This involves selecting at least one member from each subset. In pure equivalence partitioning, the logic behind the specific selection is outside the scope of the technique. In other words, you can select any member of the subset you please. If you're thinking, "Some members are better than others," that's fine; hold that thought for a few minutes and we'll come back to it.

Now, at this point we'll generate the rest of the test case. If the set that we partitioned was an input field, we might refer to the requirements specification to understand how each subset is supposed to be handled. If the set that we partitioned was an output field, we might refer to the requirements to derive inputs
that should cause that output to occur. We might use other test techniques to design the rest of the test cases.

[Figure 4–3 shows equivalence partitioning applied iteratively: a set is split into subsets A and B, then subset A is further split into subsets A1, A2, and A3, with test case values selected from B, A1, A2, and A3.]

Figure 4–3 Subpartitioning an equivalence class

Figure 4-3 shows that equivalence partitioning can be applied iteratively. In this figure, we apply a second round of equivalence partitioning to one of the subsets to generate three smaller subsets. Only at that point do we select four members—one from subset B and one each from subset A1, A2, and A3—for test cases. Note that we don't have to select a member from subset A since each of the members from subsets A1, A2, and A3 are also members of subset A.

4.2.1.1 Avoiding Equivalence Partitioning Errors

While equivalence partitioning is fairly straightforward, people do make some common errors when applying it. Let's look at these errors so you can avoid them.

First, as shown in the top half of figure 4-4, the subsets must be disjoint. That is, no two of the subsets can have one or more members in common. The whole point of equivalence partitioning is to test whether a system handles different situations differently (and properly, of course). If it's ambiguous as to which handling is proper, then how do we define a test case around this? Try it out and see what happens? Not much of a test!
[Figure 4–4 shows two common equivalence partitioning errors: in the top half, a set is split into subsets A and B that overlap; in the bottom half, a set is split so that subset B is empty.]

Figure 4–4 Common equivalence partitioning errors

Second, as shown in the bottom half of figure 4-4, none of the subsets may be empty. That is, if the equivalence partitioning operation produces a subset with no members, that's hardly very useful for testing. We can't select a member of that subset because it has no members.

Third, while not shown graphically—in part because we couldn't figure out a clear way to draw the picture—note that the equivalence partitioning process does not subtract, it divides. What we mean by this is that, in terms of mathematical set theory, the union of the subsets produced by equivalence partitioning must be the same as the original set that was equivalence partitioned. In other words, equivalence partitioning does not generate "spare" subsets that are somehow disposed of in the process—at least, not if we do it properly.

Notice that this is important because, if this is not true, then we stand the chance of failing to test some important subset of inputs, outputs, configurations, or some other factor of interest that somehow was dropped in the test design process.

4.2.1.2 Composing Test Cases with Equivalence Partitioning

When we compose test cases in situations where we've performed equivalence partitioning on more than one set, we select from each subset as shown in figure 4-5.
[Figure 4–5 shows set X partitioned into subsets X1 and X2 and set Y partitioned into subsets Y1, Y2, and Y3, with the selected values composed into three test cases, TC1 through TC3.]

Figure 4–5 Composing test cases with valid values

Here, we start with set X and set Y. We partition set X into two subsets, X1 and X2. We partition set Y into three subsets, Y1, Y2, and Y3. We select test case values from each of the five subsets, X1 and X2 on the one hand, Y1, Y2, and Y3 on the other. We then compose three test cases since we can combine the values from the X subsets with values from the Y subsets (assuming the values are independent and all valid).

For example, imagine you are testing a browser-based application. You are interested in two browsers: Internet Explorer and Firefox. You are interested in three connection types: dial-up, DSL, and cable modem. Since the browser and the connection types are independent, we can create three test cases. In each of the test cases, one of the connection types and one of the browser types will be represented. One of the browser types will be represented twice.

When discussing figure 4-5, we made a brief comment that we can combine values across the equivalence partitions when the values are independent and all valid. Of course, that's not always the case.

In some cases, values are not independent, in that the selection of one value from one subset constrains the choice in another subset. For example, imagine if we are trying to test combinations of applications and operating systems. We can't test an application running on an operating system if there is not a version of that application available for that operating system.

In some cases, values are not valid. For example, in figure 4-6, imagine that we are testing a project management application, something like Microsoft Project. Suppose that set X is the type of event we're dealing with, which can be
either a task (X1) or a milestone (X2). Suppose that set Y is the start date of the event, which can be in the past (Y1), today (Y2), or in the future (Y3). Suppose that set Z is the end date of the event, which can be either on or after the start date (Z1) or before the start date (Z2). Of course, Z2 is invalid, since no event can have a negative duration.

[Figure 4–6 shows sets X, Y, and Z partitioned as just described, with the selected values composed into four test cases, TC1 through TC4; TC4 includes the invalid subset Z2.]

Figure 4–6 Composing tests with invalid values

So, we test combinations of tasks and milestones with past, present, and future start dates and valid end dates in test cases TC1, TC2, and TC3. In TC4, we check that illegal end dates are correctly rejected. We try to enter a task with a start date in the past and an end date prior to the start date. If we wanted to subpartition the invalid situation, we could also test with a start date in the present and one in the future together with an end date before the start date, which would add two more test cases. One would test a start date in the present and an end date prior to the start date. The other would test a start date in the future and an end date prior to the start date.

In this particular example, we had a single subset for just one of the sets that was invalid. The more general case is that many—perhaps all—of the fields will have invalid subsets. Imagine testing an e-commerce application. On the checkout screens of the typical e-commerce application, there are multiple required fields, and usually there is at least one way for such a field to be invalid.

When there are multiple invalid values, there are two possible ways of testing them, depending on how the error handling is designed in the application.
Historically, most error handling is done by parsing through the values and immediately stopping when the first error is found. Some newer applications parse through the entire set of values first, and then compile a list of errors found.

Whichever error handler is used, we can always run test cases with a single invalid value entered. That way, we can check that every invalid value is correctly rejected or otherwise handled by the system. Of course, the number of test cases required by this method will equal the number of invalid partitions. When we have an error handler that aggregates all of the errors in one pass, we could test it by submitting multiple invalid values and thence reduce the total number of test cases.

In a safety-critical or mission-critical system, you might want to test combinations of invalid values even if the parser stops on the first error. Just remember, anytime you start down the trail of combinatorial testing, you are taking a chance that you'll spend a lot of time testing things that aren't terribly important.

Consider a simple example of equivalence partitioning for system configuration. On the Internet appliance project we've mentioned, there were four possible configurations for the appliances. They could be configured for kids, teens, adults, or seniors. This configuration value was stored in a database on the Internet service provider's server so that, when an Internet appliance is connected to the Internet, this configuration value became a property of its connection. Based on this configuration value, there were two key areas of difference in the expected behavior.

For one thing, for the kids and teens systems, there was a filtering function enabled. This determined the allowed and disallowed websites the system could surf to. The setting was most strict for kids and somewhat less strict for teens. Adults and seniors, of course, were to have no filtering at all and should be able to surf anywhere.

For another thing, each of the four configurations had a default set of e-commerce sites they could visit called the mall. These sites were selected by the marketing team and were meant to be age appropriate.

Of course, these were the expected differences. We were also aware of the fact that there could be weird unexpected differences that could arise because
that's the nature of some types of bugs. For example, performance was supposed to be the same, but it's possible that performance problems with the filtering software could introduce perceptible response-time issues with the kids and teens systems. We had to watch for those kinds of misbehaviors.

So, to test for the unexpected differences, we simply had at least one of each configuration in the test lab at all times and spread the nonconfiguration-specific tests more or less randomly across the different configurations. To test expected differences related to filtering and e-commerce, we made sure these configuration-specific tests were run on the correct configuration. The challenge here was that, while the expected results were concrete for the mall—the marketing people gave us four sets of specific sites, one for each configuration—the expected results for the filtering were not concrete but rather logical. This led to an enormous number of disturbing defect reports during test execution as we found creative ways to sneak around the filters and access age-inappropriate materials on the kids and teens configurations.

4.2.1.3 Equivalence Partitioning Exercise

A screen prototype for one screen of the HELLOCARMS system is shown in figure 4-7. This screen asks for three pieces of information:

■ The product being applied for, which is one of the following:
– Home equity loan
– Home equity line of credit
– Reverse mortgage
■ Whether someone has an existing Globobank checking account, which is either Yes or No
■ Whether someone has an existing Globobank savings account, which is either Yes or No

If the user indicates an existing Globobank account, then the user must enter the corresponding account number. This number is validated against the bank's central database upon entry. If the user indicates no such account, the user must leave the corresponding account number field blank.

If the fields are valid, including the account number fields, then the screen will be accepted. If one or more fields are invalid, an error message is displayed.
[Figure 4–7 shows the prototype screen with three fields: "Apply for Product?" with a pull-down list offering home equity loan, home equity line of credit, and reverse mortgage; "Existing Checking?" with a Yes/No selection and an account number entry if Yes; and "Existing Savings?" with a Yes/No selection and an account number entry if Yes.]

Figure 4–7 HELLOCARMS system product screen prototype

The exercise consists of two parts:

1. Show the equivalence partitions for each of the three pieces of information, indicating valid and invalid members.
2. Create test cases to cover these partitions, keeping in mind the rules about combinations of valid and invalid members.

The answers to the two parts are shown on the next pages. You should review the answer to the first part (and, if necessary, revise your answer to the second part) before reviewing the answer to the second part.

4.2.1.4 Equivalence Partitioning Exercise Debrief

First, let's take a look at the equivalence partitions. For the application-product field, the equivalence partitions are as follows:

Table 4–1

# | Partition
1 | Home equity loan
2 | Home equity line of credit
3 | Reverse mortgage
Note that the screen prototype shows this information as selected from a pull-down list, so there is no possibility of entering an invalid product here. Some people do include an invalid product test, but we have not.

For each of the two existing-account entries, the situation is best modeled as a single input field, which consists of two subfields. The first subfield is the Yes/No field. This subfield determines the rule for checking the second subfield, which is the account number. If the first subfield is Yes, the second subfield must be a valid account number. If the first subfield is No, the second subfield must be blank.

So, the existing checking account information partitions are as follows:

Table 4–2

# | Partition
1 | Yes-Valid
2 | Yes-Invalid
3 | No-Blank
4 | No-Nonblank

And, the existing savings account information partitions are as follows:

Table 4–3

# | Partition
1 | Yes-Valid
2 | Yes-Invalid
3 | No-Blank
4 | No-Nonblank

Note that, for both of these, partitions 2 and 4 are invalid partitions, while partitions 1 and 3 are valid partitions.

Just as a side note, we could simplify the interface to eliminate some error handling code, testing, and possible end-user errors. This simplification could be done by designing the interface such that selecting No for either existing checking or savings accounts automatically deleted any value in the respective account number text field and then hid that field. In both table 4-2 and table 4-3, this would have the effect of eliminating partition 4. The user would not be allowed to make the mistakes, therefore the code would not have to handle the exceptions, and therefore the tester would not need to test
them. A win-win-win situation. Often, clever design of the interface can increase the testability of an application in this way.

Now, let's create tests from these equivalence partitions. As we do so, we're going to capture traceability information from the test case number back to the partitions. Once we have a trace from each partition to a test case, we're done—provided that we're careful to follow the rules about combining valid and invalid partitions!

Table 4–4

Inputs             | 1     | 2     | 3     | 4       | 5        | 6       | 7
Product            | HEL   | LOC   | RM    | HEL     | LOC      | RM      | HEL
Existing Checking? | Yes   | No    | No    | Yes     | No       | No      | No
Checking Account   | Valid | Blank | Blank | Invalid | Nonblank | Blank   | Blank
Existing Savings?  | No    | Yes   | No    | No      | No       | Yes     | No
Savings Account    | Blank | Valid | Blank | Blank   | Blank    | Invalid | Nonblank
Outputs            |       |       |       |         |          |         |
Accept?            | Yes   | Yes   | Yes   | No      | No       | No      | No
Error?             | No    | No    | No    | Yes     | Yes      | Yes     | Yes

Table 4–5

# | Partition | Test Case
1 | Home equity loan (HEL) | 1
2 | Home equity line of credit (LOC) | 2
3 | Reverse mortgage (RM) | 3

Table 4–6

# | Partition | Test Case
1 | Yes-Valid | 1
2 | Yes-Invalid | 4
3 | No-Blank | 2
4 | No-Nonblank | 5

Table 4–7

# | Partition | Test Case
1 | Yes-Valid | 2
2 | Yes-Invalid | 6
3 | No-Blank | 1
4 | No-Nonblank | 7
ISTQB Glossary

boundary value: An input value or output value which is on the edge of an equivalence partition or at the smallest incremental distance on either side of an edge, such as, for example, the minimum or maximum value of a range.

boundary value analysis: A black-box test design technique in which test cases are designed based on boundary values.

You should notice that these test cases do not cover all interesting possible combinations of factors here. For example, we don't test to make sure that a person with both a valid savings account and a valid checking account is handled properly. That could be an interesting test because the accounts might have been established at different times and might have information that now conflicts in some way; e.g., in some countries it is still relatively common for a woman to take her husband's last name upon marriage. We also don't test the combination of invalid accounts or the combination of account numbers that are valid alone but not valid together; e.g., the two accounts belong to entirely different people.

4.2.2 Boundary Value Analysis

Let's refine our equivalence partitioning test design technique with the next technique, boundary value analysis. Conceptually, boundary value analysis is predominately about testing the edges of equivalence classes. In other words, instead of selecting one member of the class, we select the largest and smallest members of the class and test them. We will discuss some other options for boundary analysis later.

The underlying model is again either a graphical or mathematical one that identifies two boundary values at the boundary between one equivalence class and another. (In some techniques, the model identifies three boundary values, which we'll discuss later.) Whether such a boundary exists for subsets where we've performed equivalence partitioning is another question that we'll get to in just a moment. Right now, assuming the boundary does exist, notice that the boundary values are special members of the equivalence classes that happen to be right next to each other and right next to the point where the expected behavior of the system changes. If the boundary values are members of a valid
equivalence class, they are valid, of course, but if members of an invalid equivalence class, they are invalid.

Deriving tests with boundary values as the equivalence class members is much the same as for plain equivalence classes. We test valid boundary values together and then combine one invalid boundary value with other valid boundary values (unless the application being tested aggregates errors). We have to represent each boundary value in a test case, analogous to the equivalence partitioning situation. In other words, the coverage criterion is that every boundary value, both valid and invalid, must be represented in at least one test. The main difference is that there are at least two boundary values in each equivalence class. Therefore, we'll have more test cases, about twice as many.

More test cases? That's not something we like unless there's a good reason. What is the point of these extra test cases? That is revealed by the bug hypothesis for boundary value analysis. Since we are testing equivalence class members—every boundary value is an equivalence class member—we are testing for situations where some equivalence class is handled improperly. Again, that could mean acceptance of values that should be rejected, or vice versa, or proper acceptance or rejection but improper handling subsequently, as if the value were in another equivalence class, not the class to which it actually belongs. However, by testing boundary values, we also test whether the boundary between equivalence classes is defined in the right place.

So, do all equivalence classes have boundary values? No, definitely not. Boundary value analysis is an extension of equivalence partitioning that applies only when the members of an equivalence class are ordered.

So, what does that mean? Well, an ordered set is one where we can say that one member is greater than or less than some other member if those two members are not the same. We have to be able to say this meaningfully too. Just because some item is right above or below some other item on a pull-down menu does not mean that, within the program, the two items have a greater-than/less-than relationship.

4.2.2.1 Examples of Equivalence Partitioning and Boundary Values

Let's look at two examples where a technical test analyst can (and can't) use boundary value analysis on equivalence classes: enumerations and arrays.
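Before turning to those examples, here is a minimal sketch (our own illustration) of how boundary value analysis extends the earlier range example. For an integer field valid from 1 to 100, two-value boundary analysis selects the values on each side of both edges:

    #include <stdio.h>

    int main(void)
    {
        /* Two-value boundary analysis for a valid range of 1..100:
           the expected verdicts come straight from the partition model. */
        struct { int value; const char *expected; } bva_tests[] = {
            {   0, "reject" },  /* largest member of the invalid low partition   */
            {   1, "accept" },  /* smallest member of the valid partition        */
            { 100, "accept" },  /* largest member of the valid partition         */
            { 101, "reject" }   /* smallest member of the invalid high partition */
        };
        for (int i = 0; i < 4; i++)
            printf("input %3d -> expect %s\n",
                   bva_tests[i].value, bva_tests[i].expected);
        return 0;
    }

Where plain equivalence partitioning needed three tests for this field, boundary analysis yields four, and the two extra tests check that the edges of the valid range are defined in the right place.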
Many languages allow a programmer to define a set of named values that can be used as a substitute for constant values. These are called enumerations.

As an example, assume that a database stored colors as integers to save space. Red equals 0, blue equals 1, green equals 2. When the programmer wanted to set an object to red, he would have to use the value 0.

These arbitrary values are often called magic numbers because they have a (somewhat) hidden meaning, often known only by the programmer. A programming language that allowed enumeration could allow the programmer to create a mapping of the magic numbers to a textual value, allowing the programmer to write code that is self-documenting.

Instead of writing ColorGun = 0;, the programmer can write ColorGun = Red. In figure 4-8, an example enumeration is made in the C language, defining the five different colors.

    enum Colors
    {
        Red,
        Blue,
        Green,
        Yellow,
        Cyan
    };

Figure 4–8

In some languages, the programmer is given special functions to allow easier manipulation of the enumerated values. For example, Pascal has the functions pred and succ to traverse the values of an enumeration. Of course, trying to "pred" the first or "succ" the last is going to fail, sometimes with unknown symptoms. Hence boundaries that can be tested: both valid and invalid.

The compiler usually manages the access to the values in the enumeration. We should be able to assume (we hope) that if we can access the first and last elements of an enumeration, we should be able to access them all. If the language supports functions that walk the enumerated list, then we should test both valid and invalid boundaries using the list.

In some languages (e.g., C), the order of enumeration may be set by the compiler but the rules used are specific to each compiler and therefore not consistent for testing. Knowing the rules for the programming language and compiler being used is essential.
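To see why enumeration boundaries deserve tests, consider the following C sketch (our own illustration; the color_name() lookup is hypothetical). C performs no range checking when an integer is converted to an enumerated type, so a value one past the last enumerator arrives silently and must be caught, or not, by the code under test:

    #include <stdio.h>

    enum Colors { Red, Blue, Green, Yellow, Cyan };

    /* Hypothetical lookup standing in for the application under test. */
    static const char *color_name(enum Colors c)
    {
        static const char *names[] = { "Red", "Blue", "Green", "Yellow", "Cyan" };
        /* Defensive boundary check; without it, Cyan + 1 would index
           past the end of the names array. */
        if (c < Red || c > Cyan)
            return "OUT OF RANGE";
        return names[c];
    }

    int main(void)
    {
        printf("%s\n", color_name(Red));                     /* valid low boundary  */
        printf("%s\n", color_name(Cyan));                    /* valid high boundary */
        printf("%s\n", color_name((enum Colors)(Cyan + 1))); /* invalid boundary    */
        return 0;
    }

The two valid boundary tests exercise the first and last enumerators, much like pred-ing the first or succ-ing the last in Pascal; the invalid test probes what happens just beyond the end of the list.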
Arrays are data structures allowing a programmer to store homogenous data and access it via index rather than by name. The data must be homogenous, which means of a single type. In most languages, arrays can consist of elements of programmer-defined types. A defined type may contain any number of different internal elements (including a mix of different data types) but each element in the array must be of that type, whether that type is built in or programmer defined.

Some programming languages allow only static arrays,3 which are never allowed to change in number of elements. Other programming languages allow arrays to be dynamically created (memory for the array is allocated off the heap at runtime). These types of arrays can often be reallocated to allow resizing them on the fly. That can make testing them interesting. Even more interesting is testing what happens after an array is resized. Then, we get issues as to whether pointers to the old array are still accessible, whether data get moved correctly (or lost if the array is resized downward).

3. Those created and sized at compile time

For languages that do bounds checking of arrays, where an attempt to access an item before or after the array is automatically prevented, testing of boundaries may be somewhat less important. However, other issues may be problematic. For example, if the system throws an exception while resizing an array, what should happen?

Many languages, however, are more permissive when it comes to array checking. They allow the programmer the freedom to make mistakes. And, very often, those mistakes come at the boundaries.

In some operating and runtime systems, an attempt to access memory beyond the bounds of the array will immediately result in a runtime failure that is immediately visible. But not all. A common problem occurs when the runtime system allows a write action to memory not owned by the same process or thread that owns the array. The data being held in that location beyond the array are overwritten. When the actual owning process subsequently tries to use the overwritten memory, the failure may then become evident (i.e., total failure or freeze up), or the memory's contents might be used in a calculation (resulting in the wrong answer), or it might be stored in a file or database, corrupting the data store silently. None of those are attractive results; thus worthy of testing.
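The silent-overwrite scenario just described is easy to reproduce in C. In the following sketch (our own illustration), an off-by-one loop writes one element past the end of the array; depending on how the compiler lays out memory, the write may corrupt a neighboring variable without any immediate failure:

    #include <stdio.h>

    int main(void)
    {
        /* Two adjacent locals; many compilers place them next to each
           other on the stack, though the layout is not guaranteed. */
        int sentinel = 42;
        int scores[5] = { 10, 20, 30, 40, 50 };

        /* Off-by-one: valid indices are 0..4, but the loop also writes
           index 5. C performs no bounds checking, so this write lands in
           memory the array does not own and may silently corrupt sentinel. */
        for (int i = 0; i <= 5; i++)
            scores[i] = 0;

        /* The failure only becomes visible later, when the corrupted
           value is finally used. */
        printf("sentinel = %d (expected 42)\n", sentinel);
        return 0;
    }

Run under a dynamic analysis tool such as a memory checker, the illegal write at index 5 is flagged at the moment it happens rather than whenever the symptom finally shows.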
There are also security issues to this array overwrite that we will discuss in a later chapter.

Multidimensional arrays, where an item is accessed through two or more indices, can be arbitrarily complex, making all of the issues mentioned above even worse.

Dynamic analysis tools can be useful when testing arrays; they can catch the failure at the first touch beyond the bounds of the array rather than waiting for the symptom of the failure to show. We will discuss dynamic analysis tools later in this chapter and in chapter 9.

4.2.2.2 Non-functional Boundaries

Boundaries are sometimes simple to find. For example, when dealing with simple data types like integers, where the values are ordered, there is no need to guess. The boundaries are self-evident. Other boundaries may be somewhat murky, as you will see when we discuss integer and floating point data types.

And then there are times when a boundary is only implicitly defined and the tester must settle on ways to test it. Consider the following requirement:

The maximum file size is 16 MB.

Exactly how many bytes is 16 MB? How do we create a 16 MB file to test the maximum? How do we test the invalid high boundary? The answer to the first question as to the number of bytes is...it depends. What is the operating system? What is the file system under use: FAT32, NTFS? What are the sector and cluster sizes on the hard disk? Testing the exact valid high boundary on a 16 MB file is very difficult, as is testing the invalid high boundary. When we are testing such a system—assuming it is not a critical or safety-related system—we will often not spend time trying to ascertain the exact size but will test a file as close to 16 MB as we can get for the valid high boundary and perhaps a 17 MB file for the invalid high boundary.

On the other end of the file size, the invalid low boundary is relatively easy to comprehend: there is none. No file can be minus one (-1) byte. But perhaps the invalid low must be compared to the valid low size. Is it zero bytes? Some file systems and operating systems allow zero-sized files, and therefore there is no invalid low boundary size file possible. If the file is a structured file, with defined header information, the application that created it likely will not allow a zero-byte file, and therefore an invalid low boundary is testable.
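If we do settle on a working definition, say 16 × 1,048,576 bytes, an assumption the real project would need to confirm, generating the boundary files is straightforward. The following C sketch (our own illustration) creates files just inside, at, and just beyond that assumed valid high boundary:

    #include <stdio.h>

    /* Create a file of exactly `size` zero bytes; returns 0 on success. */
    static int make_file(const char *name, long size)
    {
        FILE *f = fopen(name, "wb");
        if (f == NULL)
            return -1;
        for (long i = 0; i < size; i++)
            fputc(0, f);
        return fclose(f);
    }

    int main(void)
    {
        const long MB = 1048576L;  /* assumed binary megabyte; confirm with the team */

        make_file("just_inside.bin", 16 * MB - 1);  /* just inside the limit       */
        make_file("at_limit.bin",    16 * MB);      /* assumed valid high boundary */
        make_file("just_beyond.bin", 16 * MB + 1);  /* invalid high boundary       */
        return 0;
    }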
In our experience, you should take these types of questions to the development team if they were not already defined in the design specification. Be aware, however, that often the developers have not given sufficient thought to some of these questions; their answers might be incorrect. Often, experimenting is required once the system arrives.

As a separate example, assume the following efficiency requirement:

The system shall process at least 150 transactions per second.

An invalid low boundary may not exist. It would, of course, be impossible to trigger a negative number of transactions; however, if the system is designed such that a minimum number of transactions must flow within a given unit of time, there would be both valid and invalid low boundaries. Defining the minimum should be approached as we did the file example.

Is a valid high really a boundary though? Clearly we must test 150 transactions per second, consistent with good performance testing—we will discuss that later in the book. What would be an invalid high? In a case like this, we would argue that there is no specific high invalid boundary to test; however, good performance testing would continue to ramp up the number of transactions until we see an appreciable change in performance.

The point of this discussion is that boundaries are really useful for testing; sometimes you just must search a bit to determine what those boundaries are.

4.2.2.3 A Closer Look at Functional Boundaries

In the Foundation syllabus, we discussed a few boundary examples in a somewhat simplistic way. For an integer where the valid range is 1 to 10 inclusive, the boundaries were discussed as if there were four. These are shown on the number line in figure 4-9 as 0 (invalid low), 1 (valid low), 10 (valid high), and 11 (invalid high).

[Figure 4–9 shows a number line with the values 0, 1, 10, and 11 marked and question marks at both the far left and far right ends.]

Figure 4–9 Boundaries for integers

However, the closer you look, the more complicated it actually gets. Looking at this number line, there appear to be two missing values that we did not discuss: on the far left and far right. In math, we would say that the two missing
boundaries represent negative infinity on the left, positive infinity on the right. Clearly a computer cannot represent infinity using integers, as that would take an infinite number of bytes. That means there are going to be specific values that are both the largest and the smallest values that the computer can represent. Boundaries! In order to understand how these boundaries are actually represented—and where errors can occur—we need to look at some computer architectural details. Exactly how are numbers stored in memory?

Technical test analysts must understand how different values are represented in memory; we will look at this topic for both integer and floating point numbers.

4.2.2.4 Integers

In most computer architectures, integers are stored differently than floating point numbers (those numbers that have fractional values). The maximum size of an integer—and hence the boundaries—is (in most cases) defined by two things: the number of bits that are allocated for the number and whether it is signed or unsigned.

In these architectures, an integer is encoded using the binary number system, a combination of ones and zeroes. The width or precision of the number is the number of bits in its representation. A number with n bits can encode 2^n different numeric values.

If the number is signed, the left-most bit represents the sign; this has the effect of making approximately half of the possible values positive and the other half negative. For example, assume that 8 bits are used to store the number. The range of the unsigned number is from 0 (0000 0000) to +255 (1111 1111). The range of the signed number is from -128 (1000 0000) to +127 (0111 1111). If those actual bit representations look strange, it is because in most computers, negative numbers are represented as two's complements.

A two's complement number is represented as the value obtained by subtracting the number from a large power of two. To get the two's complement of an 8-bit value, the value (in binary) is subtracted from 2^8. The full math and logic behind using two's complement encoding for negative numbers is beyond the scope of this book. The important concept is that the boundaries can be determined for use in testing.
[Figure 4–10 reproduces a table of whole number representations from Wikipedia, listing common integer widths with the signed and unsigned ranges of values each width can represent.]

Figure 4–10 Whole number representations4

4. Wikipedia at http://en.wikipedia.org/wiki/Integer_(computer_science)

Figure 4-10 shows several possible number sizes and the values that can be represented.

Many mainframes (and some databases) use a different encoding scheme called binary-coded decimals (BCDs). Each digit in a represented number takes up 4 bits. The possible maximum length of a number is proportional to the number of bytes allowed by the architecture. So, if the computer allowed 1,000 bytes for a single number, that number could be 2,000 digits long. By writing
• 4.2 Specification-Based 127

your own handler routines, you could conceivably have numbers of any magnitude in any programming language. The issue would then become computational speed—how long does it take to perform mathematical calculations.

Note that the binary representation is slightly more efficient than the BCD representation as far as the relative size of the number that can be stored. A 128-bit, unsigned value (16 bytes) can hold up to 39 digits, while you would need a BCD of 20 bytes to hold the same size number. While this size differential is not huge, the speed of processing may become paramount; BCD calculations tend to be slower than calculations for built-in data types because the calculation must be done digit by digit.

Most programming languages offer whole-number types of several different widths that can be selected by the programmer. The typical default integer in 16-bit computers was 16 bits; in most 32-bit computers the default is 32 bits wide. For example, in the Delphi programming language, a short is 16 bits, an int is 32 bits, and an int64 is 64 bits. A special library can be used in Delphi to support 128-bit numbers. When the programmer declares a variable, they define the length and whether the value is signed or unsigned. While a programmer could conceivably use the largest possible value for each number to minimize the chance of overflow, that has always been considered a bad programming technique that wastes space and often slows down computations.

In C as it was originally defined for DEC minicomputers, the language only guaranteed that an int was at least as long (if not longer) as a short and a long was at least as long (if not longer) as an int. On the PDP-11, a short was 16 bits, an int also 16 bits, and a long 32 bits. Other architectures gave other lengths, and even with modern 64-bit CPUs, you might not always know how many bits you'll get with your int. This ambiguity about precision in C has created—and continues to create—a lot of bugs and thus is fertile ground for testing.

By looking at the definitions of the data variables used, a technical tester can discern the length of the value and thence the boundaries. Since there is always a defined maxint (the maximum size integer) and minint (the maximum magnitude negative number)—no matter which type variable is used—it is relatively straightforward to find and test the boundaries. Assuming an 8-bit, unsigned integer 255 (1111 1111), adding 1 to it would result in an overflow, giving the value 0 (0000 0000). 255 + 1 = 0. Probably not what the programmer intended...
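A short C sketch of this wraparound, using the fixed-width types from <stdint.h> so the boundary is explicit (note that unsigned wraparound is well defined in C, while signed overflow is undefined behavior, which is exactly why these boundaries deserve tests):

    #include <stdio.h>
    #include <stdint.h>
    #include <limits.h>

    int main(void)
    {
        uint8_t u = 255;          /* unsigned maxint for 8 bits        */
        u = (uint8_t)(u + 1);     /* wraps around: 255 + 1 = 0         */
        printf("255 + 1 = %d\n", u);

        /* The width of a plain int is platform-dependent -- the very
           ambiguity the text warns about. */
        printf("int range on this platform: %d .. %d\n", INT_MIN, INT_MAX);
        return 0;
    }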
• 128 4 Test Techniques

Since any value inputted to the system may be used in an expression where a value is added to it, testing with very large and very small numbers based on the representation used by the programmer should be considered a good test technique. Because division and multiplication happen too, be sure to test zero, which is always an interesting boundary value.

4.2.2.5 Floating Point Numbers

The other types of numbers we need to discuss are floating point (floats). These numbers allow a fractional value: some number of digits before and some number of digits after a decimal point. As before, a technical tester needs to investigate the architecture and selections made by programmers when testing these types of numbers. These numbers are represented by a specific number of significant digits and scaled by using an exponent. This is the computer equivalent of scientific notation. The base is usually 2 but could be 10 or 8.

Like scientific notation, floats are used to represent very large or very small numbers. For example, a light year is approximately 9.460536207 × 10^15 meters. This value is a bit more than even a really large integer can represent. Likewise, an angstrom is approximately 1 × 10^-10 meters, a very tiny number.

Just to get the terminology correct, a floating point number consists of a signed digit string of a given length in a given base. The signed string (as shown in the preceding paragraph, 9.460536207) is called the significand (or sometimes the mantissa). The length of the significand determines the precision that the float can represent—that is, the number of significant digits. The decimal point, sometimes called the radix, is usually understood to be in a specific place—usually to the right of the first digit of the significand. The base is usually set by the architecture (usually 2 or 10). The exponent (also called the characteristic or scale) modifies the magnitude of the number. In the preceding paragraph, in the first example, the exponent is 15. That means that the number is 9.460536207 times 10 to the 15th power.

Like integers, floats can also be represented by binary-coded decimals. These are sometimes called fixed-point decimals and are actually indistinguishable from integer BCDs. In other words, the decimal point is not stored in the value; instead, its relative location is understood by the programmer and compiler. For example, assume a 4-byte value: 12 34 56 7C. In fixed-point notation,
• 4.2 Specification-Based 129

this value might represent +1,234.567 if the compiler understands the decimal point to be between the fourth and fifth digits.

In general, non-integer numbers have a far wider range than integers while taking up only a few more bytes. Different architectures have a variety of schemes for storing these values. Like integers, floats come in several different sizes; one common naming terminology is 16 bit (halves), 32 bit (singles), 64 bit (doubles), and 128 bit (quadruples). Over the years, a wide variety of schemes have been used to represent these values; IEEE 754 has standardized most of them. This standard, first adopted in 1985, is followed by almost every computer manufacturer with the exception of IBM mainframes and Cray supercomputers. The larger the representation, the larger (and smaller) the numbers that can be stored.

Figure 4–11 IEEE 754-2008 float representations (a table of the sign, exponent, and significand widths for each of these formats)

As shown in figure 4-11, a float contains a sign (positive or negative), an exponent of a certain size, and a significand of a certain size. The magnitude of the significand determines the number of significant figures the representation can hold (called the precision). Note that the bits of precision are actually one more than the significand can hold. For complex reasons having to do with the way floats are used in calculations, the leading bit of the significand is implicit—that is, not stored. That adds the extra bit of precision. Related to that is the bias of the exponent (bias is used in the engineering sense, as in offset). The value of the exponent is offset by a given value to allow it to be represented by an unsigned number even though it actually represents both large (positive) and small (negative) exponents.

A key issue with these values that must be understood by technical testers is that most floats are not exact; instead, they are approximations of the value we may be interested in. In many cases, this is not an issue. Occasionally, however, interesting bugs can lurk in these approximate values.
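These fields can be examined directly. Here is a small C sketch, assuming the platform stores float as an IEEE 754 32-bit single and that float and uint32_t have the same size (true on nearly all current hardware, per the discussion above):

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        float f = -6.5f;
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);            /* reinterpret the bytes  */

        uint32_t sign     = bits >> 31;            /* 1 bit                  */
        uint32_t exponent = (bits >> 23) & 0xFF;   /* 8 bits, biased by 127  */
        uint32_t fraction = bits & 0x7FFFFF;       /* 23 stored bits; the
                                                      leading 1 is implicit  */

        printf("sign=%u exponent=%u (unbiased %d) fraction=0x%06X\n",
               sign, exponent, (int)exponent - 127, fraction);
        return 0;
    }

For -6.5 this prints sign=1, exponent=129 (unbiased 2), fraction=0x500000, that is, -1.101 binary × 2^2.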
• 130 4 Test Techniques

When values are calculated, the result is often an irrational number. The definition of an irrational number is any real number that cannot be expressed as the fraction a/b where a and b are integers. Since an irrational number, by definition, has an infinite number of decimal digits (i.e., is a non-terminating number), there is no way to represent it exactly by any size floating point number. That means anytime an operation occurs with irrational values, the result will be irrational (and hence approximate). There are many irrational numbers that are very important to mathematical concepts, including pi and the square root of 2.

It is not that a floating point value cannot be exact. Whole numbers may be represented exactly. The representation for 734 would be 7.34 × 10^2. However, when calculations are made using floating point values, the exactness of the result might depend on how the values were originally set. It is probably safe to hold, as a rule of thumb, that any floating number is only close, an approximation of the mathematically correct value.

To the tester, close may or may not be problematic. The longer a chain of calculations using floats, the more likely the values will drift from exactness. Likewise, rounding and truncating can cause errors. Consider the following chain of events. Two values are calculated and placed into floating point variables; one comes close to 6, the other close to 2. Very close; for all intents and purposes the values are almost exactly 6 and 2. The second value is divided into the first, resulting in a value close to 3. Perhaps something like 2.999999999999932. Now, suppose the value is placed into an integer rather than a float—the compiler will allow that. Suppose the developer decides that they do not want the fractional value and, rather than rounding, uses the truncation function. Essentially, following along this logic, we get 6 divided by 2 equals 2. Oops.

If this sounds far-fetched, it is. But this particular error actually happened to software that we were testing, and we did a lot of head scratching digging it out. Each step the programmer made was sensible, even if the answer was nonsensical. Clearly, boundaries of floating point numbers—and calculations made with them—need to be tested.

4.2.2.6 Testing Floating Point Numbers

Testing of financial operations involves floating point numbers, so all of the issues just discussed apply. Notice that the question of precision has serious
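A C sketch of this failure mode; the "close to 6" value is hard-coded here to stand in for the longer calculation chain described above:

    #include <stdio.h>

    int main(void)
    {
        double almost_six = 5.999999999999986; /* "for all intents" 6.0    */
        double two        = 2.0;

        double q = almost_six / two;           /* about 2.999999999999993  */
        int    i = (int)q;                     /* conversion truncates
                                                  toward zero, no rounding */

        printf("%.15f truncates to %d\n", q, i); /* ...truncates to 2      */
        return 0;
    }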
• 4.2 Specification-Based 131

real-world implications. People tend to be very sensitive to rounding errors that involve things like money, investment account balances, and the like. Furthermore, there are regulations and contracts that often apply, along with any specified requirements. So testing financial applications involves a great deal of care when these kinds of issues are involved.

Figure 4–12 A stock-purchase screen from Quicken (annotated with equivalence partitions and boundary values: any non-number input such as letters, punctuation, or null is invalid for all three fields; for each numeric field the partitions are invalid negative, valid zero, valid with rounding, valid without rounding, and invalid too large, with boundary values such as 0.000 000 499 99, 0.000 000 5, 0.000001, and 999,999,999.99 for the number of shares and price per share, and 0.0049999, 0.005, 0.001, and 9,999,999.99 for the commission)

In figure 4-12, you can see the application of equivalence partitioning and boundary value analysis to the input of a stock purchase transaction in Quicken. In this application, when you enter a stock purchase transaction, after entering the stock, you input the number of shares, the price per share, and the commission, if any. Quicken calculates the amount of the transaction.

First, for each of the three fields—number of shares, price per share, and commission—we can say that any non-numeric input is invalid. So that's common across all three.

For the numeric inputs, the price per share and the number of shares are handled the same. The number must be zero or greater. Look closely at all of the boundaries that are shown in figure 4-12. Some of these boundaries are not necessarily intuitive. When boundary testing with real numbers, there are two complexities we must deal with: precision and round-off.
    • 132 4 Test Techniques By precision, in this case we are talking about the number of decimal digits we are willing to test. Many people refer to this as the epsilon (or smallest recog- nizable difference) to which we will test. Suppose we have a change in system behavior exactly at the value 10.00. Values below are valid, above are invalid. What is the closest possible value that you can reasonably say is the invalid boundary? 10.01? Clearly we can get many values between 10.00 and 10.01, including 10.001, 10.0001, 10.00001... Somewhere in testing we have to decide the boundary beyond which it makes no sense to test. One such line is the epsilon, and it generally represents the number of decimal digits to which we will test. In the preceding example, we might decide that we want to test to an epsilon of 2. That would make our invalid boundary 10.01, even if, in reality, it is not a physical boundary. Testers must understand the epsilon to which they will be testing. This epsilon may be decided by the programmer who wrote the code, the designer who designed it, or even the computer representation used. Likewise, rounding off is an important facet of real-world boundaries. Again, look at figure 4-12. It is possible to buy a fractional portion of a share of stock. Clearly, if we put 0 (zero) in the number of shares purchased, we will buy no stock. You might think that if we put a positive non-zero value in, we would automatically be purchasing some stock. However, some values will round down to zero, even though the absolute value is greater than zero. And therein lies our boundary. We should find out exactly where the boundary is, that is, where the behavior changes (with due respect to the epsilon). In this version of Quicken, Rex discovered through trial and error that the smallest amount of a share that can be purchased is 0.000 001 shares. This is a value where behavior changes. However, the testable boundary must allow for rounding. The value 0.000 000 5 will round up to 0.000 001. That makes it the low valid boundary. But we need to go to more decimal places to allow the round-off to occur, down to the epsilon. Hence the low invalid boundary is some value less than that which will round down to zero; in this case we select 0.000 000 499 99. 4.2.2.7 How Many Boundaries? Let’s wind down our discussion of boundary value analysis by mentioning a somewhat obscure point that can nevertheless arise if you do any reading
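Here is a sketch of that round-off boundary in C. The half-up rounding rule and the six-digit epsilon are our assumptions based on the Quicken behavior just described, not a documented algorithm:

    #include <stdio.h>
    #include <math.h>

    /* Round half-up to six decimal places (the assumed epsilon). */
    static double round6(double x)
    {
        return floor(x * 1e6 + 0.5) / 1e6;
    }

    int main(void)
    {
        printf("%.11f -> %.6f\n", 0.0000009,     round6(0.0000009));     /* 0.000001 */
        printf("%.11f -> %.6f\n", 0.00000049999, round6(0.00000049999)); /* 0.000000 */
        /* The tie case, 0.000 000 5, is itself stored only approximately
           in binary, which makes it exactly the kind of boundary worth
           probing on the real system. */
        printf("%.11f -> %.6f\n", 0.0000005,     round6(0.0000005));
        return 0;
    }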
• 4.2 Specification-Based 133

about testing. This is the question of how many boundary values exist at a boundary.

In the material so far, we've shown just two boundary values per boundary. The boundary lies between the largest member of one equivalence class and the smallest member of the equivalence class above it. In other words, the boundary itself doesn't correspond to any member of any class.

However, some authors, of whom Boris Beizer is probably the most notable, say there are three boundary values. Why would there be three?

The user must order a quantity greater than 0 and less than 100...

Mathematical inequality model (v1): 0 < x < 100, with boundary values -1, 0, 1 and 99, 100, 101

Graphical ("number line") model: invalid (too low) | 0 1 ... 99 100 | invalid (too high), with boundary values 0, 1 and 99, 100

Mathematical inequality model (v2): 1 <= x <= 99, with boundary values 0, 1, 2 and 98, 99, 100

Figure 4–13 Two ways to look at boundaries

The difference arises from the use of a mathematical analysis rather than the graphical one that we've used, as shown in figure 4-13. You can see that the two mathematical inequalities shown in figure 4-13 describe the same situation as the graphical model. Applying Beizer's methodology, you select the value itself as the middle boundary value, then you add the smallest possible amount to that value and subtract the smallest possible amount from that value to generate the two other boundary values. Remember, with integers, as in this example, the smallest possible amount is one, while for floating points we have to consider those pesky issues of precision and round-off.
• 134 4 Test Techniques

As you can see from this example, the mathematical boundary values will always include the graphical boundary values. To us, the three-value approach is wasted effort. We have enough to do already without creating more test values to cover, especially when they don't address any risks that haven't already been covered by other test values.

4.2.2.8 Boundary Value Exercise

Loan amount? {Enter a loan amount} Whole dollar amounts (no cents). Will be rounded to nearest $100.

Property value? {Enter the property's value} Whole dollar amounts (no cents). Will be rounded to nearest $100.

Figure 4–14 HELLOCARMS system amount screen prototype

A screen prototype for one screen of the HELLOCARMS system is shown in figure 4-14. This screen asks for two pieces of information:

■ Loan amount
■ Property value

For both fields, the system allows entry of whole dollar amounts only (no cents), and it rounds to the nearest $100. Assume the following rules apply to loans:

■ The minimum loan amount is $5,000.
■ The maximum loan amount is $1,000,000.
■ The minimum property value is $25,000.
■ The maximum property value is $5,000,000.

Refer also to requirements specification elements 010-010-130 and 010-010-140 in the Functional System Requirements section of the HELLOCARMS system requirements document.
• 4.2 Specification-Based 135

If the fields are valid, then the screen will be accepted. Either the Telephone Banker will continue with the application or the application will be transferred to a Senior Telephone Banker for further handling. If one or both fields are invalid, an error message is displayed.

The exercise consists of two parts:

1. Show the equivalence partitions and boundary values for each of the two fields, indicating valid and invalid members and the boundaries for those partitions.
2. Create test cases to cover these partitions and boundary values, keeping in mind the rules about combinations of valid and invalid members.

The answers to the two parts are shown on the following pages. You should review the answer to the first part (and, if necessary, revise your answer to the second part) before reviewing the answer to the second part.

4.2.2.9 Boundary Value Exercise Debrief

First, let's take a look at the equivalence partitions and boundary values, which are shown in figure 4-15.

Figure 4–15 Equivalence partitions (EP) and boundary values (BVA) for exercise (the partitions and boundary values pictured here are itemized in tables 4-8 and 4-9, along with the transfer-decision partitions: no transfer, transfer for the loan, transfer for the value, and transfer for both)
    • 136 4 Test Techniques So, for the loan amount we can show the boundary values and equivalence par- titions as shown in table 4-8. Table 4–8 # Partition Boundary Value 1 Letter: ABC - 2 Decimal: 50,000.01 - 3 Null - 4 Invalid (negative) -max 5 Invalid (negative) -1 6 Invalid (zero) 0 7 Invalid (zero) 49 8 Invalid (too low) 50 9 Invalid (too low) 4,949 10 Valid (no transfer) 4,950 11 Valid (no transfer) 500,049 12 Valid (transfer) 500,050 13 Valid (transfer) 1,000,049 14 Invalid (too high) 1,000,050 15 Invalid (too high) max For the property value, we can show the boundary values and equivalence parti- tions as shown in table 4-9. Table 4–9 # Partition Boundary Value 1 Letter: abc - 2 Decimal: 999,777.01 - 3 Null - 4 Invalid (negative) -max 5 Invalid (negative) -1 6 Invalid (zero) 0 7 Invalid (zero) 49 8 Invalid (too low) 50 9 Invalid (too low) 24,949 10 Valid (no transfer) 24,950 11 Valid (no transfer) 1,000,049 12 Valid (transfer) 1,000,050 13 Valid (transfer) 5,000,049 14 Invalid (too high) 5,000,050 15 Invalid (too high) max
    • 4.2 Specification-Based 137Make sure you understand why these values are boundaries based on round-offrules given in the requirements. For the transfer decision, we can show theequivalence partitions for the loan amount as shown in table 4-10.Table 4–10 # Partition 1 No 2 Yes, for the loan amount 3 Yes, for the property value 4 Yes, for bothNow, let’s create tests from these equivalence partitions and boundary values.We’ll capture traceability information from the test case number back to thepartitions or boundary values, and as before, once we have a trace from eachpartition to a test case, we’re done—as long as we didn’t combine invalid values!Table 4–11 Inputs 1 2 3 4 5 6 Loan amount 4,950 500,050 500,049 1,000,049 ABC 50,000.01 Property value 24,950 1,000,049 1,000,050 5,000,049 100,000 200,000 Outputs Accept? Y Y Y Y N N Transfer? N Y (loan) Y (prop) Y (both) - - Inputs 7 8 9 10 11 12 Loan amount null 100,000 200,000 300,000 -max -1 Property value 300,000 abc 999,777.01 null 400,000 500,000 Outputs Accept? N N N N N N Transfer? - - - - - - Inputs 13 14 15 16 17 18 Loan amount 0 49 50 4,949 1,000,050 max Property value 600,000 700,000 800,000 900,000 1,000,000 1,100,000 Outputs Accept? N N N N N N Transfer? - - - - - -
    • 138 4 Test Techniques Inputs 19 20 21 22 23 24 Loan amount 400,000 500,000 600,000 700,000 800,000 900,000 Property value -max -1 0 49 50 24,949 Outputs Accept? N N N N N N Transfer? - - - - - - Inputs 25 26 27 28 29 30 Loan amount 1,000,000 555,555 Property value 5,000,050 max Outputs Accept? N N Transfer? - - Table 4–12 # Partition Boundary Value Test Case 1 Letter: ABC - 5 2 Decimal: 50,000.01 - 6 3 Null - 7 4 Invalid (negative) -max 11 5 Invalid (negative) -1 12 6 Invalid (zero) 0 13 7 Invalid (zero) 49 14 8 Invalid (too low) 50 15 9 Invalid (too low) 4,949 16 10 Valid (no transfer) 4,950 1 11 Valid (no transfer) 500,049 3 12 Valid (transfer) 500,050 2 13 Valid (transfer) 1,000,049 4 14 Invalid (too high) 1,000,050 17 15 Invalid (too high) max 18
• 4.2 Specification-Based 139

Table 4–13

#    Partition             Boundary Value   Test Case
1    Letter: abc           -                8
2    Decimal: 999,777.01   -                9
3    Null                  -                10
4    Invalid (negative)    -max             19
5    Invalid (negative)    -1               20
6    Invalid (zero)        0                21
7    Invalid (zero)        49               22
8    Invalid (too low)     50               23
9    Invalid (too low)     24,949           24
10   Valid (no transfer)   24,950           1
11   Valid (no transfer)   1,000,049        2
12   Valid (transfer)      1,000,050        3
13   Valid (transfer)      5,000,049        4
14   Invalid (too high)    5,000,050        25
15   Invalid (too high)    max              26

Table 4–14

#    Partition                     Test Case
1    No                            1
2    Yes, for the loan amount      2
3    Yes, for the property value   3
4    Yes, for both                 4

Notice that there's another interesting combination related to the transfer decision that we covered in our tests. This was when the values were rejected as inputs, in which case we should not even be able to leave the screen, not to mention transfer the application. We did test with both loan amounts and property values that would have triggered a transfer had the other value been valid. We could have shown that as a third set of equivalence classes for the transfer decision.
    • 140 4 Test Techniques ISTQB Glossary decision table: A table showing combinations of inputs and/or stimuli (causes) with their associated outputs and/or actions (effects), which can be used to design test cases. decision table testing: A black-box test design technique in which test cases are designed to execute the combinations of inputs and/or stimuli (causes) shown in a decision table. 4.2.3 Decision Tables Equivalence partitioning and boundary value analysis are very useful tech- niques. They are especially useful, as you saw in the earlier parts of this section, when testing input field validation at the user interface. However, lots of testing that we do as technical test analysts involves testing the business logic that sits underneath the user interface. We can use boundary values and equivalence partitioning on business logic too, but two additional techniques, decision tables and state-based testing, will often prove handier and more powerful. Let’s start with decision tables. Conceptually, decision tables express the rules that govern handling of transac- tional situations. By their simple, concise structure, they make it easy for us to design tests for those rules, usually at least one test per rule. When we said “transactional situations,” what we meant was those situa- tions where the conditions—inputs, preconditions, etc.—that exist at a given moment in time for a single transaction are sufficient by themselves to deter- mine the actions the system should take. If the conditions on their own are not sufficient, but we must also refer to what conditions have existed in the past, then we’ll want to use state-based testing, which we’ll cover later in this chapter. The underlying model is a table. The model connects combinations of con- ditions with the action or actions that should occur when each particular com- bination of conditions arises. To create test cases from a decision table, we are going to design test inputs that fulfill the conditions given. The test outputs will correspond to the action or actions given for that combination of conditions. During test execution, we check that the actual actions taken correspond to the expected actions.
    • 4.2 Specification-Based 141 We create enough test cases that every combination of conditions is coveredby at least one test case. Oftentimes, that coverage criterion is relaxed to ensurethat we cover those combinations of conditions that can determine the action oractions. If that’s a little confusing—which it might be, depending on how youprepared for the Foundation exam, because this isn’t always explained properlyin books and classes—the distinction we’re drawing will become clear to youwhen we talk about collapsed decision tables. With a decision table, the coverage criterion boils down to an easy-to-remember rule of at least one test per column in the table. So, what is our bug hypothesis with decision tables? What kind of bugs arewe looking for? There are several. First, under some combination of conditions,the wrong action might occur. In other words, there may be some action thatthe system is not to take under this combination of conditions, yet it does. Sec-ond, under some combination of conditions, the system might not take theright action. In other words, there is some action that the system is to takeunder this combination of conditions, yet it does not. Beyond those two possibilities, there is also a very real positive value ofusing decision tables during the analysis portion of our testing. A decision tableforces testers to consider many possibilities that we may not have consideredotherwise. By considering all of the permutations of conditions possible, even ifwe don’t test them all, we may discover some unknown unknowns. That is, weare likely to find some scenarios that the analysts and developers had not con-sidered. By considering these scenarios, we can not only capture incipientdefects but avoid later disagreements about correct behavior.Table 4–15 Conditions 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Real account? Y Y Y Y Y Y Y Y N N N N N N N N Active account? Y Y Y Y N N N N Y Y Y Y N N N N Within limit? Y Y N N Y Y N N Y Y N N Y Y N N Location okay? Y N Y N Y N Y N Y N Y N Y N Y N Actions Approve Y N N N N N N N N N N N N N N N Call N Y Y Y N Y Y Y N N N N N N N N cardholder? Call vendor? N N N N Y Y Y Y Y Y Y Y Y Y Y Y
    • 142 4 Test Techniques Consider the decision table shown in table 4-15. This table shows the decisions being made at an e-commerce website about whether to accept a credit card purchase of merchandise. Once the information goes to the credit card process- ing company for validation, how can we test their decisions? We could handle that with equivalence partitioning, but there is actually a whole set of conditions that determine this processing: ■ Does the named person hold the credit card entered, and is the other information correct? ■ Is it still active or has it been cancelled? ■ Is the person within or over their limit? ■ Is the transaction coming from a normal or a suspicious location? The decision table in table 4-15 shows how these four conditions interact to determine which of the following three actions will occur: ■ Should we approve the transaction? ■ Should we call the cardholder (e.g., to warn them about a purchase from a strange place)? ■ Should we call the vendor (e.g., to ask them to seize the cancelled card)? Take a minute to study the table to see how this works. The conditions are listed at the top left of the table, and the actions at the bottom left. Each column to the right of this leftmost column contains a business rule. Each rule says, in essence, “Under this particular combination of conditions (shown at the top of the rule), carry out this particular combination of actions (shown at the bottom of the rule).” Notice that the number of columns—i.e., the number of business rules—is equal to 2 (two) raised to the power of the number of conditions. In other words, 2 raised to the 4th power (based on four conditions), which results in 16 columns. When the conditions are strictly Boolean—true or false—and we’re dealing with a full decision table (not a collapsed one), that will always be the case. If one or more of the conditions are not rendered as Boolean, then the number of columns will not follow this rule. Did you notice how we populated the conditions? The topmost condition changes most slowly. Half of the columns are Yes, then half No. The condition under the topmost changes more quickly, but it changes more slowly than all the
    • 4.2 Specification-Based 143other conditions below it. The pattern is a quarter Yes, then a quarter No, then aquarter Yes, then a quarter No. Finally, for the bottommost condition, the alter-nation is Yes, No, Yes, No, Yes, etc. This pattern makes it easy to ensure that youdon’t miss anything. If you start with the topmost condition, set the left half ofthe rule columns to Yes and the right half of the rule columns to No, then fol-lowing the pattern we showed, if you get to the bottom and the Yes, No, Yes, No,Yes, etc. pattern doesn’t hold, you did something wrong. The other algorithmthat you can use, again assuming that all conditions are Boolean, is to count thecolumns in binary. The first column is YYYY, second YYYN, third YYNY,fourth YYNN, etc. If you are not familiar with binary arithmetic, forget wementioned it. Deriving test cases from this example is relatively easy: each column (busi-ness rule) of the table generates a test case. When time comes to run the tests,we’ll create the conditions that are each test’s inputs. We’ll replace the yes/noconditions with actual input values for credit card number, security code, expi-ration date, and cardholder name, either during test design or perhaps even attest execution time. We’ll verify the actions that are the test’s expected results. In some cases, we might generate more than one test case per column. We’llcover this possibility later, as we enlist our previous test techniques, equivalencepartitioning and boundary value analysis, to extend decision table testing. Notice that, in this case, some of the test cases don’t make much sense. Forexample, how can the account not be real but yet active? How can the accountnot be real but within the given limit? This kind of situation is a hint that maybe we don’t need all the columns inour decision table.4.2.3.1 Collapsing Columns in the TableWe can sometimes collapse the decision table, combining columns, to achieve amore concise—and in some cases more sensible—decision table. In any situa-tion where the value of one or more particular conditions can’t affect the actionsfor two or more combinations of conditions, we can collapse the decision table. This involves combining two or more columns where, as we said, one ormore of the conditions don’t affect the actions. As a hint, combinable columnsare often but not always next to each other. You can at least start by looking atcolumns next to each other.
    • 144 4 Test Techniques To combine two or more columns, look for two or more columns that result in the same combination of actions. Note that the actions in the two columns must be the same for all of the actions in the table, not just some of them. In these columns, some of the conditions will be the same, and some will be differ- ent. The ones that are different obviously don’t affect the outcome. We can replace the conditions that are different in those columns with a dash. The dash usually means we don’t care, it doesn’t matter, or it can’t happen, given the other conditions. Now, repeat this process until the only columns that share the same combi- nation of actions for all the actions in the table are ones where you’d be combin- ing a dash with a Yes or No value and thus wiping out an important distinction for cause of action. What we mean by this will be clear in the example on the next page, if it’s not clear already. Another word of caution at this point: Be careful when dealing with a table where more than one rule can apply at one single point in time. These tables have nonexclusive rules. We’ll discuss that further later in this section. Table 4–16 Conditions 1 2 3 5 6 7 9 Real account? Y Y Y Y Y Y N Active account? Y Y Y N N N - Within limit? Y Y N Y Y N - Location okay? Y N - Y N - - Actions Approve Y N N N N N N Call card holder? N Y Y N Y Y N Call vendor? N N N Y Y Y Y Table 4-16 shows the same decision table as before, but collapsed to eliminate extraneous columns. Most notably, you can see that what were columns 9 through 16 in the original decision table collapsed into a single column. Notice that all of the columns 9 through 16 are essentially the same test: Is this for a real account? If it is not real (note that for all columns, 9 through 16, the answer is no), then it certainly cannot be active, within limit, or have an okay location. Since those conditions are impossible, we would essentially be running the same test eight times.
    • 4.2 Specification-Based 145 We’ve kept the original column numbers for ease of comparison. Again,take a minute to study the table to see how we did this. Look carefully at col-umns 1, 2, and 3. Notice that we can’t collapse 2 and 3 because that would resultin “dash” for both “within limit” and “location okay.” If you study this table orthe full one, you can see that one of these conditions must not be true for thecardholder to receive a call. The collapse of rule 4 into rule 3 says that, if thecard is over limit, the cardholder will be called, regardless of location. The same logic applies to the collapse of rule 8 into rule 7. Notice that the format is unchanged. The conditions are listed at the top leftof the table, and the actions at the bottom left. Each column to the right of thisleftmost column contains a business rule. Each rule says, “Under this particularcombination of conditions (shown at the top of the rule, some of which mightnot be applicable), carry out this particular combination of actions (shown atthe bottom of the rule, all of which are fully specified).” Notice that the number of columns is no longer equal to 2 raised to thepower of the number of conditions. This makes sense, since otherwise no col-lapsing would have occurred. If you are concerned that you might miss some-thing important, you can always start with the full decision table. In a full table,because of the way you generate it, it is guaranteed to have all the combinationsof conditions. You can mathematically check if it does. Then, carefully collapsethe table to reduce the number of test cases you create. Also, notice that, when you collapse the table, that pleasant pattern of Yesand No columns present in the full table goes away. This is yet another reason tobe very careful when collapsing the columns, because you can’t count on thepattern or the mathematical formula to check your work.4.2.3.2 Combining Decision Table Testing with Other TechniquesOkay, let’s address an issue we brought up earlier, the possibility of multiple testcases per column in the decision table via the combination of equivalence parti-tioning and the decision table technique. In figure 4-16, we refer to our exampledecision table of table 4-16, specifically column 9.
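Because the full table is purely mechanical, it can also be generated (and therefore checked) programmatically before you collapse it. Here is a C sketch that counts the columns in binary for the credit card example; the Boolean encoding of the three actions is our reading of table 4-15, not code from the application:

    #include <stdio.h>
    #include <stdbool.h>

    int main(void)
    {
        printf("col real active within loc | approve cardholder vendor\n");
        for (int col = 0; col < 16; col++) {
            /* Taking conditions from the top bit down reproduces the
               half-Y/half-N pattern described in the text. */
            bool real   = !(col & 8);
            bool active = !(col & 4);
            bool within = !(col & 2);
            bool loc    = !(col & 1);

            /* One plausible encoding of the actions in table 4-15: */
            bool approve    = real && active && within && loc;
            bool cardholder = real && !(within && loc);
            bool vendor     = !(real && active);

            printf("%3d %4c %6c %6c %3c | %7c %10c %6c\n", col + 1,
                   real ? 'Y' : 'N', active ? 'Y' : 'N',
                   within ? 'Y' : 'N', loc ? 'Y' : 'N',
                   approve ? 'Y' : 'N', cardholder ? 'Y' : 'N',
                   vendor ? 'Y' : 'N');
        }
        return 0;
    }

Comparing such generated output against the collapsed table is one way to confirm that no combination of conditions was lost in the collapse.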
• 146 4 Test Techniques

Figure 4–16 Equivalence partitions and decision tables (column 9 of the collapsed table, with its "real account? = N" condition partitioned into the mismatch combinations listed below)

We can apply equivalence partitioning to the question, How many interesting—from a test point of view—ways are there to have an account not be real? As you can see from figure 4-16, this could happen seven potentially interesting ways:

■ Card number and cardholder mismatch
■ Card number and expiry mismatch
■ Card number and CSC (card security code) mismatch
■ Two of the above mismatches (three possibilities)
■ All three mismatches

So, there could be seven tests for that column.

How about boundary value analysis? Yes, that too can be applied to decision tables to find new and interesting tests. For example, How many interesting test values relate to the credit limit?

Figure 4–17 Boundary values and decision tables (the collapsed table's "within limit?" condition partitioned into within limit and over limit, with boundary values 0, limit, limit+0.01, and max; its columns are labeled: zero before transaction, normal after transaction, at limit after transaction, just over limit after transaction, at limit before transaction, and max after transaction)

As you can see from figure 4-17, equivalence partitioning and boundary value analysis show us six interesting possibilities:

1. The account starts at zero balance.
2. The account would be at a normal balance after the transaction.
• 4.2 Specification-Based 147

3. The account would be exactly at the limit after the transaction.
4. The account would be over the limit after the transaction.
5. The account was at exactly the limit before the transaction (which would ensure going over if the transaction concluded).
6. The account would be at the maximum overdraft value after the transaction (which might not be possible).

Combining this with the decision table, we can see that we would again end up with more "over limit" tests than we have columns—one more, to be exact—so we'd increase the number of tests just slightly. In other words, there would be four within-limit tests and three over-limit tests. That's true unless you wanted to make sure that each within-limit equivalence class was represented in an approved transaction, in which case column 1 would go from one test to three.

4.2.3.3 Nonexclusive Rules in Decision Tables

Let's finish our discussion about decision tables by looking at the issue of nonexclusive rules we mentioned earlier.

Table 4–17

Conditions          1   2   3
Foreign exchange?   Y   -   -
Balance forward?    -   Y   -
Late payment?       -   -   Y
Actions
Exchange fee?       Y   -   -
Charge interest?    -   Y   -
Charge late fee?    -   -   Y

Sometimes more than one rule can apply to a transaction. In table 4-17, you see a table that shows the calculation of credit card fees. There are three conditions, and notice that zero, one, two, or all three of those conditions could be met in a given month. How does this situation affect testing?

It complicates the testing a bit, but we can use a methodical approach and risk-based testing to avoid the major pitfalls. To start with, test the decision table like a normal one, one rule at a time, making sure that no conditions not related to the rule you are testing are met. This allows you to test rules in isolation—just as you are forced to do in situations where the rules are exclusive.
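Since the three fee rules act independently, a test driver can set up each rule in isolation and, because there are only 2 × 2 × 2 = 8 combinations here, sweep all of them as well. A C sketch of that driver (the function and field names are ours):

    #include <stdio.h>
    #include <stdbool.h>

    struct charges { bool exchange_fee, interest, late_fee; };

    /* Each nonexclusive rule of table 4-17 is checked on its own. */
    static struct charges apply_rules(bool foreign, bool fwd, bool late)
    {
        struct charges c;
        c.exchange_fee = foreign;  /* rule 1: foreign exchange  */
        c.interest     = fwd;      /* rule 2: balance forward   */
        c.late_fee     = late;     /* rule 3: late payment      */
        return c;
    }

    int main(void)
    {
        for (int i = 0; i < 8; i++) {   /* all combinations of the rules */
            bool foreign = i & 4, fwd = i & 2, late = i & 1;
            struct charges c = apply_rules(foreign, fwd, late);
            printf("foreign=%d fwd=%d late=%d -> fee=%d interest=%d late=%d\n",
                   foreign, fwd, late, c.exchange_fee, c.interest, c.late_fee);
        }
        return 0;
    }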
    • 148 4 Test Techniques Next, consider testing combinations of rules. Notice we said, “Consider,” not “test all possible combinations of rules.” You’ll want to avoid combinatorial explosions, which is what happens when testers start to test combinations of factors without consideration of the value of those tests. Now, in this case, there are only 8 possible combinations—three factors, two options for each factor, 2 times 2 times 2 is 8. However, if you have six factors with five options each, you would now have 15,625 combinations. One way to avoid combinatorial explosions is to identify the possible com- binations and then use risk to weight those combinations. Try to get to the important combinations and don’t worry about the rest. Another way to avoid combinatorial explosions is to use techniques like classification trees and pairwise testing, which are covered in Rex’s book Advanced Software Testing Vol. 1. 4.2.3.4 Decision Table Exercise During development, the HELLOCARMS project team added a feature to HEL- LOCARMS. This feature allows the system to sell a life insurance policy to cover the amount of a home equity loan so that, should the borrower die, the policy will pay off the loan. The premium is calculated annually, at the beginning of each annual policy period and based on the loan balance at that time. The base annual premium will be $1 for $10,000 in loan balance. The insurance policy is not available for lines of credit or for reverse mortgages. The system will increase the base premium by a certain percentage based on some basic physical and health questions that the Telephone Banker will ask during the interview. A Yes answer to any of the following questions will trigger a 50 percent increase to the base premium: 1. Have you smoked cigarettes in the past 12 months? 2. Have you ever been diagnosed with cancer, diabetes, high cholesterol, high blood pressure, a heart disorder, or stroke? 3. Within the last 5 years, have you been hospitalized for more than 72 hours except for childbirth or broken bones? 4. Within the last 5 years, have you been completely disabled from work for a week or longer due to a single illness or injury?
• 4.2 Specification-Based 149

The Telephone Banker will also ask about age, weight, and height. (Applicants cannot be under 18.) The weight and height are combined to calculate the body mass index (BMI). Based on that information, the Telephone Banker will apply the rules in table 4-18 to decide whether to increase the rate or even decline to issue the policy based on possible weight-related illnesses in the person's future.

Table 4–18

            BMI <17   BMI 34-36   BMI 37-39   BMI >39
Age 18-39   Decline   75%         100%        Decline
Age 40-59   Decline   50%         75%         Decline
Age >59     Decline   25%         50%         Decline

The increases are cumulative. For example, if the person has normal weight, smokes cigarettes, and has high blood pressure, the annual rate is increased from $1 per $10,000 to $2.25 per $10,000 (see note 5). If the person is a 45-year-old male diabetic with a body mass index of 39, the annual rate is increased from $1 per $10,000 to $2.625 per $10,000 (see note 6).

The exercise consists of three steps:

1. Create a decision table that shows the effect of the four health questions and the body mass index.
2. Show the boundary values for body mass index and age.
3. Create test cases to cover the decision table and the boundary values, keeping in mind the rules about testing nonexclusive rules.

The answers to the three parts are shown on the next pages. You should review the answer to each part (and, if necessary, revise your answers to the later parts) before reviewing the answer to the next part.

4.2.3.5 Decision Table Exercise Debrief

First, we created the decision table from the four health questions and the BMI/age table. The answer is shown in table 4-19. Note that the increases are shown in percentages.

5. Two risk factors; the calculation is: $1 × 1.5 = $1.50; $1.50 × 1.5 = $2.25.
6. One risk factor plus a BMI increase of 75%: $1 × 1.5 = $1.50; $1.50 × 1.75 = $2.625.
• 150 4 Test Techniques

Table 4–19

Conditions      1   2   3   4   5       6       7       8       9       10      11    12
Smoked?         Y   –   –   –   –       –       –       –       –       –       –     –
Diagnosed?      –   Y   –   –   –       –       –       –       –       –       –     –
Hospitalized?   –   –   Y   –   –       –       –       –       –       –       –     –
Disabled?       –   –   –   Y   –       –       –       –       –       –       –     –
BMI             –   –   –   –   34–36   34–36   34–36   37–39   37–39   37–39   <17   >39
Age             –   –   –   –   18–39   40–59   >59     18–39   40–59   >59     –     –
Actions
Increase        50  50  50  50  75      50      25      100     75      50      –     –
Decline         –   –   –   –   –       –       –       –       –       –       Y     Y

It's important to notice that rules 1 through 4 are nonexclusive, though rules 5 through 12 are exclusive.

In addition, there is an implicit rule that the age must be greater than 17 or the applicant will be denied not only insurance but the loan itself. We could have put that here in the decision table, but our focus is primarily on testing business functionality, not input validation. We'll cover those tests with boundary values.

Now, let's look at the boundary values for body mass index and age, shown in figure 4-18.

Figure 4–18 BMI and age boundary values (BMI partitions: too thin 0–16, no increase 17–33, smaller increase 34–36, bigger increase 37–39, too heavy 40–max; age partitions: too young 0–17, young 18–39, mid-aged 40–59, senior 60–max)

Three important testing notes relate to the body mass index. First, the body mass index is not entered directly but rather by entering height and weight. Depending on the range and precision of these two fields, there could be dozens of ways to enter a given body mass index. Second, the maximum body mass index is achieved by entering the smallest possible height and the largest possible weight. Third, we'd need to separately understand the boundary values for these two fields and make sure those were tested properly.
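For reference, a small C sketch of the BMI derivation, assuming the standard metric formula (weight in kilograms divided by the square of height in meters; the requirements do not state the units, so this is an assumption, and the extreme values below are placeholders):

    #include <stdio.h>

    static double bmi(double weight_kg, double height_m)
    {
        return weight_kg / (height_m * height_m);
    }

    int main(void)
    {
        printf("%.1f\n", bmi(70.0, 1.75));   /* typical: about 22.9 */
        /* The maximum BMI comes from the smallest height combined
           with the largest weight the fields accept. */
        printf("%.1f\n", bmi(635.0, 0.60));  /* extreme: about 1763.9 */
        return 0;
    }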
    • 4.2 Specification-Based 151 An important testing note relates to the age. You can see that we omittedequivalence classes related to invalid ages, such as negative ages and non-integerinput for ages. Again, our idea is that we’d need to separately test the input fieldvalidation. Here, our focus is on testing business logic. Finally, for both fields, we omit any attempt to figure out the maxima.Either someone will give us a requirements specification that tells us that duringtest design or we’ll try to ascertain it empirically during test execution. So, for the BMI, we can show the boundary values and equivalence parti-tions as shown in table 4-20.Table 4–20 # Partition Boundary Value 1 Too thin 0 2 Too thin 16 3 No increase 17 4 No increase 33 5 Smaller increase 34 6 Smaller increase 36 7 Bigger increase 37 8 Bigger increase 39 9 Too heavy 40 10 Too heavy maxFor the age, we can show the boundary values and equivalence partitions asshown in table 4-21.Table 4–21 # Partition Boundary Value 1 Too young 0 2 Too young 17 3 Young 18 4 Young 39 5 Mid-aged 40 6 Mid-aged 59 7 Senior 60 8 Senior max
    • 152 4 Test Techniques Finally, table 4-22 shows the test cases. They are much like the decision table, but note that we have shown the rate (in dollars per $10,000 of loan balance) rather than the percentage increase. Table 4–22 Test case Conditions 1 2 3 4 5 6 7 8 9 10 11 12 Smoked? Y N N N N N N N N N N N Diagnosed? N Y N N N N N N N N N N Hospitalized? N N Y N N N N N N N N N Disabled? N N N Y N N N N N N N N BMI N N N N 34 36 35 34 36 35 37 39 Age N N N N 18 39 40 59 60 max 20 30 Actions Rate 1.5 1.5 1.5 1.5 1.75 1.75 1.5 1.5 1.25 1.25 2 2 Decline N N N N N N N N N N N N Test case Conditions 13 14 15 16 17 18 19 20 21 22 23 24 Smoked? N N N N N N N N N N N N Diagnosed? N N N N N N N N N N N N Hospitalized? N N N N N N N N N N N N Disabled? N N N N N N N N N N N N BMI 38 37 39 38 16 40 0 max 17 33 20 30 Age 45 55 65 75 35 50 25 70 37 47 0 17 Actions Rate 1.75 1.75 1.50 1.50 N/A N/A N/A N/A 1 1 N/A N/A Decline N N N N Y Y Y Y N N Y Y Test case Conditions 25 26 27 28 29 Smoked? Y N N N Y Diagnosed? N Y N N Y Hospitalized? N N Y N Y Disabled? N N N Y Y BMI 35 36 34 38 37 Age 20 50 70 30 35 Actions Rate 2.625 2.25 1.875 3 10.125 Decline N N N N N
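The rates in table 4-22 can be cross-checked mechanically. Here is a C sketch of the cumulative calculation; the function name and table encoding are ours, and input validation (such as the under-18 rule) is assumed to happen elsewhere:

    #include <stdio.h>

    /* Returns the annual rate per $10,000 of balance, or -1.0 to decline. */
    static double annual_rate(int yes_answers, int bmi, int age)
    {
        double rate = 1.00;                    /* base: $1 per $10,000    */
        for (int i = 0; i < yes_answers; i++)
            rate *= 1.5;                       /* 50% per Yes, cumulative */

        if (bmi < 17 || bmi > 39) return -1.0; /* decline                 */
        if (bmi >= 34) {                       /* table 4-18              */
            static const double pct[3][2] = {
                { 0.75, 1.00 },                /* age 18-39: BMI 34-36, 37-39 */
                { 0.50, 0.75 },                /* age 40-59               */
                { 0.25, 0.50 },                /* age >59                 */
            };
            int row = (age <= 39) ? 0 : (age <= 59) ? 1 : 2;
            int col = (bmi <= 36) ? 0 : 1;
            rate *= 1.0 + pct[row][col];
        }
        return rate;
    }

    int main(void)
    {
        printf("%.3f\n", annual_rate(2, 25, 30)); /* smoker + diagnosis:  2.250 */
        printf("%.3f\n", annual_rate(1, 39, 45)); /* diabetic, BMI 39:    2.625 */
        printf("%.3f\n", annual_rate(4, 37, 35)); /* test case 29:       10.125 */
        return 0;
    }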
    • 4.2 Specification-Based 153Notice our approach to testing the nonexclusive rules. First, we tested everyrule, exclusive and nonexclusive, in isolation. Then, we tested the remaininguntested boundary values. Next, we tested combinations of only one nonexclu-sive rule with one exclusive rule, making sure each nonexclusive rule had beentested once in combination (but not all the exclusive rules were tested in combi-nation). Finally, we tested a combination of all four nonexclusive rules with oneexclusive rule. We did not use combinations with the “decline” rules since pre-sumably there’s no way to check if the increase was correctly calculated. You might also have noticed that we managed to sneak in covering the min-imum and maximum increases. However, we probably didn’t cover every possi-ble increase. Since we didn’t test every possible pair, triple, and quadruplecombination of rules, we certainly didn’t test every way an increase could be cal-culated by that table. That topic is covered in Advanced Software Testing Vol.1. For the decision table and the boundary values, we’ve captured test coveragein the following tables to make sure we missed nothing. Table 4-23, table 4-24,and table 4-25 show decision table coverage using three coverage metrics.Table 4–23 Conditions 1 2 3 4 5 6 7 8 9 10 11 12 Smoked? Y - - - - - - - - - - - Diagnosed? - Y - - - - - - - - - - Hospitalized? - - Y - - - - - - - - - Disabled? - - - Y - - - - - - - - BMI - - - - 34-36 34-36 34-36 37-39 37-39 37-39 <17 >39 Age - - - - 18-39 40-59 >59 18-39 40-59 >59 - - Actions Increase 50 50 50 50 75 50 25 100 75 50 - - Decline - - - - - - - - - - Y Y Single Rule Coverage Test case(s) 1 2 3 4 5, 6 7, 8 9, 10 11, 12 13, 14 15, 16 17,19 18,20 Pairs of Rules Coverage Test case(s) 25 26 27 28 25 26 27 28 Maximum Combination of Rules Coverage Test case(s) 29 29 29 29 29
    • 154 4 Test Techniques ISTQB Glossary state diagram: A diagram that depicts the states that a component or system can assume and shows the events or circumstances that cause and/or result from a change from one state to another. state transition: A transition between two states of a component or system. state transition testing: A black-box test design technique in which test cases are designed to execute valid and invalid state transitions. Table 4–24 # Partition Boundary Value Test Case 1 Too thin 0 19 2 Too thin 16 17 3 No increase 17 21 4 No increase 33 22 5 Smaller increase 34 5 6 Smaller increase 36 6 7 Bigger increase 37 11 8 Bigger increase 39 12 9 Too heavy 40 18 10 Too heavy max 20 Table 4–25 # Partition Boundary Value Test Case 1 Too young 0 23 2 Too young 17 24 3 Young 18 5 4 Young 39 6 5 Mid-aged 40 7 6 Mid-aged 59 8 7 Senior 60 9 8 Senior max 10 4.2.4 State-Based Testing and State Transition Diagrams We said that after our discussion of equivalence partitioning and boundary value analysis, we would cover two techniques that would prove useful for test- ing business logic, even more useful than equivalence partitioning and bound-
    • 4.2 Specification-Based 155ary value analysis. We covered decision tables, which work very well intransactional situations, in the last section. Now we move on to state-based testing. State-based testing is ideal when wehave sequences of events that occur and conditions that apply to those eventsand the proper handling of a particular event/condition depends on the eventsand conditions that have occurred in the past. In some cases, the sequences ofevents can be potentially infinite, which of course exceeds our testing capabili-ties, but we want to have a test design technique that allows us to handle arbi-trarily long sequences of events. The underlying model is a state transition diagram or table. The diagram ortable connects beginning states, events, and conditions with resulting states andactions. To describe this interaction, consider a system. At a given time, some statusquo prevails and the system is in a steady state. Then some event occurs, someevent that the system must handle. The handling of that event might be influ-enced by one or more conditions. The event/condition combination triggers astate transition, either from the current state to a new state or from the currentstate back to itself again. In the course of the transition, the system takes one ormore actions. Given this model, we generate tests that traverse the states and transitions.The inputs trigger events and create conditions, while the expected results ofthe test are the new states and actions taken by the system. Differing coverage criteria apply for state-based testing. The weakest crite-rion requires that the tests visit every state and traverse every transition. Thiscriterion can be applied to state transition diagrams. A higher coverage crite-rion is at least one test for every row in a state transition table. Achieving “everyrow” coverage will achieve “every state and transition” coverage, which is whywe said it was a higher coverage criterion. Another potentially higher coverage criterion requires that at least one testcover each transition sequence of N or less length. The N can be 1, 2, 3, 4, orhigher. This is called alternatively Chow’s switch coverage—after ProfessorChow, who developed it—or N-1 switch coverage, after the level given to thedegree of coverage. If you cover all transitions of length one, then N-1 switchcoverage means 0 switch coverage. Notice that this is the same as the lowestlevel of coverage discussed. If you cover all transitions of length one and two,
• 156 4 Test Techniques

then N-1 switch coverage means 1 switch coverage. This is a higher level of coverage than the lowest level, of course.

Now, 1 switch coverage is not necessarily a higher level of coverage than "every-row" coverage. This is because the state transition table forces testing of state and event/condition combinations that do not occur in the state transition diagram. The so-called "switches" in N-1 switch coverage are derived from the state transition diagram, not the state transition table.

All this might be a bit confusing if you're fuzzy on the test design material covered at the Foundation level. Don't worry, though; it will be clear to you shortly.

So, what is the bug hypothesis in state-based testing? We're looking for situations where the wrong action or the wrong new state occurs in response to a particular event under a given set of conditions based on the history of event/condition combinations so far.

Figure 4–19 State transition diagram example (the states browsing, selecting, logging in, purchasing, confirmed, and left, connected by the events, conditions, and actions described in the walkthrough that follows)

Figure 4-19 shows the state transition diagram for shopping and buying items online from an e-commerce application. It shows the interaction of the system with a customer, from the customer's point of view. Let's walk through it, and we'll point out the key elements of state transition diagrams in general and the features of this one in particular.
    • 4.2 Specification-Based 157 First, notice that we have at the leftmost side a small dot-and-arrowelement labeled “initial state indicator.” This notation shows that, from thecustomer’s point of view, the transaction starts when she starts browsing thewebsite. We can click on links and browse the catalog of items, remaining in abrowsing state. Notice the looping arrow above the browsing state. The nodesor bubbles represent states, as shown by the label below the browsing state.The arrows represent transitions, as shown by the label above the loopingarrow. Next, we see that the customer can enter a “selecting” state by adding anitem to the shopping cart; “add to cart” is the event, as shown by the labelabove. The system will display a “selection dialog” where it asks the customer totell us how many of the item she wants, along with any other information weneed to add the item to the cart. Once that’s done, the customer can tell thesystem she wants to continue shopping, in which case the system displays thehome screen again and the customer is back in a browsing state. From anotation point of view, notice that the actions taken by the system are shownunder the event and after the slash symbol, on the transition arrow, as shownby the label below. Alternatively, the customer can choose to check out. At this point, sheenters a logging-in state. She enters login information. A condition applies tothat login information: either it was good or it was bad. If it was bad, the systemdisplays an error and the customer remains in the logging-in state. If it wasgood, the system displays the first screen in the purchasing dialog. Notice thatthe “bad” and “good” shown in brackets are, notationally, conditions. While in the purchasing state, the system will display screens and the cus-tomer will enter payment information. Either that information is good or bad—conditions again—which determines whether we can complete and confirm thetransaction. Once the transaction is confirmed, the customer can either resumeshopping or go somewhere else. Notice also that the user can always abandon the transaction and go else-where. When we talk about state-based testing during live courses we teach, peopleoften ask, “How do we distinguish a state, an event, or an action?” The main dis-tinctions are as follows:
    • 158 4 Test Techniques ■ A state persists until something happens—something external to the thing itself, usually—to trigger a transition. A state can persist for an indefinite period. ■ An event occurs, either instantly or in a limited, finite period. It is the something that happened—the external occurrence—that triggers the transition. Events can be triggered in a variety of ways, such as, for example, by a user with a keyboard or mouse, an external device, or even the operating system. ■ An action is the response the system has during the transition. An action, like an event, is either instantaneous or requires a limited, finite period. Often, an action can be thought of as a side effect of an event. That said, it is sometimes possible to draw the same situation differently, espe- cially when a single state or action can be split into a sequence of finer-grained states, events, and actions. We’ll see an example of that in a moment, splitting the purchase state into substates. Finally, notice that, at the outset, we said this chart is shown from the cus- tomer’s point of view. Notice that, if we drew this from the system’s point of view, it would look different. Maintaining a consistent point of view is critical when drawing these charts, otherwise you’ll end up with a nonsensical diagram. State-based testing uses a formal model, so we can have a formal procedure for deriving tests from the model. Following is a procedure that will work to derive tests that achieve state/transition coverage (i.e., 0 switch coverage): 1. Adopt a rule for where a test procedure or test step must start and where it may or must end. An example is to say that a test step must start in an initial state and may only end in a final state. The reason for the “may” or “must” wording on the ending part is because, in situations where the initial and final states are the same, you might want to allow sequences of states and transitions that pass through the initial state more than once. 2. From an allowed test starting state, define a sequence of event/condition combinations that leads to an allowed test ending state. For each transition that will occur, capture the expected action that the system should take. This is the expected result.
3. As you visit each state and traverse each transition, mark it as covered. The easiest way to do this is to print the state transition diagram and then use a marker to highlight each node and arrow as you cover it.
4. Repeat steps 2 and 3 until all states have been visited and all transitions traversed. In other words, every node and arrow has been marked with the marker.

This procedure will generate logical test cases. To create concrete test cases, you'd have to generate the actual input values and the actual output values. For this book, we intend to generate logical tests to illustrate the techniques, but remember, as we mentioned before, at some point before execution, the implementation of concrete test cases must occur.

[Figure: the e-commerce state transition diagram of figure 4-19, with the notation elements labeled—initial state indicator, states (browsing, selecting, logging in, purchasing, confirmed, left), events, conditions, actions, transitions, and final state indicator—and with the states and transitions covered by the first test shown dashed.]

Figure 4–20 Coverage check 1

Let's apply this process to the example e-commerce application we've just looked at. In figure 4-20, we redraw the state transition diagram of figure 4-19, using dashed lines to indicate states and transitions that have been covered. Here, we see two things.

First, we have the rule that says that a test must start in the initial state and must end in the final state.

Next, we generate the first logical test case.
1. (browsing, click link, display, add to cart, selection dialog, continue shopping, display, add to cart, selection dialog, checkout, login dialog, login[bad], error, login[good], purchase dialog, purchase[bad], error, purchase[good], confirmation, resume shopping, display, abandon, left)

At this point, we check completeness of coverage, which we've been tracking in our state transition diagram. As you can see in figure 4-20, we have covered all of the states and most transitions, but not all of the transitions. We need to create some more test cases.

[Figure: the same diagram as figure 4-20, now with every state and transition marked as covered.]

Figure 4–21 Coverage check completed

Figure 4-21 shows the test coverage achieved by the following additional test cases (case 1 was listed earlier):

2. (browsing, add to cart, selection dialog, abandon, <no action>, left)
3. (browsing, add to cart, selection dialog, checkout, login dialog, abandon, <no action>, left)
4. (browsing, add to cart, selection dialog, checkout, login dialog, login[good], purchase dialog, abandon, <no action>, left)
5. (browsing, add to cart, selection dialog, continue shopping, display, add to cart, selection dialog, checkout, login dialog, login[good], purchase dialog, purchase[good], confirmation, go elsewhere, <no action>, left)
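Steps 3 and 4 of the procedure—marking states and transitions as you cover them—can also be done in code rather than with a marker. The following C sketch is our own illustration, not part of the e-commerce system; the state names and the transition list are hypothetical transcriptions of the diagram. Feeding the logical tests through step() and then calling report() reproduces the highlighting check: an empty report means state/transition (0-switch) coverage has been reached, at least for the transitions the model includes.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical transcription of the figure 4-20 model; names are ours. */
    typedef enum { BROWSING, SELECTING, LOGGING_IN, PURCHASING,
                   CONFIRMED, LEFT } state_t;

    typedef struct {
        state_t     from;     /* current state                 */
        const char *event;    /* event[condition] that fires   */
        const char *action;   /* expected action (the oracle)  */
        state_t     to;       /* resulting state               */
        int         covered;  /* set to 1 once a test uses it  */
    } transition_t;

    static transition_t model[] = {
        { BROWSING,   "click link",        "display",          BROWSING,   0 },
        { BROWSING,   "add to cart",       "selection dialog", SELECTING,  0 },
        { SELECTING,  "continue shopping", "display",          BROWSING,   0 },
        { SELECTING,  "check out",         "login dialog",     LOGGING_IN, 0 },
        { LOGGING_IN, "login[bad]",        "error",            LOGGING_IN, 0 },
        { LOGGING_IN, "login[good]",       "purchase dialog",  PURCHASING, 0 },
        { PURCHASING, "purchase[bad]",     "error",            PURCHASING, 0 },
        { PURCHASING, "purchase[good]",    "confirmation",     CONFIRMED,  0 },
        { CONFIRMED,  "resume shopping",   "display",          BROWSING,   0 },
        { CONFIRMED,  "go elsewhere",      "<no action>",      LEFT,       0 },
        /* the abandon transitions to LEFT would be listed here as well */
    };
    #define N_TRANS (sizeof model / sizeof model[0])

    /* Apply one event to the current state, marking the transition covered. */
    static state_t step(state_t current, const char *event) {
        for (size_t i = 0; i < N_TRANS; i++)
            if (model[i].from == current && strcmp(model[i].event, event) == 0) {
                model[i].covered = 1;
                return model[i].to;
            }
        return current;  /* undefined state-event pair: nothing to mark */
    }

    /* List every transition no test has exercised yet. */
    static void report(void) {
        for (size_t i = 0; i < N_TRANS; i++)
            if (!model[i].covered)
                printf("not covered: state %d, event %s\n",
                       (int)model[i].from, model[i].event);
    }

    int main(void) {
        /* A fragment of logical test 1, for illustration. */
        state_t s = BROWSING;
        s = step(s, "click link");
        s = step(s, "add to cart");
        s = step(s, "continue shopping");
        report();  /* everything still listed must be covered by later tests */
        return 0;
    }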
Remember that you're not done generating tests until every state and every transition has been highlighted, as shown in figure 4-21.

One rule of thumb can help estimate the number of tests needed to get this minimum coverage: we usually need as many test cases as there are transitions entering the final state. In our example, there are five arrows pointing to the left state. While this rule of thumb often works, it is not officially part of the test design technique.

4.2.4.1 Superstates and Substates

In some cases, it makes sense to unfold a single state into a superstate consisting of two or more substates. In figure 4-22, you can see that we've taken the purchasing state from the e-commerce example and expanded it into three substates.

[Figure: the purchasing state expanded into a superstate with three substates—entering address, specifying payment, and editing order—entered via login[good]/address dialog; address[good]/payment dialog and payment[good]/order dialog advance between substates; address[bad], payment[bad], and order[bad] each produce an error; purchase[good]/confirmation leaves the superstate; each substate has its own abandon transition to the left state.]

Figure 4–22 Superstates and substates

The rule for basic coverage here follows simply: Cover all transitions into the superstate, all transitions out of the superstate, all substates, and all transitions within the superstate.

Note that, in our example, this would increase the number of tests because we now have three "abandon" transitions to the "left" state out of the purchasing superstate rather than just one transition from the purchasing state. This would also add a finer-grained element to our tests—i.e., more events and actions—as well as make sure we tested at least three different types of bad purchasing entries.
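If you think of the model in code terms, expanding a superstate simply flattens its substates into the state list. The C fragment below is our own notation, with hypothetical names, not anything prescribed by the technique; the helper answers whether a state lies inside the purchasing superstate, which is exactly why a transition defined on the superstate boundary, such as abandon, must now be exercised from each of the three substates.

    /* Hypothetical flattening of figure 4-22: the single PURCHASING state
       is replaced by its three substates; all other names are unchanged. */
    typedef enum {
        BROWSING, SELECTING, LOGGING_IN,
        ENTERING_ADDRESS, SPECIFYING_PAYMENT, EDITING_ORDER, /* substates */
        CONFIRMED, LEFT
    } state_t;

    /* True when a state is inside the purchasing superstate. A transition
       drawn on the superstate boundary (e.g., abandon) applies to every
       state for which this returns 1, so each needs its own test. */
    int in_purchasing(state_t s) {
        return s == ENTERING_ADDRESS ||
               s == SPECIFYING_PAYMENT ||
               s == EDITING_ORDER;
    }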
ISTQB Glossary

state table: A grid showing the resulting transitions for each state combined with each possible event, showing both valid and invalid transitions.

4.2.4.2 State Transition Tables

State transition tables are useful because they force us—and the business analysts and the system designers—to consider combinations of states with event/condition combinations that they might have forgotten.

To construct a state transition table, you first list all the states from the state transition diagram. Next, you list all the event/condition combinations shown on the state transition diagram. Then, you create a table that has a row for each state with every event/condition combination. Each row has four fields:

■ Current state
■ Event/condition
■ Action
■ New state

For those rows where the state transition diagram specifies the action and new state for the given combination of current state and event/condition, we can populate those two fields from the state transition diagram. However, for the other rows in the table, we find undefined situations, i.e., situations where the behavior of the system is not specified.

We can now go to the business analysts, system designers, and other such people and ask, "So, what exactly should happen in each of these situations?"

You might hear them say, "Oh, that can never happen!" As a technical test analyst, you know what that means. Your job now is to figure out how to make it happen.

You might hear them say, "Oh, well, I'd never thought of that." That probably means you just prevented a bug from ever happening, if you are doing test design during system design.

As an example, consider figure 4-23. Assume that the customer is in the browsing state, having previously put one or more items into the cart. If she decides to check out while in the browsing state, she cannot: that state-event/condition combination is not defined. We have found a missing requirement!
If we actually built the system in this way, we would force our customers to only be able to check out immediately after putting an item in the cart. Our customer now must put an additional item into the cart to be able to check out. Or, she might just decide to move to our competitor's site! A state transition table can be an invaluable tool when looking for missing requirements, much the same way a decision table can.

Current State | Event/condition   | Action           | New State
Browsing      | Click link        | Display          | Browsing
Browsing      | Add to cart       | Selection dialog | Selecting
Browsing      | Continue shopping | Undefined        | Undefined
Browsing      | Check out         | Undefined        | Undefined
Browsing      | Login[bad]        | Undefined        | Undefined
Browsing      | Login[good]       | Undefined        | Undefined
Browsing      | Purchase[bad]     | Undefined        | Undefined
Browsing      | Purchase[good]    | Undefined        | Undefined
Browsing      | Abandon           | <no action>      | Left
Browsing      | Resume shopping   | Undefined        | Undefined
Browsing      | Go elsewhere      | Undefined        | Undefined
Selecting     | Click link        | Undefined        | Undefined
(Fifty-three rows, generated in the pattern shown above, not shown)
Left          | Go elsewhere      | Undefined        | Undefined

Figure 4–23 State transition table example

Figure 4-23 shows an excerpt of the table we would create for the e-commerce example we've been looking at so far. We have six states:

■ Browsing
■ Selecting
■ Logging in
■ Purchasing
■ Confirmed
■ Left

We have 11 event/condition combinations:

■ Click link
■ Add to cart
■ Continue shopping
■ Check out
■ Login[bad]
■ Login[good]
■ Purchase[bad]
■ Purchase[good]
■ Abandon
■ Resume shopping
■ Go elsewhere

That means our state transition table should have 66 rows, one for each possible pairing of a specific state with a specific event/condition combination.

To derive a set of tests that covers the state transition table, we can follow this procedure. Notice that we build on an existing set of tests created from the state transition diagram to achieve state/transition or 0-switch coverage:

1. Start with a set of tests (including the starting and stopping state rule), derived from a state transition diagram, that achieves state/transition coverage.
2. Construct the state transition table and confirm that the tests cover all the defined rows. If they do not, then either you didn't generate the existing set of tests properly or you didn't generate the table properly, or the state transition diagram is wrong. Do not proceed until you have identified and resolved the problem, including re-creating the state transition table or the set of tests, if necessary.
3. Select a test that visits a state for which one or more undefined rows exist in the table. Modify that test to attempt to introduce the undefined event/condition combination for that state. Notice that the action in this case is undefined.
4. As you modify the tests, mark the row as covered. The easiest way to do this is to take a printed version of the table and use a marker to highlight each row as covered.
5. Repeat steps 3 and 4 until all rows have been covered.

Again, this procedure will generate logical test cases. Eventually, either formally (in a scripted test case) or informally (using experience-based testing), you'll need to choose data for execution of the tests.
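The table construction in step 2 is mechanical enough to automate. The C sketch below is our own illustration with hypothetical names: it pairs every state with every event/condition, looks each pair up in a list of defined transitions (only an excerpt is included here), and prints Undefined for the rest. The 66 rows of the e-commerce table fall out of the two loops.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical lists mirroring the e-commerce diagram; names are ours. */
    static const char *states[] = { "Browsing", "Selecting", "Logging in",
                                    "Purchasing", "Confirmed", "Left" };
    static const char *events[] = { "Click link", "Add to cart",
                                    "Continue shopping", "Check out",
                                    "Login[bad]", "Login[good]",
                                    "Purchase[bad]", "Purchase[good]",
                                    "Abandon", "Resume shopping",
                                    "Go elsewhere" };

    /* The defined rows, taken from the state transition diagram. Only an
       excerpt is shown; the full model would list every defined transition. */
    static const struct { const char *state, *event, *action, *next; } defined[] = {
        { "Browsing", "Click link",  "Display",          "Browsing"  },
        { "Browsing", "Add to cart", "Selection dialog", "Selecting" },
        { "Browsing", "Abandon",     "<no action>",      "Left"      },
    };

    int main(void) {
        /* 6 states x 11 event/conditions = 66 rows, defined or not. */
        for (size_t s = 0; s < sizeof states / sizeof *states; s++)
            for (size_t e = 0; e < sizeof events / sizeof *events; e++) {
                const char *action = "Undefined", *next = "Undefined";
                for (size_t d = 0; d < sizeof defined / sizeof *defined; d++)
                    if (!strcmp(defined[d].state, states[s]) &&
                        !strcmp(defined[d].event, events[e])) {
                        action = defined[d].action;
                        next   = defined[d].next;
                    }
                printf("%-10s | %-17s | %-16s | %s\n",
                       states[s], events[e], action, next);
            }
        return 0;
    }

Each Undefined row the loop prints is a question for the business analysts and, failing a good answer, a candidate test.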
As an example of deriving state table-based tests, we can build on the e-commerce example already shown. Start with an existing test from the state transition testing; here we select test 4 from earlier:

(browsing, add to cart, selection dialog, checkout, login dialog, login[good], purchase dialog, abandon, <no action>, left)

Now, from here we start to create modified tests to cover undefined browsing event/conditions, and those undefined conditions only.

One test is as follows:

(browsing, attempt: continue shopping, action undefined, add to cart, selection dialog, checkout, login dialog, login[good], purchase dialog, abandon, <no action>, left)

Another test:

(browsing, attempt: check out, action undefined, add to cart, selection dialog, checkout, login dialog, login[good], purchase dialog, abandon, <no action>, left)

There are six other modified tests for browsing, which we've not shown. As you can see, it's a mechanical process to generate these tests. As long as you are careful to keep track of which rows you've covered—using the marker trick we mentioned earlier, for example—it's almost impossible to forget a test.

Now, you'll notice that we only included one undefined event/condition combination in each test step. Why? This is a variant of the equivalence partitioning rule that we should not create invalid test cases that combine multiple invalid event/conditions. In this case, each row corresponds to an invalid event/condition. If we try to cover two rows in a single test step, we can't be sure the system will remain testable after the first invalid event/condition.

Notice that we indicated that the action is undefined. What is the ideal system behavior under these conditions? Well, the best-case scenario is that the undefined event/condition combination is impossible to trigger. If we cannot get there from here—no menu items, no buttons, no hot-key combinations, no possible URL edits—so much the better. Next best is that the undefined event/condition pair is ignored or—better yet—rejected with an intelligent error message. At that point, processing continues normally. In the absence of any meaningful input from business analysts, the requirements specification, system designers, or any other authority, we would take the position that any other outcome is a bug, including some inscrutable error message like, "What just happened can't happen." (No, we are not making that up; an RBCS course attendee once told Rex she had seen exactly that message when inputting an unexpected value.)
ISTQB Glossary

N-switch testing: A form of state transition testing in which test cases are designed to execute all valid sequences of N+1 transitions.

4.2.4.3 Switch Coverage

Figure 4-24 shows how we can generate sequences of transitions using the concept of switch coverage. We will illustrate this concept with the e-commerce example we've used so far.

[Figure: the e-commerce state transition diagram redrawn with the states relabeled A through F (E is the final state) and the transitions relabeled 1 through 14, above the following switch table.]

0-switch: A1, A2, A9
1-switch: A1A1, A1A2, A1A9; A9B10, A9B8, A9B3

0-switch: B10, B8, B3
1-switch: B10C14, B10C11, B10C4; B8A1, B8A2, B8A9

0-switch: C14, C11, C4
1-switch: C14C14, C14C11, C14C4; C11D13, C11D12, C11D5

0-switch: D13, D12, D5
1-switch: D13D13, D13D12, D13D5; D12F6, D12F7

0-switch: F6, F7
1-switch: F7A1, F7A2, F7A9

Figure 4–24 N-1 switch coverage example

At the top of figure 4-24, you see the same state transition diagram as before, except we have replaced the state labels with letters and the transition labels with numbers. Now a state/transition pair can be specified as a letter followed by a number. Notice that we are not bothering to list, in the table, a letter after the number, because it's unambiguous from the diagram what state we'll be in after the given transition.
There is only one arrow labeled with a given number that leads out of a state labeled with a given letter, and that arrow lands on exactly one state.

The table contains two types of columns. The first contains the state/transition pairs that we must cover to achieve 0-switch coverage. Study this for a moment, and assure yourself that, by designing tests that cover each state/transition pair in the 0-switch columns, you'll achieve state/transition coverage as discussed previously.

Constructing the 0-switch columns is easy. The first row consists of the first state, with a column for each transition leaving that state. There are at most three transitions from the A state. Repeat that process for each state for which there is an outbound transition. Notice that the E state doesn't have a row, and that's because E is a final state with no outbound transition. Notice also that, for this example, there are at most three transitions from any given state.

The 1-switch columns are a little trickier to construct, but there's a regularity here that makes it mechanical if you are meticulous. Notice, again, that after each transition occurs in the 0-switch situation, we are left in a state that is implicit in the 0-switch cells. As mentioned, there are at most three transitions from any given state. So that means that, for this example, each 0-switch cell can expand to at most three 1-switch cells.

So, we can take each 0-switch cell for the A row and copy it into three cells in the 1-switch columns, for nine cells in the A row. Now, we ask ourselves, for each triple of cells in the A row of the 1-switch columns, what implicit state did we end up in? We can then refer to the appropriate 0-switch cells to populate the remainder of each 1-switch cell.

Notice that the blank cells in the 1-switch columns indicate situations where we entered a state in the first transition from which there was no outbound transition. In figure 4-24, that is the state labeled E, which was labeled "Left" on the full-sized diagram.

So, given a set of state/transition sequences like those shown—whether 0-switch, 1-switch, 2-switch, or even higher—how do we derive test cases to cover those sequences and achieve the desired level of coverage? Again, we're going to build on an existing set of tests created from the state transition diagram to achieve state/transition or 0-switch coverage.
1. Start with a set of tests (including the starting and stopping state rule), derived from a state transition diagram, that achieves state/transition coverage.
2. Construct the switch table using the technique shown previously. Once you have, confirm that the tests cover all of the cells in the 0-switch columns. If they do not, then either you didn't generate the existing set of tests properly or you didn't generate the switch table properly, or the state transition diagram is wrong. Do not proceed until you have identified and resolved the problem, including re-creating the switch table or the set of tests, if necessary. Once you have that done, check for higher-order switches already covered by the tests.
3. Now, using 0-switch sequences as needed, construct a test that reaches a state from which an uncovered higher-order switch sequence originates. Include that switch sequence in the test. Check to see what state this left you in. Ideally, another uncovered higher-order switch sequence originates from this state, but if not, see if you can use 0-switch sequences to reach such a state. You're crawling around in the state transition diagram looking for ways to cover higher-order sequences. Repeat this for the current test until the test must terminate.
4. As you construct tests, mark the switch sequences as covered once you include them in a test. The easiest way to do this is to take a printed version of the switch table and use a marker to highlight each cell as covered.
5. Repeat steps 3 and 4 until all switch sequences have been covered.

Again, this procedure will generate logical test cases.

[Figure: the lettered diagram and switch table of figure 4-24, repeated with shading to show which cells the existing and new tests cover.]

Figure 4–25 Deriving tests example
In figure 4-25, we see the application of the derivation technique covered in figure 4-24 to the e-commerce example we've used. After finishing the second step, that of assessing coverage already attained via 0-switch coverage, we can see that most of the table is already shaded. Those are the lighter-shaded cells, which are covered by the five existing state/transition cover tests.

Now, we generate five new tests to achieve 1-switch coverage. Those are shown below. The darker-shaded cells are covered by these five new 1-switch cover tests.

1. (A1A1A2)
2. (A9B8A1A9B8A2)
3. (A9B10C14C14C4)
4. (A9B10C11D13D13D5)
5. (A9B10C11D12F7A1A9B10C11D12F7A9)

We need to mention something about this algorithm for deriving higher-order switch coverage tests, as well as the one given previously for row-coverage tests. Both build on an existing set of tests that achieve state/transition coverage. That is efficient from a test design point of view. It's also conservative from a test execution point of view because we cover the less challenging stuff first and then move on to the more difficult tests.

However, it is quite possible that, starting from scratch, a smaller set of tests could be built, both for the row coverage situation and for the 1-switch coverage situation. If the most important thing is to create the minimum number of tests, then you should look for ways to reduce the tests created, or modify the derivation procedures given here to start from scratch rather than build on an existing set of 0-switch tests.
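Because a 1-switch cell is just a transition chained to a transition leaving its destination state, the whole table can be generated by two nested loops. This C sketch is ours; the transition list is our transcription of figure 4-24 (letter = origin state, number = transition label), and the inner pairing reproduces the 1-switch cells. The blank cells fall out naturally, since nothing leaves the final state E.

    #include <stdio.h>

    /* Transitions transcribed from figure 4-24: label n leaves state 'from'
       and lands on state 'to'. E is the final state with no outbound arrows. */
    typedef struct { char from; int label; char to; } trans_t;

    static const trans_t t[] = {
        {'A',1,'A'},  {'A',2,'E'},  {'A',9,'B'},
        {'B',10,'C'}, {'B',8,'A'},  {'B',3,'E'},
        {'C',14,'C'}, {'C',11,'D'}, {'C',4,'E'},
        {'D',13,'D'}, {'D',12,'F'}, {'D',5,'E'},
        {'F',6,'E'},  {'F',7,'A'},
    };
    #define N (sizeof t / sizeof t[0])

    int main(void) {
        /* 0-switch sequences: every single transition. */
        for (size_t i = 0; i < N; i++)
            printf("%c%d ", t[i].from, t[i].label);
        printf("\n");

        /* 1-switch sequences: every transition chained to every transition
           that leaves the state it lands on. Higher orders nest further. */
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                if (t[i].to == t[j].from)
                    printf("%c%d%c%d ", t[i].from, t[i].label,
                                        t[j].from, t[j].label);
        printf("\n");
        return 0;
    }

Running it prints exactly the cells of the 0-switch and 1-switch columns; a third nested loop would give the 2-switch sequences, and so on.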
4.2.4.4 State Testing with Other Techniques

Let's finish our discussion of state-based testing by looking at a couple of interesting questions. First, how might equivalence partitioning and boundary value analysis combine with state-based testing? The answer is, quite well.

From the e-commerce example, suppose that the minimum purchase is $10 and the maximum is $10,000. In that case, we can perform boundary value analysis as shown in figure 4-26, performed on the purchase[good] and purchase[bad] event/condition combinations. By covering not only transitions, rows, and transition sequences, but also boundary values, this forces us to try different purchase amounts.

[Figure: equivalence partitioning on pay[good]—American Express, MasterCard, Visa, Discover—and boundary value analysis on the purchase amount: invalid partitions negative (-max to -0.01), zero (0), and too low (0.01 to 9.99); valid partition in range (10 to 10,000); invalid partition too large (10,000.01 to max); mapped onto the purchase[bad] and purchase[good] event/conditions.]

Figure 4–26 Equivalence partitions and boundary values

We can also apply equivalence partitioning to the pay[good] event/condition combination. For example, suppose we accept four different types of credit cards. By covering not only transitions, rows, and transition sequences, but also equivalence partitions, this forces us to try different payment types.

Now, to come full circle on a question we brought up at the start of the discussion of these two business-logic test techniques: When do we use decision tables and when do we use state diagrams? This can be, in some cases, a matter of taste.

The decision table is compact. If we're not too worried about the higher-order coverage, or the effect of states on the tests, many state-influenced situations can be modeled as decision tables, using conditions to model states. However, if the decision table's conditions section starts to become very long, you're probably stretching the technique. Also, keep in mind that test coverage is usually more thorough using state-based techniques. In most cases, one technique or the other will clearly fit better. If you are at a loss, try both and see which feels most appropriate.

4.2.4.5 State Testing Exercise

This exercise consists of three parts:

1. Using the following semiformal use case, translate it into a state transition diagram, shown from the point of view of the Telephone Banker.
2. Generate test cases to cover the states and transitions (0-switch).
3. Generate a switch table to the 1-switch level.
Table 4–26

Actor: Telephone Banker

Preconditions: The Globobank Telephone Banker is logged into the HELLOCARMS System.

Normal Workflow:
1. The Telephone Banker receives a phone call from a Customer.
2. The Telephone Banker interviews the Customer, entering information into the HELLOCARMS System through a web browser interface on his Desktop.
3. Once the Telephone Banker has gathered the information from the Customer, the HELLOCARMS System determines the credit-worthiness of the Customer using the Scoring Mainframe.
4. Based on all of the Customer information, the HELLOCARMS System displays various Home Equity Products that the Telephone Banker can offer to the Customer.
5. If the Customer chooses one of these Products, the Telephone Banker will conditionally confirm the Product.
6. The interview ends. The Telephone Banker directs the HELLOCARMS System to transmit the loan information to the Loan Document Printing System (LoDoPS) in the Los Angeles Datacenter for origination.

Exception Workflow 1: During step 2 of the normal workflow, if the Customer is requesting a large loan or borrowing against a high-value property, the Telephone Banker escalates the application to a Senior Telephone Banker who decides whether to proceed with the application. If the decision is to proceed, then the Telephone Banker completes the remainder of step 2 and proceeds normally. If the decision is not to proceed, the Telephone Banker informs the Customer that the application is declined and the interview ends.

Exception Workflow 2: During step 4 of the normal workflow, if the System does not display any Home Equity Products as available, the Telephone Banker informs the Customer that the application is declined and the interview ends.

Exception Workflow 3: During step 5 of the normal workflow, if the product chosen by the Customer was a Home Equity Loan, the Telephone Banker offers the Customer the option of applying for life insurance to cover the loan. If the Customer wants to apply, the following steps occur:
1. The Telephone Banker interviews the Customer, entering health information into the HELLOCARMS System through a web browser interface on his Desktop.
2. The HELLOCARMS System processes the information as described in the previous exercise. One of two outcomes will occur:
   a. The HELLOCARMS System declines to offer insurance based on the health information given. The Telephone Banker informs the Customer that the insurance application was denied. This exception workflow is over and processing returns to step 5.
   b. The HELLOCARMS System offers insurance at a rate based on the loan size and the health information given. The Telephone Banker informs the Customer of the offer.
3. The Customer makes one of two decisions:
   a. Accept the offer. The Telephone Banker makes the life insurance purchase part of the overall application. This exception workflow is over and processing returns to step 5.
   b. Reject the offer. The Telephone Banker excludes the life insurance purchase from the overall application. This exception workflow is over and processing returns to step 5.
Exception Workflow 4: During any of steps 1 through 5 of the normal workflow, if the Customer chooses to end the interview without continuing the process or selecting a product, the application is cancelled and the interview ends.

Exception Workflow 5: If no Telephone Banker is logged into the system (e.g., because the system is down) and step 1 of the normal workflow begins, the following steps occur:
1. The Telephone Banker takes the information manually. At the end of the interview, the Telephone Banker informs the Customer that a Telephone Banker will call back shortly with the decision on the application.
2. Once a Telephone Banker is logged into the System, the application information is entered into HELLOCARMS and normal processing resumes at step 2.
3. The Telephone Banker calls the Customer once one of the following outcomes has occurred:
   a. Step 5 of normal processing is reached. Processing continues at step 5.
   b. At step 2 of normal processing, exception workflow 1 was triggered. Processing continues at step 2.
   c. At step 4 of normal processing, exception workflow 2 was triggered. No processing remains to be done.

Postconditions: Loan application is in the LoDoPS system for origination.

4.2.4.6 State Testing Exercise Debrief

1. Create state transition diagram.

Figure 4-27 shows the state transition diagram we generated based on the preceding semiformal use case.

[Figure: states waiting, gathering loan info, escalating, offering, gathering insurance info, and shift over, connected by transitions including phone call/loan screens, exceed[value or loan]/escalate, approved/resume, system loan decline/archive, system loan offer/offer screen, offer insurance/insurance screens, cust ins accept/add to package, cust ins reject/archive, system ins reject/archive, cust loan accept/send to LoDoPS, cust loan reject/archive, cust cancel/archive, and end of shift/log out.]

Figure 4–27 HELLOCARMS state transition diagram
2. Generate test cases to cover the states and transitions (0-switch).

Let's adopt a rule that says that any test must start in the initial waiting state and may only end in the waiting state or the shift over state. To achieve state and transition coverage, the following tests will suffice:

1. (waiting, phone call, loan screens, exceed[value], escalate, approved, resume, system loan offer, offer screen, offer insurance, insurance screens, cust ins accept, add to package, cust loan accept, send to LoDoPS, waiting)
2. (waiting, phone call, loan screens, exceed[loan], escalate, approved, resume, system loan offer, offer screen, offer insurance, insurance screens, cust ins reject, archive, cust loan accept, send to LoDoPS, waiting)
3. (waiting, phone call, loan screens, system loan offer, offer screen, offer insurance, insurance screens, system ins reject, archive, cust loan reject, archive, waiting)
4. (waiting, phone call, loan screens, exceed[loan], escalate, system loan decline, archive, waiting)
5. (waiting, phone call, loan screens, system loan decline, archive, waiting)
6. (waiting, phone call, loan screens, cust cancel, archive, waiting)
7. (waiting, phone call, loan screens, exceed[loan], escalate, cust cancel, archive, waiting)
8. (waiting, phone call, loan screens, system loan offer, offer screen, cust cancel, archive, waiting)
9. (waiting, phone call, loan screens, system loan offer, offer screen, offer insurance, insurance screens, cust cancel, archive, waiting)
10. (waiting, end of shift, log out, shift over)

Notice that we didn't do explicit boundary value or equivalence partitioning testing of, say, the loan amount or the property value, though we certainly could have.

Also, note that this is an example of when our rule of thumb for the number of needed test cases did not work. This usually happens because there are multiple paths between two non-final states (in this case, between offering and gathering insurance info) that must be tested with separate test cases.

3. Generate a switch table to the 1-switch level.

First, redraw the state transition diagram of figure 4-27 to make it easier to work with. It should look like figure 4-28.
[Figure: the HELLOCARMS diagram redrawn with the states relabeled A through F and the transitions relabeled 1 through 17.]

Figure 4–28 HELLOCARMS state transition diagram for switch table

From the diagram, we can generate the 1-switch table shown in table 4-27. Notice that we have used patterns in the diagram to generate the table. For example, the maximum number of outbound transitions for any state in the diagram is four, so we use four columns in both the 0-switch and 1-switch columns. We started with six 1-switch rows per 0-switch row because there are six states, though we were able to delete most of those rows as we went along. This leads to a sparse table, but who cares as long as it makes generating this beast easier.

Table 4–27

0-switch: A1, A12
1-switch: A1F2, A1F6, A1F16, A1F17

0-switch: B3, B4, B7
1-switch: B3A1, B3A12, B4A1, B4A12, B7F2, B7F6, B7F16, B7F17

0-switch: C5, C9, C10, C11
1-switch: C5A1, C5A12, C9D8, C9D13, C9D14, C9D15, C10D8, C10D13, C10D14, C10D15, C11D8, C11D13, C11D14, C11D15

0-switch: D8, D13, D14, D15
1-switch: D8C5, D8C9, D8C10, D8C11, D13A1, D13A12, D14A1, D14A12, D15A1, D15A12

0-switch: F2, F6, F16, F17
1-switch: F2B3, F2B4, F2B7, F6D8, F6D13, F6D14, F6D15, F16A1, F16A12, F17A1, F17A12
4.2.5 Requirements-Based Testing Exercise

This exercise requires you to select which specification-based techniques to use to test requirement element 040-020-030 from the HELLOCARMS System Requirements Document.

The exercise consists of two parts:

1. Select appropriate techniques for test design.
2. Apply those techniques to generate tests, achieving the coverage criteria for the techniques.

As always, check your work on the first part before proceeding to the second part. The solutions are shown in the following section.

4.2.6 Requirements-Based Testing Exercise Debrief

The requirement in question reads as follows:

Load App Server to no more than 30% CPU and 30% Resource utilization average rate with peak utilization never more than 80% when handling 4,000 simultaneous (concurrent) application submissions.

We have discussed four specification-based techniques in this section: equivalence partitioning, boundary value analysis, decision tables, and state transition testing. The former two will be very useful for testing this requirement; however, no clear value in using decision tables or state transition testing for this example comes immediately to mind.

The following analysis will be done with the requirement as it stands. While the CPU requirement is clear, the resource utilization requirement is not. Which resources? Since there are dozens of resources that we can monitor, it is not really clear. This is often the case when we get requirements documents that have not been subjected to rigorous static review.

For this exercise, we have decided to pick a few main resources: physical disk, memory utilization, page file usage, and network utilization as a percentage of bandwidth. We might expect (hope?) that we would get additional clarity via static review of the requirements.

One more clarification. Of the resources we are looking at, we do not believe that any of them, other than CPU, can be run at a full 100 percent. For each one, there will be a maximum utilization somewhat below 100 percent.
We would expect the modeling portion of the performance test to establish exactly what the reasonable top end will be. For our equivalence class identification and boundary value analysis, we will call that top end 100 percent and manage our peak utilization to 80 percent of that value. For CPU, 80 percent would actually mean 80 percent usage.

Based on the requirement, all of the measurements would have the same equivalence classes and boundary values, as shown in figure 4-29 and figure 4-30.

[Figure: instantaneous measurements on a 0% to 100% scale—average region up to 30%, sometimes-acceptable region from 30% to 80%, never region above 80%.]

Figure 4–29 Instantaneous measurement boundaries

[Figure: average measurements on a 0% to 100% scale—always region at or below 30%, never region above 30%.]

Figure 4–30 Average measurement boundaries

For instantaneous measurements, the valid high boundary would be 80 percent. The invalid boundary would be just over 80 percent. Epsilon would be dependent on the quality of our measuring tools (for this exercise, we will assume an epsilon of zero). For average measurements, over the life of the test, 30 percent would be the valid high boundary. The invalid high boundary would be just over 30 percent. Exact values are going to depend on the instrumentation used.

Testing would consist of modeling the system, the test environment, and other setup tasks consistent with the techniques discussed in chapters 5 and 9.

We would start executing the test via our performance tool by applying virtual users to the system slowly, making sure that functionality is working as expected. Monitoring tools would be used to measure instantaneous values and store them for later calculation of averages. The load would consist of virtual users performing all acceptable operations, modeling the real-world usage of the system. Over a defined time, we would increase loading until the system is clocking at the rated value of concurrent users.
All of this presupposes that we have hardware equivalent to that of production. This assumption is often wrong in our experience. If the hardware that we are testing on is appreciably smaller than production is expected to be, we would have to use extrapolation to try to determine the actual scaled values the test should return.

ISTQB Glossary

structure-based design technique (white-box test design technique): Procedure to derive and/or select test cases based on an analysis of the internal structure of a component or system.

4.3 Structure-Based

Learning objectives
(K2) List examples of typical defects to be identified by each specific structure-based technique.7
(K3) Write test cases in real-life using the following test design techniques (the tests shall achieve a given model coverage).
  – Statement testing
  – Decision testing
  – Condition determination testing
  – Multiple condition testing
(K4) Analyze a system in order to determine which structure-based technique to apply for specific test objectives.
(K2) Understand each structure-based technique and its corresponding coverage criteria and when to use it.
(K4) Be able to compare and analyze which structure-based technique to use in different situations.

Structure-based testing uses the internal structure of the system as a test basis for deriving dynamic test cases. In other words, we are going to use information about how the system is designed and built to derive our tests.

7. In the ISTQB Advanced Syllabus, 2007, the learning objective is actually incorrect. It says "specific specification-based technique" but clearly means "structure-based".
The question that should come to mind is why. We have all kinds of specification-based (black-box) testing methods to choose from. We just spent dozens of pages going through four of the many different black-box design methods. Why do we need more? We don't have time or resources to spare for extra testing, do we?

Well, consider a world-class, excellent system test team using all black-box and experience-based techniques. Suppose they go through all of their testing, using decision tables, state-based tests, boundary analysis, and equivalence classes. They do exploratory and attack-based testing and use error guessing and checklist-based methods (which we will discuss later). At the end of that, have they done enough testing?

Maybe. But research has shown that even with all of that testing, and all of that effort, they may have missed a few things. As much as 70 percent of all of the code in the system might never have been executed once! Not once!

How can that be? Well, a good system is going to have a lot of code that is only there to handle the unusual, exceptional conditions that may occur. The happy path is often fairly straightforward to build—and test. And, if everyone were an expert, and no one ever made mistakes, and everyone followed the happy path without deviation, we would not need to worry so much about testing the rest. If systems did not sometimes go down, and networks did not sometimes fail, and databases didn't get busy and stuff didn't happen...

But, unfortunately, many people are novices at using software, and even experts forget things. And people do make mistakes and multi-strike the keys and look away at the wrong time. And virtually no one follows only the happy path without stepping off it occasionally. Stuff happens. And the software must be written so that when stuff happens, it does not roll over and die.

To handle these less-likely conditions, developers design systems and write code to survive the bad stuff. That makes systems convoluted and complex. Levels of complexity are placed on top of levels of complexity; the resulting system is usually hard to test well. We have to be able to look inside so we can test all of those levels.

In addition, black-box testing is predicated on having models that expose the behaviors and list all requirements. Unfortunately, no matter how complete, not all behaviors and requirements are going to be visible to the testers. Requirements are often changed on the fly, and features are added, changed, or removed.
Functionality often requires the developers to build "helper" functionality to be able to deliver. Internal data flows often occur between hidden devices that have asynchronous timing triggers, invisible to black-box testers.

Much of white-box testing is concerned with coverage—making sure that we have tested everything we can based on the context of project needs. Using white-box testing on top of black-box testing allows us to measure the coverage we achieved and add more testing when needed to make sure we have tested all of the stuff we wanted to. In the following sections, we are going to discuss how to design, create, and execute white-box tests.

4.3.1 Control-Flow Testing

Our first step into structural testing will be to discuss a technique called control-flow testing. Control-flow testing is done through control-flow graphs, a way of abstracting a code module in order to better understand what it does. Control-flow graphs give us a visual representation of the structure of the code. The algorithm for all control-flow testing consists of converting a section of the code into a control-flow graph and then analyzing the possible paths through the graph. There are a variety of techniques that we can apply to decide just how thoroughly we want to test the code. Then we can create test cases to test to that chosen level.

If there are different levels of control-flow testing we can choose, we need to come up with a differentiator that helps us decide which level of coverage to choose. How much should we test? Possible answers range from no testing at all (a complete laissez-faire approach) to total exhaustive testing, hitting every possible path through the software. The end points are actually a little silly; no one is going to build a system and send it out without running it at least once. At the other end of the spectrum, exhaustive testing would require an infinite amount of time and resources. In the middle, however, there is a wide range of coverage levels that are possible.

We will look at different reasons for testing to some of these different levels of coverage later. In this section, we just want to discuss what these levels of control-flow coverage are named.

At the low end of control-flow testing, we have statement coverage. Synonyms for this include instruction coverage and code coverage; each one means the same thing: Have we exercised, at one time or another, every single line of code in the system?
That would give us 100 percent statement coverage. It is possible to test less than that; people not doing white-box testing do it all the time—they just don't measure it. Remember, thorough black-box testing without any white-box measurement may total less than 30 percent statement coverage.

The next step up in control-flow testing is called decision (or branch) coverage. This is determined by the total number of decisions in the code that we have exercised, both ways. We will honor the ISTQB decision to treat branch and decision testing as synonyms. There are very slight differences between the two, but those differences are insignificant at the level we will examine them.8

Then we have condition coverage, where we ensure that we evaluate each condition that makes up a decision at least once, and multiple-condition coverage, where we test all possible combinations of outcomes for individual conditions inside all decisions. The ISTQB Advanced syllabus lists a level of coverage called condition determination (we will use the term decision/condition coverage), where we test all combinations of outcomes for individual conditions that can affect a decision outcome. Closely related to that is a mouthful we call multiple condition/decision coverage (also known as MC/DC).

We will add a level of coverage called loop coverage, not discussed by ISTQB, but we think it's interesting anyway. Then we will look at various path coverage testing schemes, including one called Linear Code Sequence and Jump (LCSAJ) coverage.

If this sounds complex, well, it is a bit. In the upcoming pages, we will explain what all of these terms and concepts mean. It is not as confusing as it might seem—we just need to take it one step at a time and all will be clear. Each technique essentially builds on the shortcomings of the previous technique.

4.3.1.1 Building Control-Flow Graphs

Before we can discuss control-flow testing, we must define how to create a control-flow graph. In the next few paragraphs, we will discuss the individual pieces that make up control-flow graphs.

8. The United States Federal Aviation Administration makes a distinction between branch coverage and decision coverage, with branch coverage deemed weaker. If you are interested in this distinction, see Software Verification Tools Assessment Study, FAA, June 2007.
Figure 4–31 The process block

In figure 4-31, we see the process block. Graphically, it consists of a node (bubble or circle) with one path leading to it and one path leading from it. Essentially, this represents a chunk of code that executes sequentially—that is, no decisions are made inside of it. The flow of execution reaches the process block, executes through that block of code in exactly the same way each time, and then exits, going elsewhere.

This concept is essential to understanding control-flow testing. Decisions are the most important part of the control-flow concept; individual lines of code where no decisions are made do not affect the control flow and thus can be ignored. The process block has no decisions made inside it. Whether the process block has one line or a million lines of code, we only need one test to execute it completely. The first line of code executes, the second executes, the third... right up to the millionth line of code. There is no deviation no matter how many different test cases are run. Entry is at the top, exit is at the bottom, and every line of code executes every time.

Figure 4–32 The junction point

The second structure we need to discuss, seen in figure 4-32, is called a junction point. This structure may have any number of different paths leading into the node, with only one path leading out. No matter how many different paths we have through a module, eventually they must converge. Again, no decisions are made in this block; the indicated roads lead to it, with only one road out.

Figure 4–33 Two decision points
A decision point is very important; indeed, it's key to the concept of control flow. A decision point is represented as a node with one input and two or more possible outputs. Its name describes the action inside the node: a decision as to which way to go is made, and control flow continues out that path while ignoring all of the other possible choices. In figure 4-33, we see two decision points: one with two outputs, one with five outputs.

How is the choice of output path made? Each programming language has a number of different ways of making decisions. In each case, a logical decision, based on comparing specific values, is made. We will discuss these later; for now, it is sufficient to say a decision is made and control flow continues one way and not the others.

Note that these decision points force us to have multiple tests, at least one test for each way to make the decision differently, changing the way we traverse the code. In a very real way, it is the decisions that a computer can make that make it interesting, useful, and complex to test.

The next step is to combine these three relatively simple structures into useful control-flow graphs.

1   #include <stdio.h>
2   main ()
3   {
4     int i, n, f;
5     printf ("n = ");
6     scanf ("%d", &n);
7     if (n < 0) {
8       printf ("Invalid: %d\n", n);
9       n = -1;
10    } else {
11      f = 1;
12      for (i = 1; i <= n; i++) {
13        f *= i;
14      }
15      printf ("%d! = %d\n", n, f);
16    }
17    return n;
18  }

[Figure: alongside the listing, the corresponding control-flow graph—a process block for lines 4 through 6, a decision at line 7 (if) leading either to lines 8 and 9 or to line 11, a decision at line 12 (for) looping through line 13 or exiting to line 15, and the paths converging at line 17.]

Figure 4–34 A simple control-flow graph

In figure 4-34, we have a small module of code. Written in C, this module consists of a routine to calculate a factorial.
Just to refresh your memory, a factorial is the product of all positive integers less than or equal to n and is designated mathematically as n!. 0! is a special case that is explicitly defined to be equal to one. The following are the first three factorials:

1! = 1
2! = 1 * 2 ==> 2
3! = 1 * 2 * 3 ==> 6
etc.

To create a control-flow diagram from this code, we do the following:

1. The top process block contains up to and including line 6. Note that there are no decisions, so all lines go into the same process block. By combining multiple lines of code where there is no decision made into a single process block, we can simplify the graph. Note that we could conceivably have drawn a separate process block for each line of code.
2. At line 7 there is a reserved word in the C language: if. This denotes that the code is going to make a decision. Note that the decision can go one of two ways, but not both. If the decision resolves to TRUE, the left branch is taken and lines 8 and 9 are executed, and then the thread jumps to line 17.
3. On the other hand, if the decision resolves to FALSE, the thread jumps to line 10 to execute the else clause.
4. Line 11 sets a value and then goes to line 12, where another decision is made. This is the reserved word for, which means we may or may not loop.
5. At line 12, a decision is made in the for loop using the second phrase in the statement (i <= n). If this evaluates to TRUE, the loop will fire, causing the code to go to line 13, where it calculates the value of f and then goes right back to the loop at line 12.
6. If the for loop evaluates to FALSE, the thread goes to line 15 and thence to line 17.
7. Once the thread gets to line 17, the function ends at line 18.

To review the control-flow graph: There are six process blocks, two decision blocks (lines 7 and 12), and two junction points (lines 12 and 17).
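For readers who prefer data to pictures, the same graph can be written down as an edge list. The fragment below is our own transcription of figure 4-34 into C—not anything a tool produces—naming each node by the code lines it represents; deriving tests is then a matter of finding paths from the entry node to the exit node.

    #include <stdio.h>

    /* Our transcription of the figure 4-34 control-flow graph: each edge
       connects nodes named after the code lines they represent. */
    typedef struct { const char *from, *to; } edge_t;

    static const edge_t cfg[] = {
        { "lines 4-6",     "line 7 (if)"   },  /* sequential process block  */
        { "line 7 (if)",   "lines 8-9"     },  /* n < 0 is TRUE             */
        { "line 7 (if)",   "line 11"       },  /* n < 0 is FALSE (else)     */
        { "lines 8-9",     "line 17"       },  /* jump past the else clause */
        { "line 11",       "line 12 (for)" },
        { "line 12 (for)", "line 13"       },  /* i <= n: loop body fires   */
        { "line 13",       "line 12 (for)" },  /* back to the loop test     */
        { "line 12 (for)", "line 15"       },  /* i > n: loop exits         */
        { "line 15",       "line 17"       },
    };

    int main(void) {
        /* Each edge is a transition a test's thread of execution can take. */
        for (size_t i = 0; i < sizeof cfg / sizeof cfg[0]; i++)
            printf("%s -> %s\n", cfg[i].from, cfg[i].to);
        return 0;
    }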
4.3.1.2 Statement Coverage

ISTQB Glossary

statement testing: A white-box test design technique in which test cases are designed to execute statements.

Now that we have discussed control-flow graphing, let's take a look at the first level of coverage we mentioned earlier: statement coverage (also called instruction or code coverage). The concept is relatively easy to understand: executable statements are the basis for test design selection. To achieve statement coverage, we pick test data that force the thread of execution to go through each line of code that the system contains.

To calculate the current level of coverage that we have attained, we divide the number of code statements that are executed by the number of code statements in the entire system; if the quotient is equal to one, we have statement coverage. The bug hypothesis is pretty much as expected: bugs can lurk in code that has not been executed.

Statement-level coverage is considered the least effective of all control-flow techniques. To reach this modest level of coverage, a tester must come up with enough test cases to force every line to execute at least once. While this does not sound too onerous, it must be done purposefully. One good practice for an advanced technical test analyst is to make sure that the developers they work with are familiar with the concepts we are discussing here. Achieving (at least) statement coverage should be done while unit testing.

There is incontrovertibly no guarantee that even college-educated developers learned how important this coverage level is. Jamie has an undergraduate computer science major and a master's degree in computer science from reputable colleges and yet was never exposed to any of this information until after graduation, when he started reading test books and attending conferences. Rex can attest that the concepts of white-box coverage—indeed, test coverage in general—were not discussed when he got his degree in computer science and engineering at UCLA.

IEEE, in the standard ANSI 87B (1987), stated that statement coverage is the minimum level of coverage that should be acceptable. Boris Beizer, in his seminal work, Software Testing Techniques, has a slightly more inflammatory take on it: "...testing less than this for new software is unconscionable and should be criminalized." Also from that book, Beizer lays out some rules of common sense:
1. Not testing a piece of code leaves a residue of bugs in the program in proportion to the size of the untested code and the probability of bugs.
2. The high-probability paths are always thoroughly tested if only to demonstrate that the system works properly. If you have to leave some code untested at the unit level, it is more rational to leave the normal, high-probability paths untested, because someone else is sure to exercise them during integration testing or system testing.
3. Logic errors and fuzzy thinking are inversely proportional to the probability of the path's execution.
4. The subjective probability of executing a path as seen by the routine's designer and its objective execution probability are far apart. Only analysis can reveal the probability of a path, and most programmers' intuition with regard to path probabilities is miserable.
5. The subjective evaluation of the importance of a code segment as judged by its programmer is biased by aesthetic sense, ego, and familiarity. Elegant code might be heavily tested to demonstrate its elegance or to defend the concept, whereas straightforward code might be given cursory testing because "How could anything go wrong with that?"

Have we convinced you that this is important? Let's go and look at how to do it.

[Figure: the factorial code and control-flow graph of figure 4-34, repeated, with the path for n < 0 marked by gray arrows and the path for n > 0 marked by black arrows.]

Figure 4–35 Statement coverage example
Looking at the same code and control-flow graph we looked at before, figure 4-35 shows the execution of our factorial function when an input value less than zero is tested. The gray arrows show the path taken through the code with this test. Note that we have covered some of the code, but not all. Based on the graph, we still have a way to go to reach this minimum coverage.

Following the black arrows, we see the result of testing with a value greater than 0. At this point, the if in line 7 resolves to FALSE and so we go down the previously untested branch. Because the input value is at least one, the loop will fire at least once (more times if the input is greater than one). When i increments to larger than n, the loop will stop and the thread of execution will fall through to line 15, line 17, and out.

Note that you do not really need the graph to calculate this coverage; you could simply look at the code and see the same thing. However, for many people, it is much easier to read the graph than the code.

At this point, we have achieved statement coverage. If you are unsure, go back and look at figure 4-35 again. Every node (and line of code) has been covered by at least one arrow. We have run two test cases (n<0, n>0).

At this point, we might ask ourselves if we are done testing. If we were to do more testing than we need, we would be wasting resources and time. While it might be wonderful to always test enough that no more bugs are possible, it would (even if it were possible) make software so expensive that no one could afford it. Tools are available to determine whether your testing has achieved statement coverage; we will discuss those in chapter 9.

The real question we should ask is, "Did we test enough for the context of our project?" And the answer, of course, is, "It depends on the project." Doing less testing than necessary will increase the risk of really bad things happening to the system in production. Every test group walks a balance beam in making the "do we need more testing?" decision.

Maybe we need to see where more coverage might come from and why we might need it. Maybe we can do a little more testing and get a lot better coverage. Figure 4-36 shows a really simple piece of code and a matching control-flow graph. Test case 1 is represented by the gray arrows; the result of running this single test with the given values is full 100 percent statement coverage.
Notice that on line 2 we have an if statement that compares a to b. Since the input values (a = 3, b = 2), when plugged into this decision, evaluate to TRUE, the line z = 12 will be executed, and the code will fall out of the if statement to line 4, where the final line of code will execute and set the variable Rep to 6 by dividing 72 by 12.

1  z = 0;
2  if (a > b) then
3    z = 12;
4  Rep = 72/z;

Test 1 (gray arrows): a = 3, b = 2, Rep = 6
Test 2 (black arrows): a = 2, b = 3, Rep = ?

Figure 4–36 Where statement coverage fails

A few moments ago, we asked if statement coverage was enough testing. Here, we can see a potential problem with stopping at only statement coverage. Notice that, if we pass in the values shown as test 2 (a = 2, b = 3), we have a different decision made at the conditional on line 2. In this case, a is not greater than b (i.e., the decision resolves to FALSE) and line 3 is not executed. No big deal, right? We still fall out of the if to line 4. There, the calculation 72/z is performed, exactly the way we would expect. However, there is a nasty little surprise at this point. Since z was not set to 12 inside the branch, it remained set to 0. Therefore, expanding the calculation performed on line 4, we have 72 divided by 0.

Oops!

You might remember from elementary school that division by 0 is not defined. Seriously, in our mathematics, it is simply not allowed. So what should the computer do at this point? If there is an exception handler somewhere in the execution stack, it should fire, unwinding all of the calculations that were made since it was set. If there is no exception handler, the system may crash—hard! The processor that our system is running on likely has a "hard stop" built into its microcode for this purpose. In reality, most operating systems are not going to let the CPU crash with a divide-by-zero failure—but the OS will likely make the offending application disappear like a bad dream.
"But," you might say, "we had statement coverage when we tested! This just isn't fair!" As we noted earlier, statement coverage by itself is a bare minimum coverage level that is almost guaranteed to miss defects. We need to look at another, higher level of coverage that can catch this particular kind of defect.

ISTQB Glossary
branch testing: A white-box test design technique in which test cases are designed to execute branches. ISTQB deems this identical to decision testing.
decision testing: A white-box test design technique in which test cases are designed to execute decision outcomes. ISTQB deems this identical to branch testing.

4.3.1.3 Decision Coverage

The next strongest level of structural coverage is called decision (or branch) coverage. Rather than looking at individual statements, this level of coverage looks at the decisions themselves. Every decision has the possibility of being resolved as either TRUE or FALSE. No other possibilities: binary results, TRUE or FALSE.

For those who point out that the switch statement can make more than two decisions: well, conceptually that seems to be true. The switch statement is a complex set of decisions, often built as a table by the compiler. The generated code, however, is really just a number of binary compares that continue sequentially until either a match is found or the default condition is reached. Each atomic decision in the switch statement is still a comparison between two values that evaluates either TRUE or FALSE. (The sketch that follows makes this concrete.)

To get to the decision level of coverage, every decision made by the code must be tested both ways, TRUE and FALSE. That means—at minimum—two test cases must be run: one with data that cause the evaluation to resolve TRUE and a separate test case where the data cause the decision to resolve FALSE. If you omit one test or the other, then you do not achieve decision coverage.
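As an illustration of the switch point above, here is a small sketch of our own (the function and values are invented for the example). The second function shows the chain of binary decisions the first one amounts to; each comparison is an atomic TRUE/FALSE decision for coverage purposes.

   /* A three-way switch plus default... */
   int price_for_grade(char grade)
   {
      switch (grade) {
      case 'A': return 30;
      case 'B': return 20;
      case 'C': return 10;
      default:  return 0;
      }
   }

   /* ...is, in effect, sequential binary compares. */
   int price_for_grade_expanded(char grade)
   {
      if (grade == 'A') return 30;   /* decision 1: TRUE or FALSE */
      if (grade == 'B') return 20;   /* decision 2 */
      if (grade == 'C') return 10;   /* decision 3 */
      return 0;                      /* default */
   }

Decision coverage of the switch therefore requires each case to be both taken and not taken: here, tests with 'A', 'B', 'C', and one non-matching grade.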
Our bug hypothesis was proved out in figure 4-36. An untested branch may leave a landmine in the code that can cause a failure even though every line was executed at least once. The example we went over may seem too simplistic to be true, but this is exactly what often happens. The failure to set (or reset) a value in the conditional causes a problem later on in the code.

Note that, if we execute each decision both ways, giving us decision coverage, we are guaranteed statement coverage in the same code. Therefore, decision coverage is said to be stronger than statement coverage. Statement coverage, as the weakest coverage, does not guarantee anything beyond each statement being executed at least once.

As before, we could calculate the exact level of decision coverage by dividing the number of decision outcomes tested by the total number of decision outcomes in the code. For this book, we are going to speak as if we always want to achieve full—that is, 100 percent—decision coverage. In real life, your mileage might vary. There are tools available to measure the extent of decision coverage.

Having defined this higher level of coverage, let's dig into it.

The ability to take one path rather than another is what really makes a computer powerful. Each computer program likely makes millions of decisions a minute. But exactly what is a decision?

As noted earlier, each decision eventually must resolve to one of two values, TRUE or FALSE. As seen in our code, this might be a really simple expression: a is greater than b, or n is less than zero. However, we often need to consider more complex expressions in a decision. In fact, a decision can be arbitrarily complex, as long as it eventually resolves to either TRUE or FALSE. The following is a legal expression that resolves to a single Boolean value:

   (a>b) || (x+y==-1) && ((d) != TRUE)

This expression has three sub-expressions in it, separated by Boolean operators. First, we evaluate the sub-expression that determines whether the sum of x plus y is equal to -1. That will be either TRUE or FALSE. We then AND that to the third sub-expression, which calculates whether d is FALSE or not. The result of that calculation is then ORed to the TRUE or FALSE result of testing whether a is greater than b. If that order of execution is not intuitive, it is based on the rules that govern the relative order of expression evaluation: ANDs are evaluated before ORs, much the same way multiplications are evaluated before additions in arithmetic.
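The grouping can be checked directly in C, where && binds more tightly than ||. This small sketch (our own example, with arbitrary values) prints both the raw expression and an explicitly parenthesized version to show they always agree:

   #include <stdio.h>
   #include <stdbool.h>

   int main(void)
   {
      /* Try one combination; any values of a, b, x, y, d will do. */
      int a = 5, b = 3, x = 0, y = -1;
      bool d = false;

      bool raw     = (a > b) || (x + y == -1) && ((d) != true);
      bool grouped = (a > b) || ((x + y == -1) && ((d) != true));

      printf("raw=%d grouped=%d\n", raw, grouped);  /* always identical */
      return 0;
   }

Because && is evaluated before ||, the middle and right sub-expressions form a unit that is then ORed with (a > b)—exactly the order described above.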
While this expression appears to be ugly, the compiler will evaluate the entire predicate and inform us whether it is a legal expression or not. This actually points out something every developer should do: Write code that is understandable! Frankly, we would not insult the term understandable by claiming this expression is!

There are a variety of different decisions that a programming language can make.

Table 4–28

   Either/or decisions:
      if (expr)
         {}
      else
         {}

      switch (expr) {
         case const_1: {} break;
         case const_2: {} break;
         case const_3: {} break;
         case const_4: {} break;
         default: {}
      }

   Loop decisions:
      while (expr)
         {}

      do
         {}
      while (expr)

      for (expr_1; expr_2; expr_3)
         {}

Here, in table 4-28, you can see a short list of the decision constructs that the popular language C uses. Other popular languages use similar constructs. The commonality between all of these (and all of the other decision constructs in all of the other languages) is that each makes a decision that can go only two ways: TRUE or FALSE.

Looking back at our original example in figure 4-35, with just two test cases we do not achieve decision coverage, even though we do attain statement coverage. We never exercised the case where the for loop does not loop at all. Remember, a for loop evaluates an expression and decides whether to loop or not to loop based on the result. In order to get decision coverage, we need to test with the value 0 as input, as shown in figure 4-37. When 0 is entered, the first decision evaluates to FALSE, so we take the else path. At line 12, the predicate (1 less than or equal to 0) evaluates to FALSE, so the loop is not taken. Recall that earlier, we had tested with a value greater than 0, which did cause the loop to execute. Now we have achieved decision coverage; it took three test cases.
   1   #include <stdio.h>
   2   main ()
   3   {
   4      int i, n, f;
   5      printf("n = ");
   6      scanf("%d", &n);
   7      if (n < 0) {
   8         printf("Invalid: %d\n", n);
   9         n = -1;
  10      } else {
  11         f = 1;
  12         for (i = 1; i <= n; i++) {
  13            f *= i;
  14         }
  15         printf("%d! = %d\n", n, f);
  16      }
  17      return n;
  18   }

   Test values: n == 0 (white arrows)

Figure 4–37  Decision coverage example

That brings us back to the same old question. Having tested to the decision level of coverage, have we done enough testing?

Consider the loop itself. We executed it zero times when we input 0. We executed it an indeterminate number of times when we input a value greater than zero. Is that sufficient testing? Well, not surprisingly, there is another level of coverage called loop coverage. We will look at that next.

4.3.1.4 Loop Coverage

While not strictly discussed in the ISTQB Advanced syllabus, loop testing should be considered an important structural test technique. If we wanted to completely test a loop, we would need to test it zero times (i.e., it did not loop), one time, two times, three times, all the way up to n times, where n is the maximum number of times it will ever loop. Bring your lunch; it might be an infinite wait!

Like other exhaustive testing techniques, full loop testing is not a reasonable amount of coverage. Different theories have been offered that try to prescribe how much testing we should give a loop. The basic minimum tends to be two tests: zero times through the loop and one time through the loop. Others add a third test, multiple times through the loop, although they do not specify how many times. We prefer a stronger standard; we suggest that you test the loop zero times and one time and then test the maximum number of times it is expected to cycle (if you know how many times that is likely to be). In a few moments, we will discuss Boris Beizer's standard, which is even more stringent.

We should be clear about loop testing. Some loops could execute an infinite number of times; each time through the loop creates one or more extra paths that could be tested. In the Foundation syllabus, a basic principle of testing was stated: "Exhaustive testing is impossible." Loops, and the possible infinite variations they can lead to, are the main reason exhaustive testing is impossible. Since we cannot test all of the ways loops will execute, we want to be pragmatic here, coming up with a level of coverage that gives us the most information with the fewest tests. Our bug hypothesis is pretty clear: failing to test the loop (especially when it does not loop even once) may cause bugs to be shipped to production.

   Test values: n = 0 (white arrows), n = 1 (black arrows)

Figure 4–38  Loop coverage 0 and 1 times through (the listing repeats the factorial code of figure 4-37)

By entering a 1 (following the black arrows), we get the loop to execute just once. Upon entering line 12, the condition evaluates to (1 is less than or equal to 1), or TRUE. The loop executes and i is incremented. At this point, the condition is evaluated again, (2 is less than or equal to 1), or FALSE. This causes us to drop out of the loop.
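A minimal harness for these two loop tests might look like the sketch below. The scaffolding is our own; the loop logic is lifted from the figure and wrapped in a function so it can be called repeatedly.

   #include <stdio.h>

   /* Factorial loop from figures 4-37/4-38, extracted for testability. */
   static int factorial(int n)
   {
      int i, f = 1;
      for (i = 1; i <= n; i++)
         f *= i;
      return f;
   }

   int main(void)
   {
      /* Loop executes zero times: (1 <= 0) is FALSE immediately.      */
      printf("0! = %d (expect 1)\n", factorial(0));
      /* Loop executes exactly once: (1 <= 1) TRUE, then (2 <= 1) FALSE. */
      printf("1! = %d (expect 1)\n", factorial(1));
      return 0;
   }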
Zero and one time through the loop are relatively easy to come up with. How about the maximum times through? Each loop is likely to have a different way of figuring that out; for some loops, it will be impossible to determine.

In this code, we have a monotonically increasing value, which makes it easy to calculate the greatest number of loops possible. The maximum size of the data type used in the collector variable, f, will allow us to calculate the maximum number of iterations. We need to compare the size of the calculated factorial against the maximum integer size (the actual data type would have to be confirmed in the code). In table 4-29, we show a table of factorial values.

Table 4–29

    n     n!
    0     1
    1     1
    2     2
    3     6
    4     24
    5     120
    6     720
    7     5,040
    8     40,320
    9     362,880
   10     3,628,800
   11     39,916,800
   12     479,001,600
   13     6,227,020,800
   14     87,178,291,200
   15     1,307,674,368,000

Assuming a signed 32-bit integer is being used to hold the calculation, the maximum value that can be stored is 2,147,483,647. An input of 12 should give us a value of 479,001,600. An input value of 13 would cause an overflow (6,227,020,800). If the programmer used an unsigned 32-bit integer with a maximum size of 4,294,967,295, notice that the same number of iterations would be allowed; an input of 13 would still cause an overflow.

In figure 4-39, we would go ahead and test the input of 12 and check the expected output of the function.

   Test values: n = 12 (black arrows), loops 12 times
                n = 13 (not shown), overflow failure

Figure 4–39  Loop coverage max times through (the listing repeats the factorial code of figure 4-37)
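The cutoff in table 4-29 can also be found mechanically. This sketch (our own) multiplies up factorials in a 64-bit accumulator and reports the largest n whose factorial still fits in a signed 32-bit int:

   #include <stdio.h>
   #include <stdint.h>

   int main(void)
   {
      int64_t f = 1;
      int n = 0;
      /* Grow n! in 64 bits until it no longer fits a signed 32-bit int. */
      while (f * (n + 1) <= INT32_MAX) {
         n++;
         f *= n;
      }
      printf("largest safe n = %d (n! = %lld)\n", n, (long long)f);  /* 12 */
      return 0;
   }

The sketch confirms 12 as the last safe input; 13 is the natural probe for the overflow behavior discussed next.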
It should be noted that our rule for loop coverage does not deal comprehensively with negative testing. Remember that negative testing is checking invalid values to make sure the failure they cause is graceful, giving us a meaningful error message and allowing us to recover from the error. We probably want to test the value 13 to make sure the overflow is handled gracefully. This is consistent with the concept of boundary value testing discussed in the previous section.

Boris Beizer, in his book Software Testing Techniques, had suggestions for how to test loops extensively. Note that he is essentially covering the three-point boundary values of the loop variable, with a few extra tests thrown in.

1. If possible, test a value that is one less than the expected minimum value the loop can take. For example, if we expect to loop with the control variable going from 0 to 100, try -1 and see what happens.
2. Try the minimum number of iterations—usually zero iterations. Occasionally, there will be a positive number as the minimum.
3. Try one more than the minimum number.
4. Try to loop once (this test case may be redundant; if so, it should be omitted).
5. Try to loop twice (this might also be redundant).
6. Try to test a typical value. Beizer always believes in testing what he often calls a nominal value. Note that the number of loops, from one to max, is actually an equivalence set. The nominal value is often an extra test that we tend to leave out.
7. Try to test one less than the maximum value.
8. Try to test the maximum number of loops.
9. Try to test one more than the maximum value.

The most important differentiation between Beizer's guidelines and the loop coverage described earlier is that he advocates negative testing of loops. Frankly, it is hard to argue against this thoroughness. Time and resources, of course, are always factors that we must consider when testing. We would argue that his guidelines are useful and should be considered when testing mission-critical or safety-critical software.

Finally, one of the banes of testing loops is when they are nested inside each other. We will end this topic with Beizer's advice for reducing the number of tests when dealing with nested loops.

1. Starting at the innermost loop, set all outer loops to their minimum iteration setting.
2. Test the boundary values for the innermost loop as shown previously.
3. If you have done the outermost loop already, go to step 5.
4. Continue outward, one loop at a time, until you have tested all loops.
5. Test the boundaries of all loops simultaneously. That is, set all to 0 loops, 1 loop, maximum loops, and 1 more than maximum loops.

Beizer points out that practicality may not allow hitting each one of these values at the same time for step 5. As guidelines go, however, these will ensure pretty good testing of looping structures.

4.3.1.5 Hexadecimal Converter Exercise

In figure 4-40, you'll find a C program that accepts a string with hexadecimal characters (among other, unwanted characters). It ignores the other characters and converts the hexadecimal characters to a numeric representation. If a Ctrl-C is input, the last digit that was converted is removed from the buffer.

If you test with the input strings "24ABd690BBCcc" and "ABCdef1234567890", what level of coverage will you achieve?
What input strings could you add to achieve statement and branch coverage? Would those be sufficient for testing this program?

The answers are shown in the next section.

Figure 4–40  Hexadecimal converter code
ISTQB Glossary
condition testing: A white-box test design technique in which test cases are designed to execute condition outcomes.

4.3.1.6 Hexadecimal Converter Exercise Debrief

The strings "24ABd690BBCcc" and "ABCdef1234567890" do not achieve any specified level of coverage that we have discussed in this book.

To get statement and decision coverage, you would need to add the following:

■ At least one string containing a signal (Ctrl-C) to execute the signal() and pophdigit() code
■ At least one string containing a non-hex digit

In order to get loop coverage, you would have to add the following:

■ At least one empty string to cause the while loop on line 17 not to loop
■ One test that inputs the maximum number of hex digits that can be stored (based on the data type of nhex)

There are also some interesting test cases based on placement that we would likely run. We would expect at least one of these to find a bug (hint, hint):

■ A string containing only a signal (Ctrl-C)
■ A string containing a signal at the first position
■ A string with more hex digits than can be stored
■ A string with more signals than hex digits

4.3.1.7 Condition Coverage

So far we have looked at statement coverage (have we exercised every line of code?), decision coverage (have we exercised each decision both ways?), and loop coverage (have we exercised the loop enough?).

As Ron Popeil, the purveyor of many a fancy gadget in all-night infomercials, used to say, "But wait! There's more!"

While we have exercised the decision both ways, we have yet to discuss how the decision is made. Earlier, when we discussed decisions, we saw that they could be simple—such as a is greater than b—or arbitrarily complex, as follows:

   (a>b) || (x+y==-1) && ((d) != TRUE)
If we input data to force the entire condition to TRUE, then we go one way; force it FALSE and we go the other way. At this point we have achieved decision coverage. But is that enough testing? Could there be bugs hiding in the condition itself? The answer, of course, is a resounding yes! And, if we know there might be bugs there, we might want to test more.

Our next level of coverage is called condition coverage. The basic concept is that, when a decision is made by a complex expression that eventually evaluates to TRUE or FALSE, we want to make sure that each atomic condition is tested both ways, TRUE and FALSE. An atomic condition is defined as "the simplest form of code that can result in a TRUE or FALSE outcome."9

Our bug hypothesis is that defects may lurk in untested atomic conditions, even though the full decision has been tested both ways. As always, test data must be selected to ensure that each atomic condition actually is forced TRUE and FALSE at one time or another. Clearly, the more complex a decision expression is, the more test cases we will need to execute to achieve this level of coverage.

Once again, we could evaluate this coverage for values less than 100 percent by dividing the number of Boolean operand values executed by the total number of Boolean operand values there are. But we won't. Condition coverage, for this book, will only be discussed as 100 percent coverage.

Note an interesting corollary to this coverage. Decision and condition coverage will always be exactly the same when all decisions are made by simple atomic expressions. For example, for the conditional if (a == 1) {}, condition coverage will be identical to decision coverage.

Here, we'll look at some examples of complex expressions, all of which evaluate to TRUE or FALSE.

   x

The first, shown above, has a single Boolean variable, x, which might evaluate to TRUE or FALSE. Note that in some languages, it does not even have to be a Boolean variable. For example, if this were the C language, it could be an integer, as any non-zero value constitutes a TRUE value and zero is deemed to be FALSE. The important thing to notice is that it is an atomic condition and it is also the entire expression.

   D && F

The second, shown above, has two atomic conditions, D and F, which are combined by the AND operator to determine the value of the whole expression.

   (A || B) && (C == D)

The third is a little tricky. In this case, A and B are both atomic conditions that are combined by the OR operator to calculate a value for the sub-expression (A || B). Because A and B are both atomic conditions, the sub-expression (A || B) cannot be an atomic condition. However, (C == D) is an atomic condition, as it cannot be broken down any further. That makes a total of three atomic conditions.

   (a>b)||(x+y==-1)&&((d)!=TRUE)

In the last complex expression, shown above, there are again three atomic conditions. The first, (a>b), is an atomic condition that cannot be broken down further. The second, (x+y == -1), is an atomic condition following the same rule. In the last sub-expression, (d!=TRUE) is the atomic condition. Just for the record, if we were to see the last expression in actual code during a code review, we would jump all over it with both feet. Unfortunately, it is an example of the way some people program.

In each of the preceding examples, we would need to come up with sufficient test cases that each of these atomic conditions is tested where it evaluates both TRUE and FALSE. Should we test to that extent, we would achieve 100 percent condition coverage. That means the following (the sketch after this list pins the last three down with concrete values):

■ x, D, F, A, and B would each need a test case where it evaluated to TRUE and one where it evaluated to FALSE.
■ (C==D) needs two test cases.
■ (a>b) needs two test cases.
■ (x+y==-1) needs two test cases.
■ ((d)!=TRUE) needs two test cases.

9. Definition from The Software Test Engineer's Handbook, Bath and McKay
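For the last expression, two well-chosen test vectors suffice for condition coverage, since all three atomics can be flipped together. The values below are our own illustration; each atomic is evaluated into its own variable so the data, not C's evaluation rules, determine the outcomes we record:

   #include <stdio.h>
   #include <stdbool.h>

   static void run(int a, int b, int x, int y, bool d)
   {
      bool c1 = (a > b);            /* atomic 1 */
      bool c2 = (x + y == -1);      /* atomic 2 */
      bool c3 = ((d) != true);      /* atomic 3 */
      printf("atomics: %d %d %d -> decision: %d\n",
             c1, c2, c3, c1 || c2 && c3);
   }

   int main(void)
   {
      run(5, 3, 0, -1, false);   /* vector 1: all three atomics TRUE  */
      run(3, 5, 0,  0, true);    /* vector 2: all three atomics FALSE */
      return 0;
   }

Note that these two vectors happen to give decision coverage as well; as the next section shows, that is not guaranteed in general.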
ISTQB Glossary
condition determination testing: A white-box test design technique in which test cases are designed to execute single condition outcomes that independently affect a decision outcome.

Surely that must be enough testing.

Maybe, maybe not. Consider the following pseudo code:

   if (A && B) then
      {Do something}
   else
      {Do something else}

To achieve condition coverage, we need to ensure that each atomic condition goes both TRUE and FALSE in at least one test case each.

   Test 1: A == FALSE, B == TRUE    resolves to FALSE
   Test 2: A == TRUE, B == FALSE    resolves to FALSE

Assume that our first test case has the values input to make A equal to FALSE and B equal to TRUE. That makes the entire expression evaluate to FALSE, so we execute the else clause. For our second test, we reverse that so A is set to TRUE and B is set to FALSE. That evaluates to FALSE, so we again execute the else clause.

Do we now have condition coverage? A was set both TRUE and FALSE, as was B. Sure, we have achieved condition coverage. But ask yourself, do we have decision coverage? The answer is no! At no time, in either test case, did we force the decision to resolve to TRUE. Condition coverage does not automatically guarantee decision coverage. We will look at two more levels of coverage that give stronger coverage than simple condition coverage—at the cost of more testing, of course.

4.3.1.8 Decision/Condition Coverage

The first of these we will call decision/condition coverage. In the ISTQB Advanced syllabus, this is called condition determination coverage. This level of coverage is just a combination of decision and condition coverage (to solve the shortcoming of condition-only coverage pointed out in the last section). The concept is that we need to achieve condition-level coverage, where each atomic condition is tested both ways, TRUE and FALSE, and we also need to achieve decision coverage by assuring that the overall expression is tested both ways, TRUE and FALSE.

To test to this level of coverage, we must assure that we have condition coverage and then make sure we evaluate the full decision both ways. The bug hypothesis should be clear from the preceding discussion; not testing for decision coverage, even if we have condition coverage, may allow bugs to remain in the code.

Decision/condition coverage guarantees condition coverage, decision coverage, and statement coverage.

Going back to the previous example, where we had condition coverage but not decision coverage, we already had two test cases: one where A is set to FALSE and B is set to TRUE, and one where A is set to TRUE and B is set to FALSE. By adding a third test case where both A and B are set to TRUE, we now force the expression to evaluate to TRUE, so we have decision coverage also.

Whew! Finally, with all of these techniques, we have done enough testing. Right? Well, maybe.

4.3.1.9 Modified Condition/Decision Coverage (MC/DC)

There is yet a stronger level of coverage that we must discuss. This one is called modified condition/decision coverage (usually abbreviated MC/DC).

This level of coverage is considered stronger because we add another factor to what we were already testing in decision/condition coverage. Like decision/condition coverage, MC/DC requires that each atomic condition be tested both ways and that decision coverage be satisfied. It then adds one more factor: each condition must affect the outcome decision independently while the other atomic conditions are held fixed.

To get to this level of coverage, we determine the test cases needed for decision/condition coverage and then evaluate our tests to see whether each atomic factor had the possibility of affecting the outcome decision independently while the other atomic condition values were held steady. Our bug hypothesis states that we might find an example of a bug hiding in that last little space that we have not tested.

Many references use the term MC/DC as a synonym for exhaustive testing. However, while achieving MC/DC will force a lot of testing, we do not feel it rises to the level of exhaustive testing, which would include running every loop every possible number of times, etc.

MC/DC coverage may not be useful to all testers. It was originally created at Boeing for use when specific languages were to be used in safety-critical programming. It is the required level of coverage under FAA DO-178B when the software is judged to be Level A (possible catastrophic consequences in case of failure). We discussed this standard in chapter 2.

Testing for MC/DC coverage can be complicated. There are two issues that must be discussed: short circuiting and multiple occurrences of a condition. First, we will look at short circuiting. What does short circuiting mean? Some programming languages are defined in such a way that Boolean expressions may be resolved without going through every sub-expression, depending on the Boolean operator that is used.

Table 4–30

   Value of A   Value of B   A||B
   T            T            T
   T            F            T
   F            T            T
   F            F            F

Consider what happens when a runtime system that has been built using a short-circuiting compiler resolves the Boolean expression (A || B) (see table 4-30). If A by itself evaluates to TRUE, then it does not matter what the value of B is. At execution time, when the runtime system sees that A is TRUE, it does not even bother to evaluate B. This saves execution time. C, C++, and Java are examples of languages exhibiting this behavior; their && and || operators are defined to short-circuit. Pascal does not short-circuit, but Delphi has a compiler option to allow short circuiting if the programmer wants it.

Note that both OR and AND short-circuit, in different ways. Consider the expression (A || B). When the Boolean operator is an OR, it will short-circuit when the first term is TRUE, since the second term does not matter. If the Boolean operator were an AND (&&), it would short-circuit when the first term was FALSE, since no value of the second term could prevent the expression from evaluating to FALSE.
There may be a very serious bug issue with short circuiting. If an atomic condition is supplied by a called function and it is short-circuited out of evaluation, then any side effects that might have occurred through its execution are lost. Oops!

For example, assume that instead of

   (A || B)

the actual expression is

   (A || funcB())

Further suppose that a side effect of funcB() is to initialize a data structure that is to be used later. In her code, the developer could easily assume that the data structure is already initialized, since she knew that the predicate had been used in a conditional. When she tries to use the uninitialized data structure, it causes a failure at runtime. To quote Robert Heinlein: for testers, "there ain't no such thing as a free lunch."

If B is never evaluated, the question then becomes, can we still achieve MC/DC coverage? The answer seems to be maybe. Our research shows that a number of organizations have weighed in on this subject in different ways; unfortunately, it is beyond the scope of this book to cover all of the different opinions. If your project is subject to regulatory statutes, and the development group is using a language or compiler that short-circuits, our advice is to research exactly how that regulatory body wants you to deal with the short circuiting.
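The lost-side-effect trap is easy to demonstrate. In this sketch (entirely our own; funcB() and the cache it initializes are hypothetical), the initialization silently disappears whenever the first operand is TRUE:

   #include <stdio.h>
   #include <stdbool.h>

   static int cache[4];
   static bool cache_ready = false;

   /* Hypothetical funcB(): returns a status AND initializes the cache. */
   static bool funcB(void)
   {
      for (int i = 0; i < 4; i++)
         cache[i] = i * i;
      cache_ready = true;
      return true;
   }

   int main(void)
   {
      bool A = true;           /* try false to see the other behavior */

      if (A || funcB())        /* A TRUE => funcB() never runs        */
         printf("condition held\n");

      /* Later code relying on the side effect: */
      if (!cache_ready)
         printf("BUG: cache was never initialized\n");
      return 0;
   }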
The second issue to be discussed is when there are multiple occurrences of a condition in an expression. Consider the following pseudo-code predicate:

   A || (!A && B)

In this example, A and !A are said to be coupled. They cannot be varied independently, as MC/DC coverage says they must. As with short circuiting, there are (at least) two approaches to this problem.

One approach is called Unique Cause MC/DC. In this approach, the term condition in the definition of MC/DC is deemed to mean uncoupled condition, and the question becomes moot.

The other approach is called Masking MC/DC; it permits more than one condition to vary at once, but the tester must analyze the logic of the predicate on a case-by-case basis to ensure that only one condition of interest influences the decision.

Once again, our suggestion is to follow whatever rules are imposed on you by the regulatory body that can prevent the release of your system.

In the interest of looking at an example that does not lure us deep into the weeds, let us assume we are using a compiler that does not short-circuit and we will not use any coupled conditions. How do we achieve MC/DC coverage? Consider the code snippet that follows:

   if ((A OR B) AND C) then...

We have three atomic conditions: A, B, and C. We can achieve decision/condition coverage by running two test cases, as shown:

1. A set to TRUE, B set to TRUE, C set to TRUE, where the expression evaluates to TRUE
2. A set to FALSE, B set to FALSE, C set to FALSE, where the expression evaluates to FALSE

Note that B never independently affects the outcome of the decision, since the value of A always overrides it. We can achieve the desired MC/DC coverage by changing the test inputs slightly and adding a single test.

1. A set to TRUE, B set to FALSE, C set to TRUE, which evaluates to TRUE
2. A set to FALSE, B set to TRUE, C set to TRUE, which evaluates to TRUE
3. A set to TRUE, B set to TRUE, C set to FALSE, which evaluates to FALSE

As you can see, each one of the atomic conditions changes the output evaluation at least once in the test set as long as the other conditions are held steady. If A changed value in test 1, the result would be inverted; likewise B in test 2 and C in test 3.

Wow! This is really down in the weeds. Should you test to the MC/DC level? Well, if your project needed to follow the standard FAA DO-178B and the particular software was of Level A criticality, the easy answer is yes, you would need to test to this level. Remember, Level A criticality means that, if the software were to fail, catastrophic results would likely ensue. If you did not test to this level, you would not be able to sell your software or incorporate it into any module for the project.

As always, context matters. The amount of testing should be commensurate with the amount of risk.

ISTQB Glossary
multiple condition testing: A white-box test design technique in which test cases are designed to execute combinations of single condition outcomes (within one statement).

4.3.1.10 Multiple Condition Coverage

In this section, we come to the natural culmination of all of the previously discussed control-flow coverage schemes. Each one gave us a little more coverage, often by adding more tests. Our final control-flow coverage level is called multiple condition coverage. The concept: test every possible combination of atomic conditions in a decision! This one is truly exhaustive testing as far as the combinations that atomic conditions can take on. Go through every possible permutation—and test them all.

Creating a set of tests does not take much analysis. Simply create a truth table with every combination of atomic conditions, TRUE and FALSE, and come up with values that test each one. The number of tests grows exponentially with the number of atomic conditions: 2^n tests, where n is the number of atomic conditions.

The bug hypothesis is really pretty simple also: bugs could be anywhere!

Once again, we could conceivably measure the amount of multiple condition coverage by calculating the theoretically possible number of permutations and dividing it into the number that we actually test. However, instead we will simply discuss the hypothetical perfect: 100 percent coverage. Using the same predicate we discussed earlier,

   if ((A OR B) AND C) then ...

we can devise the truth table shown in table 4-31. Since there are three atomic conditions, we would expect to see 2^3, or 8, unique rows in the table.
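Rather than writing the table by hand, it can be generated. This sketch (our own) enumerates all 2^3 combinations for the predicate and prints each row with its decision outcome:

   #include <stdio.h>
   #include <stdbool.h>

   int main(void)
   {
      printf("  A     B     C     (A OR B) AND C\n");
      /* Count from 0 to 7; each bit supplies one atomic condition. */
      for (int i = 0; i < 8; i++) {
         bool a = (i >> 2) & 1;
         bool b = (i >> 1) & 1;
         bool c = (i >> 0) & 1;
         printf("  %-5s %-5s %-5s %s\n",
                a ? "TRUE" : "FALSE",
                b ? "TRUE" : "FALSE",
                c ? "TRUE" : "FALSE",
                ((a || b) && c) ? "TRUE" : "FALSE");
      }
      return 0;
   }

The same loop scales to any number of atomic conditions by changing the bit count, which is exactly why the test count explodes as 2^n.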
Table 4–31

   A       B       C       (A OR B) AND C
   FALSE   FALSE   FALSE   FALSE
   TRUE    FALSE   FALSE   FALSE
   FALSE   TRUE    FALSE   FALSE
   TRUE    TRUE    FALSE   FALSE
   FALSE   FALSE   TRUE    FALSE
   TRUE    FALSE   TRUE    TRUE
   FALSE   TRUE    TRUE    TRUE
   TRUE    TRUE    TRUE    TRUE

Theoretically, we would create a unique test case for every row. Why theoretically? Well, we have the same issues that popped up earlier: the concepts of short circuiting and coupling. Depending on the order of the Boolean operators, the number of test cases that are interesting (i.e., achievable in a meaningful way) will differ when we have a short-circuiting compiler.

As a further example, assume that we have five atomic conditions: A, B, C, D, and E. Assuming that we have a compiler that does not short-circuit, we will have 32 test cases to run, as shown in table 4-32. No matter how they are interconnected logically, these will be our test data.

Table 4–32

   Test   A       B       C       D       E
   1      TRUE    TRUE    TRUE    TRUE    TRUE
   2      TRUE    TRUE    TRUE    TRUE    FALSE
   3      TRUE    TRUE    TRUE    FALSE   TRUE
   4      TRUE    TRUE    TRUE    FALSE   FALSE
   5      TRUE    TRUE    FALSE   TRUE    TRUE
   6      TRUE    TRUE    FALSE   TRUE    FALSE
   7      TRUE    TRUE    FALSE   FALSE   TRUE
   8      TRUE    TRUE    FALSE   FALSE   FALSE
   9      TRUE    FALSE   TRUE    TRUE    TRUE
   10     TRUE    FALSE   TRUE    TRUE    FALSE
   11     TRUE    FALSE   TRUE    FALSE   TRUE
   12     TRUE    FALSE   TRUE    FALSE   FALSE
   13     TRUE    FALSE   FALSE   TRUE    TRUE
   14     TRUE    FALSE   FALSE   TRUE    FALSE
   15     TRUE    FALSE   FALSE   FALSE   TRUE
   16     TRUE    FALSE   FALSE   FALSE   FALSE
   17     FALSE   TRUE    TRUE    TRUE    TRUE
   18     FALSE   TRUE    TRUE    TRUE    FALSE
   19     FALSE   TRUE    TRUE    FALSE   TRUE
   20     FALSE   TRUE    TRUE    FALSE   FALSE
   21     FALSE   TRUE    FALSE   TRUE    TRUE
   22     FALSE   TRUE    FALSE   TRUE    FALSE
   23     FALSE   TRUE    FALSE   FALSE   TRUE
   24     FALSE   TRUE    FALSE   FALSE   FALSE
   25     FALSE   FALSE   TRUE    TRUE    TRUE
   26     FALSE   FALSE   TRUE    TRUE    FALSE
   27     FALSE   FALSE   TRUE    FALSE   TRUE
   28     FALSE   FALSE   TRUE    FALSE   FALSE
   29     FALSE   FALSE   FALSE   TRUE    TRUE
   30     FALSE   FALSE   FALSE   TRUE    FALSE
   31     FALSE   FALSE   FALSE   FALSE   TRUE
   32     FALSE   FALSE   FALSE   FALSE   FALSE

Now, let's assume that we are using a compiler that does short-circuit predicate evaluation. In the first example, the same five atomic conditions are logically connected as follows:

   A && B && (C || (D && E))

The necessary test cases, taking short circuiting into consideration, are seen in table 4-33. Note that we started off with 32 different rows depicting all of the possible combinations that five atomic conditions could take, as shown in table 4-32.

Table 4–33

   Test   A       B       C       D       E       Decision
   1      FALSE   -       -       -       -       FALSE
   2      TRUE    FALSE   -       -       -       FALSE
   3      TRUE    TRUE    FALSE   FALSE   -       FALSE
   4      TRUE    TRUE    FALSE   TRUE    FALSE   FALSE
   5      TRUE    TRUE    FALSE   TRUE    TRUE    TRUE
   6      TRUE    TRUE    TRUE    -       -       TRUE

The first thing we see is that 16 of those rows start with A set to FALSE. Because of the AND operator, if A is FALSE, the other four terms are all short-circuited out and never evaluated. We must test this once, in test 1, but the other 15 rows (18–32) are dropped as redundant testing. Likewise, if B is set to FALSE, even with A TRUE, the other terms are short-circuited and dropped. Since there are eight rows where this occurs (9–16), seven of those are dropped and we are left with test 2. In test 3, if D is set to FALSE, then E is never evaluated. In tests 4 and 5, each term matters and hence is included. In test 6, once C evaluates to TRUE, D and E no longer matter and are short-circuited. Out of the original possible 32 tests, only 6 of them are interesting.

For our second example, we have the same five atomic conditions arranged in a different logical configuration as follows:

   ((A || B) && (C || D)) && E

In this case, we again have five different atomic conditions, and so we start with the same 32 possible rows, as shown in table 4-32. Notice that anytime E is FALSE, the expression is going to evaluate to FALSE. Since there are 16 different rows where E is FALSE, you might think that we would immediately get rid of 15 of them. While a smart compiler may be able to figure that out, most won't, since the short circuiting normally goes from left to right. This predicate results in a different number of tests, as seen in table 4-34.

Table 4–34

   Test   A       B       C       D       E       Decision
   1      FALSE   FALSE   -       -       -       FALSE
   2      FALSE   TRUE    FALSE   FALSE   -       FALSE
   3      FALSE   TRUE    FALSE   TRUE    FALSE   FALSE
   4      FALSE   TRUE    FALSE   TRUE    TRUE    TRUE
   5      FALSE   TRUE    TRUE    -       FALSE   FALSE
   6      FALSE   TRUE    TRUE    -       TRUE    TRUE
   7      TRUE    -       FALSE   FALSE   -       FALSE
   8      TRUE    -       FALSE   TRUE    FALSE   FALSE
   9      TRUE    -       FALSE   TRUE    TRUE    TRUE
   10     TRUE    -       TRUE    -       FALSE   FALSE
   11     TRUE    -       TRUE    -       TRUE    TRUE
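The "-" entries in tables 4-33 and 4-34 (operands never evaluated) can be confirmed mechanically. In this sketch (our own instrumentation), each operand is routed through a logging function, so C's short circuiting becomes visible in the output:

   #include <stdio.h>
   #include <stdbool.h>

   static bool val[5];               /* current test vector for A..E */
   static bool probe(int i)          /* log and return operand i     */
   {
      printf(" %c", "ABCDE"[i]);
      return val[i];
   }
   #define A probe(0)
   #define B probe(1)
   #define C probe(2)
   #define D probe(3)
   #define E probe(4)

   int main(void)
   {
      /* Test 5 from table 4-34: A=F, B=T, C=T, D=-, E=F */
      val[0] = false; val[1] = true; val[2] = true;
      val[3] = true;  val[4] = false;
      printf("evaluated:");
      bool r = ((A || B) && (C || D)) && E;
      printf(" -> %d\n", r);         /* D never appears in the log */
      return 0;
   }

Running it prints "evaluated: A B C E -> 0", matching row 5's dash under D.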
For test 1, we have both A and B set to FALSE. That makes C, D, and E short-circuit, because the AND signs ensure that they no longer matter. We lose seven redundant cases right there.

For test 2, we lose a single test case because, with C and D both evaluating to FALSE, the entire first four terms evaluate to FALSE, making E short-circuit.

With tests 3 and 4, we must evaluate both expressions completely, because the expression in the parentheses evaluates to TRUE, making E the proximate cause of the output expression.

For both tests 5 and 6, the D atomic condition is short-circuited because C is TRUE; however, we still need to evaluate E because the entire calculation in the parentheses evaluates to TRUE. You might ask, how can we short-circuit D but not E? Remember that the compiler will output code to compute values in parentheses first; therefore, the calculation inside the parentheses can be short-circuited, but the expression outside the parentheses is not affected.

Likewise, for tests 7, 8, 9, 10, and 11, we short-circuit B, since A is TRUE. For test 7, we can ignore E, since the sub-expression inside the parentheses evaluates to FALSE. Tests 8 and 9 must evaluate to the end, since the calculation in the parentheses evaluates to TRUE. And finally, 10 and 11 also short-circuit D.

Essentially, when we're testing to multiple condition coverage with a short-circuiting language, each sub-expression must be considered to determine whether short circuiting can be applied or not. Of course, this is an exercise that can be done during the analysis and design phase of the project, when we are not on the critical path (before the code is delivered into test). Every test case that can be thrown out because of this analysis of short circuiting will save us time when we are on the critical path (i.e., actually executing test cases).

4.3.1.11 Control-Flow Exercise

Following is a snippet of Delphi code that Jamie wrote for a test management tool. This code controls whether to allow a drag-and-drop action of a test case artifact to continue or whether to return an error and disallow it.

■ Tests are leaf nodes in a hierarchical feature/requirement/use-case tree.
■ Each test is owned by a feature, a requirement, or a use case. Tests cannot own other tests.
■ Tests can be copied, moved, or cloned from owner to owner.
■ If a test is dropped on another test, it will be copied to the parent of the target test.
■ This code was designed to prevent a test from being copied to its own parent, unless the intent was to clone it.
■ This code was critical … and buggy.

Using the code snippet in figure 4-41:

1. Determine the total number of tests needed for multiple condition coverage.
2. If the compiler is set to short-circuit, which of those tests are actually needed?
3. Determine the tests required for decision/condition (condition determination) coverage.

   // Don't allow test to be copied to its own parent when dropped on test
   If ( (DDSourceNode.StringData2 = Test) AND
        (DDTgtNode.StringData2 = Test) AND
        (DDCtrlPressed OR DDShiftPressed) AND
        (DDSourceNode.Parent = DDTgtNode.Parent) AND
        (NOT DoingDuplicate)) Then
   Begin
      Raise TstBudExcept.Create('You may not copy test to its own parent.')
   End;

Figure 4–41

The answer is shown in the following section.

4.3.1.12 Control-Flow Exercise Debrief

1. Determine the total number of tests needed for multiple condition coverage.

We find it is almost always easier to rewrite this kind of predicate using simple letters as variable names than to do the analysis using the long names. The long names are great for documenting the code, just hard to use in an analysis.

   If ( (DDSourceNode.StringData2 = Test) AND
        (DDTgtNode.StringData2 = Test) AND
        (DDCtrlPressed OR DDShiftPressed) AND
        (DDSourceNode.Parent = DDTgtNode.Parent) AND
        (NOT DoingDuplicate)) Then
   Begin
This code translates to

   if ((A) AND (B) AND (C OR D) AND (E) AND (NOT F)) then

where

   A = DDSourceNode.StringData2 = Test
   B = DDTgtNode.StringData2 = Test
   C = DDCtrlPressed
   D = DDShiftPressed
   E = DDSourceNode.Parent = DDTgtNode.Parent
   F = DoingDuplicate

With six different atomic conditions, we can build a truth table to determine the number of multiple condition tests we would need without short circuiting. Table 4-35 contains the truth table covering the six variables; hence, 64 tests would be needed for multiple condition coverage.

Table 4–35  (Tests 1–64: every TRUE/FALSE combination of A, B, C, D, E, and F, from all TRUE in test 1 to all FALSE in test 64, with A varying slowest and F alternating on every test.)
2. If the compiler is set to short-circuit, which of those tests are actually needed?

If the compiler generated short-circuiting code, the actual number of test cases would be fewer. We work through this in a pattern based on the rules of short circuiting (i.e., the compiler generates code to evaluate from left to right, subject to the general rules of scoping and parentheses, and as soon as the outcome is determined, it stops evaluating).

Test cases:

1. A = FALSE, all others don't matter; resolves to FALSE. With A FALSE, it does not matter what any of the other atomic conditions are; the whole condition evaluates to FALSE. Tests 33–64 are all covered by this one test.
2. A = TRUE, B = FALSE, others don't matter; resolves to FALSE. With A TRUE and B FALSE, it does not matter what the other atomic conditions are; the whole condition evaluates to FALSE. Tests 17–32 are covered by this one test.
3. A, B = TRUE, C, D = FALSE, all others don't matter; resolves to FALSE. With A and B TRUE and both C and D FALSE, the remaining atomic conditions will not be evaluated; the entire condition evaluates to FALSE. This test covers tests 13–16.
4. A, B = TRUE, one or both of C, D = TRUE, E = FALSE, F does not matter; resolves to FALSE. With A and B TRUE and either one or both of C and D held TRUE, if E is FALSE, it does not matter what value F takes; the entire condition evaluates to FALSE. This test covers tests 3, 4, 7, 8, 11, and 12.

All the rest of the tests (1, 2, 5, 6, 9, and 10) are singletons that must be tested, because the deciding factor in each is the value of F, the last atomic condition. That is, A, B, C, D, and E together all evaluate to TRUE. They are ANDed to (NOT F) to resolve the full expression, meaning the value of F determines the final result. In these cases, when F is TRUE, the expression is FALSE, and vice versa.

5. A, B, C, D, E, F = TRUE; resolves to FALSE (test 1)
6. A, B, C, D, E = TRUE, F = FALSE; resolves to TRUE (test 2)
7. A, B, C, E, F = TRUE, D = FALSE; resolves to FALSE (test 5)
8. A, B, C, E = TRUE, D, F = FALSE; resolves to TRUE (test 6)
9. A, B, D, E, F = TRUE, C = FALSE; resolves to FALSE (test 9)
10. A, B, D, E = TRUE, C, F = FALSE; resolves to TRUE (test 10)
Therefore, we would actually need to run 10 test cases to achieve multiple condition coverage when the compiler short-circuits.

3. Determine the tests required for decision/condition (condition determination) coverage.

The final portion of this exercise is to determine the test cases needed for decision/condition (condition determination) testing. Remember from the earlier section that we need two separate rules to be satisfied:

1. Each atomic condition must be tested both ways.
2. Decision coverage must be achieved.

It is a matter, then, of making sure we have both of those covered.

Table 4–36

   Test #   A   B   C   D   E   F   Result
   1        T   T   T   T   T   T   F
   2        F   F   F   F   F   F   F
   3        T   T   T   T   T   F   T

Tests 1 and 2 in table 4-36 evaluate each of the atomic conditions both ways, and test 3 gives us decision coverage. Note that this does not seem like much testing. We might want to go from here to MC/DC coverage. While this was not required in the exercise, it is good practice. (The sketch below makes the predicate executable so that vectors like these can be checked.)

There are three rules for modified condition/decision coverage:

1. Each atomic condition must be evaluated both ways.
2. Decision coverage must be satisfied.
3. Each condition must affect the outcome independently when the other conditions are held steady.
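To double-check vectors like these, it helps to make the predicate executable. Below is a C transliteration of the exercise predicate (our own; we assume the Delphi semantics carry over directly), run against the three decision/condition vectors from table 4-36:

   #include <stdio.h>
   #include <stdbool.h>

   /* if ((A) AND (B) AND (C OR D) AND (E) AND (NOT F)) then ... */
   static bool predicate(bool a, bool b, bool c, bool d, bool e, bool f)
   {
      return a && b && (c || d) && e && !f;
   }

   int main(void)
   {
      /* Vectors from table 4-36: expect F, F, T */
      printf("test 1: %d\n", predicate(true,  true,  true,  true,  true,  true));
      printf("test 2: %d\n", predicate(false, false, false, false, false, false));
      printf("test 3: %d\n", predicate(true,  true,  true,  true,  true,  false));
      return 0;
   }

The same function can be reused for the MC/DC check that follows: flip one argument of a passing vector at a time and confirm that the result inverts.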
ISTQB Glossary
path testing: A white-box test design technique in which test cases are designed to execute paths.

Table 4–37

   Test #   A   B   C   D   E   F   Result
   1        T   T   T   T   T   T   F
   2        T   T   T   T   T   F   T
   3        F   T   T   T   T   F   F
   4        T   F   T   T   T   F   F
   5        T   T   T   F   T   F   T
   6        T   T   F   T   T   F   T
   7        T   T   T   T   F   F   F

Test 1: All values are set to TRUE, with an output of FALSE.
Test 2: F affects the output (if it moved to TRUE, it would change the output).
Test 3: A affects the output (if it moved to TRUE, it would change the output).
Test 4: B affects the output (if it moved to TRUE, it would change the output).
Tests 5 and 6: C and D affect the output (in test 5, D is FALSE, so moving C to FALSE would change the output; in test 6, C is FALSE, so moving D to FALSE would change the output).
Test 7: E affects the output (if it moved to TRUE, it would change the output).

4.3.2 Path Testing

We have looked at many different control-flow coverage schemes. But there are other ways to design test cases using the structure as a test basis. In this section, we will look at path testing. Remember that we have covered three main ways of approaching test design so far:

1. Statement testing, where the statements themselves drove the coverage.
2. Decision testing, where the branches drove the coverage.
3. Condition, decision/condition, modified condition/decision, and multiple condition coverage, all of which looked at sub-expressions and atomic conditions of a particular decision.
ISTQB Glossary
dd-path: A path of execution (usually through a graph representing a program, such as a flowchart) that does not include any conditional nodes, such as the path of execution between two decisions.
LCSAJ: Linear Code Sequence and Jump, consisting of the following three items (conventionally identified by line numbers in a source code listing): the start of the linear sequence of executable statements, the end of the linear sequence, and the target line to which control flow is transferred at the end of the linear sequence.

Now we are going to look at path testing. Rather than concentrating on control flows, path testing tries to identify interesting paths through the code that we may want to test. Frankly, in some cases we will get some of the same tests we got through control-flow testing. On the other hand, we might come up with some other interesting tests.

When we discuss path testing, we can start out with a bad idea: brute-force testing, meaning test every independent path through the system. Got infinity? That's how long it would take for any non-trivial system. Loops are the killers here. Every loop is a potential black hole where each different iteration through the loop creates more paths. We would need infinite time and infinite resources. Well, it was an idea...

Perhaps we can subset the infinite paths to come up with a smaller number of possible test cases that are interesting. There are a couple of ways of doing this.

4.3.2.1 LCSAJ

First, let's start with Linear Code Sequence and Jump testing. Since that is such a mouthful, we will simply call it LCSAJ testing. LCSAJs are small blocks of code that fit a particular profile. There are three artifacts that constitute an LCSAJ, each one a number that represents a line of code.

First, there is the starting line. There are two ways an LCSAJ can start: either at the beginning of a module or at a line that was jumped to by another LCSAJ.

Second, there is the end of the LCSAJ. There are two ways one can end: either at the end of a module or at a line of code where a jump occurs. A jump means that we do not move to the next sequential line of code but rather load a different address and jump to it, beginning execution there.

Third, there is the target of the jump. This is also a line number.

ISTQB, in the Advanced syllabus, has deemed that LCSAJ is synonymous with DD-path testing (where DD stands for decision to decision).10

The LCSAJ analysis method was devised by Professor Michael Hennell of the University of Liverpool so he could perform quality assessments on mathematical libraries for nuclear physics in 1976. The value of using LCSAJ will be discussed in the upcoming section; for now, we can consider it a relatively high-overhead methodology for testing.

Formally, we say that software modules are made up of linear sequences of code that are jumped to, executed, and then jumped from. Our earlier definition, small blocks of code, matches this formal definition well. Building tests from LCSAJs starts with identifying them and then creating data to force their traversal.

Our coverage measurement should look familiar; it is the number of LCSAJs that are executed divided by the total number that exist. This metric is indeed interesting, and we will discuss it later. Finally, the bug hypothesis is pretty much what you might think: bugs might hide in blocks of code that escape execution.

We are going to show an example that is small enough to be clear and large enough to be non-trivial. This example was taken from Wikipedia. Figure 4-42 contains the code and an LCSAJ table. The algorithm for designing a test is straightforward:

1. A path starts at either the start of the module or a line that was jumped to from somewhere.
2. Execution follows a linear, sequential path until it gets to a place where it must jump.
3. Find data that force execution through that path and use it in a test case.

10. Not every source agrees with this analysis. Edward F. Miller was credited with devising decision-to-decision path testing as a structural testing technique in the tutorial Program Testing Techniques, held during the first Computer Software and Applications Conference (COMPSAC), 1977. Since LCSAJ coverage is a relatively minor and seldom-used code coverage metric that does not have the practical, real-world applicability of statement, branch/decision, condition/decision, modified condition/decision, and multiple condition coverage, it is of limited value to try to get to the root of the disagreement.
One hundred percent coverage depends on identifying all paths and then testing them. Again, we will discuss this after the example. We have identified eight separate LCSAJs in the code in figure 4-42.

Figure 4–42  LCSAJ example code and table

The first LCSAJ starts at the beginning of the module (line 10), executes to the while loop (line 17), and then jumps to the end of the while loop (meaning it never executed the loop). This block is a perfect example of why LCSAJs are so problematic for testing. In order for this block of code to execute this way, we would need to find a time when the while loop does not loop at all. The Boolean expression for the while loop (count < ITERATIONS) will always evaluate to TRUE, and hence always loop, given the values the variables are initialized to. The variable count is initialized to 0 in line 16. That is always going to happen when this function is started. Likewise, ITERATIONS is a named constant initialized to 750, a magic number that is also invariant. Since 0 is always going to be less than 750 upon first execution, this LCSAJ can never be executed unless we force it to through the use of a debugger. The question to be asked is, is it worthwhile forcing a test for a condition that can literally never happen?

LCSAJ 2 is going to execute. It also starts at the beginning of the module and continues executing until it reaches line 21. That means the while loop fires. val is set with a random number between 0 and MAXCOLUMNS (26), the array totals at index [val] is set to 1 (0 + 1, actually), and the if condition is evaluated. Since the array at that index was just set to 1, the if statement on this first iteration will always evaluate to FALSE, causing the jump to line 25.

LCSAJ 3 is never going to execute, for the reason just given in LCSAJ 2. This block would require the system to start, go to the while and loop, go to the if statement in line 21, and evaluate TRUE. Since this is impossible given the values involved, it cannot be executed without forcing values through a debugger.

LCSAJ 4 will execute once each time through the function. It starts at line 17 and jumps from 17 to the end of the while loop at line 28. When does this happen? The last time we evaluate the condition (count < ITERATIONS)—that is, after the 750th time the while loops. At that point, count will have been incremented 750 times and the condition will determine that (750 < 750) is FALSE. At this point the jump occurs.

LCSAJ 5 will occur a great number of times, every time the if condition evaluates to FALSE. We start at the while loop (line 17), go to the if statement (line 21), evaluate FALSE, and jump to the end of the if, line 25.

LCSAJ 6 may or may not execute, depending on the random numbers that are generated. We start at the while loop, go into the loop, and continue to the if statement. If the particular array cell pointed to by val has been incremented enough times, then the condition comparing that cell's value with MAXCOUNT may actually be TRUE. At this point, with the condition evaluating TRUE, we would continue into the then part of the if, executing line 23, continuing on to line 25, and finally looping back to 17 from line 26.

LCSAJ 7 will occur every time we are in the loop and the if statement at line 21 evaluates to FALSE. The jump lands us at line 25; we increment count and then loop back to 17, the while evaluation.

And finally, the last LCSAJ, 8, will occur after the while loop finally ends. We jump to 28 to start the block, execute line 28 (the body of the block), and then end the function.

Make sure you understand each of these blocks before continuing on.
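Figure 4-42's listing is not reproduced in this extract, so the following C sketch is our own reconstruction, written only to be consistent with the walkthrough above. The identifiers ITERATIONS, MAXCOLUMNS, val, totals, count, and MAXCOUNT come from the text; the line numbers in comments are approximate, and the MAXCOUNT value, function name, and output statements are guesses.

   #include <stdio.h>
   #include <stdlib.h>

   #define ITERATIONS 750            /* named constant from the text      */
   #define MAXCOLUMNS 26
   #define MAXCOUNT   50             /* hypothetical threshold            */

   void tally(void)                  /* line 10: start of module          */
   {
      int totals[MAXCOLUMNS] = {0};
      int val, count;

      count = 0;                     /* line 16                           */
      while (count < ITERATIONS) {   /* line 17: loop decision            */
         val = rand() % MAXCOLUMNS;        /* random column 0..25         */
         totals[val] = totals[val] + 1;
         if (totals[val] > MAXCOUNT) {     /* line 21                     */
            printf("column %d is hot\n", val);   /* line 23               */
         }
         count++;                    /* line 25                           */
      }                              /* line 26: jump back to 17          */
      printf("done: %d draws\n", count);   /* line 28                     */
   }

   int main(void) { tally(); return 0; }

With these values, LCSAJ 1 (zero iterations of the while) and LCSAJ 3 (the if TRUE on the very first pass) are indeed infeasible, which is exactly the problem the next paragraphs describe.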
There are a number of issues that we must discuss when trying to use LCSAJs. As shown, a number of LCSAJs are really not possible to execute—there is no known way of testing them without radically changing initializations or forcing the code by setting values with a debugger. That means we can almost never get to 100 percent coverage. There are those who insist that forcing the code to execute is essential for good testing, even when it would appear to be nonsensical. We wonder whether the time needed to execute the "impossible" might not be better spent executing other test cases.

A paper detailing LCSAJ testing states it this way:

    "The large amount of analysis required for infeasible LCSAJs is the main reason LCSAJ coverage is not a realistically achievable test metric."11

All things are relative, of course. If our lives depended on this code never failing and we had unlimited resources, we might try to achieve LCSAJ coverage. We certainly don't believe that this is a reasonable technique to use on every project we work on.

There are also some pragmatic reasons to think long and hard before investing time in this particular path design methodology. Since identifying LCSAJs can only be done after the code has been delivered, we get no early testing when performing this technique. And, since small changes to the code can cause radical changes to the LCSAJs, it is a particularly brittle technique, which is likely to lead to very expensive maintenance issues.

The holy grail of testing would be to achieve 100 percent path coverage. Testing every single unique path would allow us to find all bugs, right? Well, maybe not. We would still have failures due to different configurations, timing and race conditions, etc. LCSAJs might lead to a partial solution by finding some defects we might otherwise have missed, but the additional cost for marginal results might be problematic.

Path coverage is likely to remain low, frustratingly so if you do the math. The standard way of determining coverage is to divide the number of entities tested by the full number of entities—in this case, the number of paths tested divided by the full number of paths. Suppose we did a huge amount of testing, resulting in testing 1 million paths. Since there are likely to be an infinite number of paths, our percentage of paths tested approaches zero. Not too encouraging...

11. IPL Structure Testing.pdf, Information Processing Ltd. (IPL.com); it is available on IPL's website.

So, if full path coverage is not happening, and LCSAJ is too—shall we say—iffy, how can we get some practical usage out of path theory? Perhaps we could try to identify a practical subset of paths that would be interesting to test and give us some information we might not otherwise find.

ISTQB Glossary

basis test set: A set of test cases derived from the internal structure of a component or specification to ensure that 100% of a specified coverage criterion will be achieved.

4.3.2.2 Basis Path/Cyclomatic Complexity Testing

One suggestion is to test the structure of the code to a reasonable amount. Thomas McCabe was instrumental in coming up with the idea of basis path testing. In December 1976, he published his paper "A Complexity Measure"12 in the IEEE Transactions on Software Engineering. He theorized that any software module has a small number of unique, independent paths (excluding iterations) through it. He called these basis paths. The theory states that the structure of the code can be tested by executing through this small number of paths and that the infinity of different paths are actually just using and reusing the basis paths.

To try to get the terminology right, consider the following definition from Boris Beizer's book, Software Testing Techniques:

    A path through the software is a sequence of instructions or statements that starts at an entry, junction, or decision and ends at another, or possibly the same, junction, decision, or exit. A path may go through several junctions, processes, or decisions one or more times.

12. http://portal.acm.org/citation.cfm?id=800253.807712 or http://en.wikipedia.org/wiki/Cyclomatic_complexity and click the second reference at the bottom.

A basis path is defined as a unique independent path through the module—with no iterations or loops allowed. The basis path set is the smallest number of basis paths that cover the structure of the code. Creating a set of tests that cover these basis paths, the basis set, will guarantee us both statement and decision coverage. The basis path set has also been called the minimal path set for coverage.

Cyclomatic complexity is what Thomas McCabe called this theory of basis paths. The term cyclomatic refers to an imaginary loop from the end of a module of code back to the beginning of it. How many times would you need to cycle through the loop until the structure of the code has been completely covered? This cyclomatic complexity number is the number of loops needed and—not coincidentally—the number of test cases we need to test the set of basis paths. The complexity depends not on the size of the module but on the number of decisions in it.

McCabe, in his paper, pointed out that the higher the complexity, the more likely there will be a higher bug count. Subsequent studies have predominantly shown such a correlation; modules with the highest complexity tend to also contain the highest number of defects. Since the more complex the code, the harder it is to understand and maintain, a reasonable argument can be made to keep the level of complexity down. McCabe's suggestion was to split larger, more-complex modules into smaller, less-complex modules.

We can measure cyclomatic complexity by creating a directed control-flow graph. This works well for small modules, and we will do that next. However, in real life, tools are generally used to measure module complexity.

In our control-flow graph, we have nodes to represent entries, exits, and decisions. Edges represent non-branching code statements that do not add to the complexity.

In general, the higher the complexity, the more test cases we need to cover the structure of the code. Remember, if we cover the basis path set, we are guaranteed both statement and decision coverage.
    EuclidGCD (int a, int b)
    {
        if (a <= 0)
            return -1;
        if (b <= 0)
            return -1;
        int t;
        if (b > a) {
            t = a;
            a = b;
            b = t;
        }
        t = a % b;
        while (t != 0) {
            a = b;
            b = t;
            t = a % b;
        }
        return b;
    }

    Cyclomatic Complexity:
    C = #R + 1 = 4 + 1 = 5
    or
    C = #E - #N + 2 = 9 - 6 + 2 = 5

    [The figure also shows the corresponding McCabe directed control-flow graph, not reproduced here.]

Figure 4–43 Cyclomatic complexity example

On the left side of figure 4-43 we have a function to calculate the greatest common divisor of two numbers using Euclid's algorithm. The dotted lines from the code to the McCabe flow diagram in the center of the figure show how non-branching sequences of zero or more statements become edges (arrows) and how branching and looping constructs become nodes (bubbles).

On the right side of the figure, you see the two methods of calculating McCabe's cyclomatic complexity metric. Seemingly the simplest is the "enclosed region" calculation. The four enclosed regions (R1, R2, R3, R4), represented by R in the upper equation, are found in the diagram by noting that each decision (bubble) has two branches that, in effect, enclose a region of the graph. The other method of calculation involves counting the edges (arrows) and the nodes (bubbles) and applying those values to the calculation, E - N + 2, where E is the number of edges and N is the number of nodes.

Now, this is all simple enough for a small, simple method like this. For larger functions, drawing the graph and doing the calculation from it can be really painful. So, a simple rule of thumb is this: count the branching and looping constructs and add 1. The if statements and the for, while, and do/while constructs each count as one. For switch/case constructs, each case block counts as one. In if and ladder if constructs, the else does not count. For switch/case constructs, the default block does not count. This is a rule of thumb, but it usually seems to work. In the sample code, there are three if statements and one while statement: 3 + 1 + 1 = 5.

When we introduced McCabe's theory of cyclomatic complexity a bit ago, we mentioned basis paths and basis tests. Figure 4-44 shows the basis paths and basis tests on the right side.

    [The figure repeats the EuclidGCD code from figure 4-43 alongside the following:]

    Basis Paths:
    1. ABF
    2. ABCF
    3. ABCDEF (1)
    4. ABCDEF (2)
    5. ABCDEEF

    Basis Tests (inputs and expected result):
    1. -5, 2  -> -1
    2. 2, -5  -> -1
    3. 16, 8  ->  8
    4. 4, 8   ->  4
    5. 20, 8  ->  4

Figure 4–44 Basis tests from directed flow graph

The number of basis paths is equal to the cyclomatic complexity. You construct the basis paths by starting with an arbitrary path through the diagram, from entry to exit. Then add another path that covers a minimum number of previously uncovered edges, repeating this process until all edges have been covered at least once.

Following the IEEE and ISTQB definition that a test case substantially consists of input data and expected output data, the basis tests are the inputs and expected results associated with each basis path. Usually, the basis tests will correspond with the tests required to achieve branch coverage. This makes sense because complexity increases anytime more than one edge leaves a node in a directed control-flow diagram. In this kind of control-flow diagram, a situation where more than one edge leaves a node represents a branching or looping construct where a decision is made. The short program below turns the basis tests of figure 4-44 into executable checks.
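To make the connection between basis tests and executable tests concrete, here is a minimal, self-contained version of the EuclidGCD function with a driver that runs the five basis tests from figure 4-44 and checks the expected results. The function body is taken from the figure; the harness around it is our own sketch.

    #include <stdio.h>

    int EuclidGCD(int a, int b)
    {
        if (a <= 0)
            return -1;
        if (b <= 0)
            return -1;
        int t;
        if (b > a) {        /* ensure a >= b before dividing          */
            t = a;
            a = b;
            b = t;
        }
        t = a % b;
        while (t != 0) {    /* Euclid: replace (a, b) with (b, a mod b) */
            a = b;
            b = t;
            t = a % b;
        }
        return b;
    }

    int main(void)
    {
        /* The five basis tests from figure 4-44: two inputs, expected result. */
        int tests[5][3] = {
            { -5,  2, -1 },   /* test 1: first guard clause fires        */
            {  2, -5, -1 },   /* test 2: second guard clause fires       */
            { 16,  8,  8 },   /* test 3: no swap; loop body never runs   */
            {  4,  8,  4 },   /* test 4: swap needed; loop body skipped  */
            { 20,  8,  4 },   /* test 5: while loop actually iterates    */
        };
        for (int i = 0; i < 5; i++) {
            int actual = EuclidGCD(tests[i][0], tests[i][1]);
            printf("Test %d: EuclidGCD(%d, %d) = %d (expected %d) %s\n",
                   i + 1, tests[i][0], tests[i][1], actual, tests[i][2],
                   actual == tests[i][2] ? "PASS" : "FAIL");
        }
        return 0;
    }

Running the five tests exercises every edge of the control-flow graph at least once, which is exactly the statement- and decision-coverage guarantee claimed for the basis set.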
What use is this tidbit of information? Well, suppose you were talking to a programmer about their unit testing. You ask how many different sets of inputs they used. If they tell you a number that is less than the McCabe cyclomatic complexity metric for the code they are testing, it's a safe bet they did not achieve branch coverage. That implies, as was mentioned earlier, that they did not cover the equivalence partitions.

One more thing on cyclomatic complexity. Occasionally, you might hear someone argue that the actual formula for McCabe's cyclomatic complexity is

    C = E - N + P

whereas we list it as

    C = E - N + 2P

A careful reading of the original paper does show both formulae. However, the directed graph that accompanies the former version of the formula (E - N + P) also shows an edge drawn from the exit node to the entrance node that is not there when he uses the latter equation (E - N + 2P). This edge is drawn differently than the other edges—as a dashed line rather than a solid line—showing that while he is using it theoretically in the math proof, it is not actually there in the code. This extra edge is counted when it is shown and must be added when not shown (as in our example).

The P value stands for the number of connected components.13 In our examples, we do not have any separate connected components, so by definition, P = 1. To avoid confusion, we abstract out the P and simply use 2 (P equal to 1, plus the missing theoretical line connecting the exit to the entrance node). The mathematics of this is beyond the scope of this book; suffice it to say, unless an example directed graph contains an edge from the exit node to the entry node, the formula that we used is correct. A short worked example appears below.

We'll revisit cyclomatic complexity later when we talk about static analysis.

13. A connected component, in this context, would be a called subroutine.
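To see how P behaves when there is more than one component, consider a hypothetical example of our own (not taken from McCabe's paper): a routine with E = 9 edges and N = 6 nodes that calls a subroutine with E = 4 edges and N = 4 nodes. Treated separately, the two graphs yield (9 - 6 + 2) = 5 and (4 - 4 + 2) = 2, for a total of 7. Treated as one graph with two connected components, we have E = 13, N = 10, and P = 2, and the formula C = E - N + 2P gives 13 - 10 + 4 = 7, the same answer. That is why, for a single module (P = 1), the 2P term reduces to the constant 2 that we have been using.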
4.3.2.3 Cyclomatic Complexity Exercise

The following C code function loads an array with random values.

    1.  int main (int MaxCols, int Iterations, int MaxCount)
    2.  {
    3.    int count = 0, totals[MaxCols], val = 0;
    4.
    5.    memset (totals, 0, MaxCols * sizeof(int));
    6.
    7.    count = 0;
    8.    if (MaxCount > Iterations)
    9.    {
    10.     while (count < Iterations)
    11.     {
    12.       val = abs(rand()) % MaxCols;
    13.       totals[val] += 1;
    14.       if (totals[val] > MaxCount)
    15.       {
    16.         totals[val] = MaxCount;
    17.       }
    18.       count++;
    19.     }
    20.   }
    21.   return (0);
    22. }

1. Create a directed control-flow graph for this code.
2. Using any of the methods given in the preceding section, calculate the cyclomatic complexity.
3. List the basis tests that could be run.

The answers are in the following section.

4.3.2.4 Cyclomatic Complexity Exercise Debrief

The directed control-flow graph should look like figure 4-45. Note that the edges D1 and D2 are labeled; D1 is where the if conditional in line 14 evaluates to TRUE, D2 is where it evaluates to FALSE.

We could use three different ways to calculate the cyclomatic complexity of the code, as shown in the box on the right in figure 4-45.

    [The figure shows a directed control-flow graph with nodes A (entry, line 1), B (if, line 8), C (while, line 10), D (if, line 14, with TRUE branch D1 and FALSE branch D2), and E (exit, line 22), alongside these calculations:]

    C = #R + 1 = 3 + 1 = 4
    or
    C = #E - #N + 2 = 7 - 5 + 2 = 4
    or
    C = #Conditionals + 1 = 3 + 1 = 4

Figure 4–45 Cyclomatic complexity exercise

First, we could calculate the number of test cases by the region method. Remember, a region is an enclosed space. The first region can be seen on the left side of the image. The curved line goes from B to E; it is enclosed by the nodes (and edges between them) B-C-E. The second region is the top edge that goes from C to D and is enclosed by line D1. The third is the region with the same top edge, C-D, and is enclosed by D2. Here is the formula:

    C = #Regions + 1
    C = 3 + 1
    C = 4

The second way to calculate cyclomatic complexity uses McCabe's cyclomatic complexity formula. Remember, we count up the edges (lines between bubbles) and the nodes (the bubbles themselves) as follows:

    C = E - N + 2
    C = 7 - 5 + 2
    C = 4

Finally, we could use our rule-of-thumb measure, which usually seems to work: count the number of places where decisions are made and add 1. In the code itself, we have line 8 (an if() statement), line 10 (a while() loop), and line 14 (an if() statement):

    C = #decisions + 1
    C = 3 + 1
    C = 4

In each case, the cyclomatic complexity is equal to 4. That means our basis set of tests would also number 4. The following test cases would cover the basis paths:

1. ABE
2. ABCE
3. ABCD(D1)CE
4. ABCD(D2)CE

A small harness for driving these paths appears below.
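The exercise code uses an unconventional main() signature, so it cannot be compiled and driven as written. Here is one way, a sketch of our own, to repackage it as an ordinary function plus a driver that steers execution down the basis paths; the parameter values are our own choices, picked to force each path.

    #include <stdlib.h>
    #include <string.h>

    int fill_totals(int MaxCols, int Iterations, int MaxCount)
    {
        int count = 0, val = 0;
        int totals[MaxCols];                     /* C99 variable-length array */

        memset(totals, 0, MaxCols * sizeof(int));

        if (MaxCount > Iterations) {             /* node B, line 8  */
            while (count < Iterations) {         /* node C, line 10 */
                val = abs(rand()) % MaxCols;
                totals[val] += 1;
                if (totals[val] > MaxCount) {    /* node D, line 14 */
                    totals[val] = MaxCount;      /* edge D1         */
                }                                /* edge D2         */
                count++;
            }
        }
        return 0;
    }

    int main(void)
    {
        fill_totals(26, 750, 5);    /* ABE: MaxCount <= Iterations, line 8 FALSE */
        fill_totals(26, 0, 10);     /* ABCE: guard TRUE, but loop never iterates */
        fill_totals(26, 750, 1000); /* ABCD(D2)CE: loop runs, line 14 stays FALSE */
        /* ABCD(D1)CE cannot be reached with legal inputs: entering the loop
           requires MaxCount > Iterations, and no cell of totals[] can exceed
           Iterations, so totals[val] > MaxCount is never TRUE. Forcing this
           path takes a debugger, just like the infeasible LCSAJs earlier. */
        return 0;
    }

Notice, in passing, that the D1 branch turns out to be infeasible under the guard at line 8; that observation is ours, and it echoes the chapter's earlier warnings about paths that exist on the graph but not in practice.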
4.3.3 A Final Word on Structural Testing

Before we leave structural testing, a final word: Boris Beizer wrote a pithy argument seemingly against doing white-box, structure-based testing in his book Software Testing Techniques, first published in 1990. We think that he can express it better than we can, so it is included here:

    Path testing is more effective for unstructured rather than structured code. Statistics indicate that path testing by itself has limited effectiveness, for the following reasons:

    ■ Planning to cover does not mean you will cover—especially when there are bugs contained.
    ■ It does not show totally wrong or missing functionality.
    ■ Interface errors between modules will not show up in unit testing.
    ■ Database and data-flow errors may not be caught.
    ■ Incorrect interaction with other modules will not be caught in unit testing.
    ■ Not all initialization errors can be caught by control-flow testing.
    ■ Requirements and specification errors will not be caught in unit testing.

After going through all of those problems with structural testing, Beizer goes on to say:

    Creating the flowgraph, selecting a set of covering paths, finding input data values to force those paths, setting up the loop cases and combinations—it's a lot of work. Perhaps as much work as it took to design the routine and certainly more work than it took to code it. The statistics indicate that you will spend half your time testing and debugging—presumably that time includes the time required to design and document test cases. I would rather spend a few quiet hours in my office doing test design than twice those hours on the test floor debugging, going half deaf from the clatter of a high-speed printer that's producing massive dumps, the reading of which will make me half blind. Furthermore, the act of careful, complete, systematic test design will catch as many bugs as the act of testing. … The test design process, at all levels, is at least as effective at catching bugs as is running the test designed by that process.

4.3.4 Structure-Based Testing Exercise

Using the code in figure 4-46, answer the following questions:

1. How many test cases are needed for basis path coverage?
2. If we wanted to test this module to the level of multiple condition coverage (ignoring the possibility of short circuiting), how many test cases would we need?
3. If this code were in a system that was subject to FAA DO/178B and was rated at Level A criticality, how many test cases would be needed for the first if() statement alone?
4. To achieve only statement coverage, how many test cases would be needed?

    1.  void ObjTree::AddObj(const Obj& w) {
    2.    // Make sure we want to store it
    3.    if (!(isReq() && isBeq() && (isNort()||(isFlat() && isFreq())))){
    4.      return;
    5.    }
    6.    // If the tree is currently empty, create a new one
    7.    if (root == 0) {
    8.      // Add the first obj.
    9.      root = new TObjNode(w);
    10.   } else {
    11.     TObjNode* branch = root;
    12.     while (branch != 0) {
    13.       Obj CurrentObj = branch->TObjNodeDesig();
    14.       if (w < CurrentObj) {
    15.         // Obj is new or lies to left of the current node.
    16.         if (branch->TObjNodeSubtree(LEFT) == 0) {
    17.           TObjNode* NewObjNode = new TObjNode(w);
    18.           branch->TObjNodeAddSubtree(LEFT, NewObjNode);
    19.           break;
    20.         } else {
    21.           branch = branch->TObjNodeSubtree(LEFT);
    22.         }
    23.       } else if (CurrentObj < w) {
    24.         // Obj is new or lies to right of the current node.
    25.         if (branch->TObjNodeSubtree(RIGHT) == 0) {
    26.           TObjNode* NewObjNode = new TObjNode(w);
    27.           branch->TObjNodeAddSubtree(RIGHT, NewObjNode);
    28.           break;
    29.         } else {
    30.           branch = branch->TObjNodeSubtree(RIGHT);
    31.         }
    32.       } else {
    33.         // Found match, so bump the counter and end the loop.
    34.         branch->TObjNodeCountIncr();
    35.         break;
    36.       }
    37.     } // while
    38.   } // if
    39.   return;
    40. }

Figure 4–46 Code for structure-based exercise

4.3.5 Structure-Based Testing Exercise Debrief

1. How many test cases are needed for basis path coverage?

The directed control-flow graph would look something like figure 4-47. It looks a bit messy, but if you look closely at it, it does cover the code.

The number of test cases we need for cyclomatic complexity can be calculated in three different ways. Well, maybe. The region method clearly is problematic because of the way the graph is drawn. This is a common problem when drawing directed control-flow graphs; there is a real art to it, especially when working in a timed atmosphere.

    [The figure shows a directed control-flow graph with nodes A (entry, line 1), B (if, line 3), C (if, line 7), D (while, line 12), E (if, line 14), F (if, line 16), G (else if, line 23), H (if, line 25), and J (exit, line 40), with TRUE/FALSE edges between them.]

Figure 4–47 Structure-based testing exercise directed control-flow graph

Let's try McCabe's cyclomatic complexity formula:

    C = E - N + 2
    C = 15 - 9 + 2
    C = 8

Alternately, we could try the rule-of-thumb method for finding cyclomatic complexity: count the decisions and add one. There are decisions made in the following lines:

    Line 3: if
    Line 7: if (Remember, the else in line 10 is not a decision.)
    Line 12: while
    Line 14: if
    Line 16: if
    Line 23: else if (This is really just an if, often called a ladder if construct.)
    Line 25: if

Therefore, the calculation is as follows:

    C = 7 + 1
    C = 8

All this means we have eight different test cases, as follows:

1. ABJ
2. ABCJ
3. ABCDEFJ
4. ABCDEFDEFJ
5. ABCDEGHJ
6. ABCDEGHDEGHJ
7. ABCDEGJ
8. ABCDJ

Notice that test 8 is a bit of a problem. If you look closely at the code, it cannot happen. Famous last words for a tester, right? It should not ever happen. We would have to use a debugger to test it by changing the value of branch to 0.

2. If we wanted to test this module to the level of multiple condition coverage (ignoring the possibility of short circuiting), how many test cases would we need?

The first conditional statement looks like this:

    (!(isReq() && isBeq() && (isNort() || (isFlat() && isFreq()))))

Each of these functions would likely be a private method of the class we are working with (ObjTree). We can rewrite the conditional to make it easier to understand:

    (!(A && B && (C || (D && E))))

A good way to start is to come up with a truth table that covers all the possibilities that the atomic conditions can take on. With five atomic conditions, there are 32 possible combinations, as shown in table 4-38. Of course, you could just calculate that as 2^5. But since we want to discuss how many test cases we would need if this code were written in a language that does short-circuit Boolean evaluation (it is!), we'll show it here.

Table 4–38

    Combination  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
    A            T T T T T T T T T T  T  T  T  T  T  T  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F
    B            T T T T T T T T F F  F  F  F  F  F  F  T  T  T  T  T  T  T  T  F  F  F  F  F  F  F  F
    C            T T T T F F F F T T  T  T  F  F  F  F  T  T  T  T  F  F  F  F  T  T  T  T  F  F  F  F
    D            T T F F T T F F T T  F  F  T  T  F  F  T  T  F  F  T  T  F  F  T  T  F  F  T  T  F  F
    E            T F T F T F T F T F  T  F  T  F  T  F  T  F  T  F  T  F  T  F  T  F  T  F  T  F  T  F

Thirty-two separate test cases would be needed. Note that 5 of these test cases evaluate the inner expression to TRUE. This evaluation would then be negated (see the negation operator in the code). So 5 test cases would survive and move to line 7 of the code. We would still need to achieve decision coverage on the other conditionals in the code (more on that later).

While the exercise does not require us to figure out the number of tests required in case of short circuiting, it is an interesting question. In table 4-39 are the test cases we would need for the first if() statement (line 3) to achieve coverage.

Table 4–39

         A      B      C      D      E      Expression Eval
    1.   FALSE  -      -      -      -      FALSE
    2.   TRUE   FALSE  -      -      -      FALSE
    3.   TRUE   TRUE   TRUE   -      -      TRUE
    4.   TRUE   TRUE   FALSE  FALSE  -      FALSE
    5.   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE
    6.   TRUE   TRUE   FALSE  TRUE   FALSE  FALSE

Test 1: If the first term goes FALSE, the other terms do not affect anything.
Test 2: If the second term goes FALSE while the first term is TRUE, the others do not matter.
Test 3: If the third term goes TRUE when A and B are TRUE, D and E do not matter.
Test 4: When A and B are TRUE, if C and D are FALSE, E does not matter.
Test 5: No short circuit is possible.
Test 6: No short circuit is possible.

The short program below confirms these counts.
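As a cross-check on the truth table, here is a small program of our own that enumerates all 32 combinations and evaluates the rewritten condition. It confirms that exactly 5 combinations make the inner expression TRUE, and therefore survive the negation and reach line 7.

    #include <stdio.h>

    int main(void)
    {
        int surviving = 0;

        /* Enumerate all 2^5 = 32 combinations of the atomic conditions A..E. */
        for (int i = 0; i < 32; i++) {
            int A = (i >> 4) & 1;
            int B = (i >> 3) & 1;
            int C = (i >> 2) & 1;
            int D = (i >> 1) & 1;
            int E = (i >> 0) & 1;

            /* The inner expression from line 3, before negation. */
            int expr = A && B && (C || (D && E));

            if (expr) {
                /* !(expr) is FALSE here, so the early return is skipped
                   and execution continues to line 7. */
                surviving++;
                printf("A=%d B=%d C=%d D=%d E=%d -> TRUE\n", A, B, C, D, E);
            }
        }
        printf("%d of 32 combinations survive to line 7\n", surviving);
        return 0;
    }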
Only two of these test cases evaluate to TRUE. Remember, this value gets negated, so only these two tests would survive to continue on to line 7. That will not be enough testing to get decision coverage throughout the rest of the code. We will need to add test cases to achieve that level; likely we would want to test some other combinations of atomic conditions just for kicks.

Test 1: Shown in table 4-39 as test 1. [ABJ]
Test 2: Shown in table 4-39 as test 2. [ABJ]
Test 3: Shown in table 4-39 as test 4. [ABJ]
Test 4: Shown in table 4-39 as test 6. [ABJ]
Test 5: Assume that the values from test 3 also have (root == 0) so the test ends immediately after creating a new ObjNode in line 9. [ABCJ]
Test 6: Assume that the values from test 5 are included when root != 0. Further assume that the object is new and belongs on the left side (w < CurrentObj). After creating a new ObjNode and populating it, the test ends. [ABCDEFJ]
Test 7: Assume that you pick a set of values for the first if (line 3) that causes the expression to trigger FALSE. Root != 0. Obj lies to the left and down one level and is not currently in the tree. [ABCDEFDEFJ]
Test 8: Assume that the values from test 5 are included when root != 0. Further assume that the object is new and belongs on the right side (w > CurrentObj). After creating a new ObjNode and populating it, the test ends. [ABCDEGHJ]
Test 9: Assume that you pick a set of values for the first if (line 3) that causes the expression to trigger FALSE. Root != 0. Obj lies to the right and down one level and is not currently in the tree. [ABCDEGHDEGHJ]
Test 10: Assume that you pick a set of values for the first if (line 3) that causes the expression to trigger FALSE. Root != 0. Object is the first object tested, and so the counter is incremented. [ABCDEGJ]

Note: With 10 tests, we do achieve multiple-condition-level coverage (assuming that the compiler writes code to short-circuit the testing); however, we do not have loop coverage for the while() on line 12 because it cannot execute 0 times. At the minimum, we would also want to test the loop many times (i.e., a very large binary tree where the object does not already exist).

3. If this code were in a system that was subject to FAA DO/178B and was rated at Level A criticality, how many test cases would be needed for the first if() statement alone?

The standard that we must meet mentions five different levels of criticality (see table 4-40).

Table 4–40

    Criticality                  Required Coverage
    Level A: Catastrophic        Modified condition/decision, decision, and statement
    Level B: Hazardous/Severe    Decision and statement
    Level C: Major               Statement
    Level D: Minor               None
    Level E: No effect           None

This means that we must achieve MC/DC-level coverage for the first if() statement. Note that there are no other compound conditional statements in the other decisions, so we can ignore them; decision coverage will cover them.

To achieve MC/DC coverage, we must ensure the following:

A. Each atomic condition is evaluated both ways (T/F).
B. Decision coverage must be satisfied.
C. Each atomic condition must be able to affect the outcome independently while the other atomic conditions are held unchanged.

The expression we are concerned with follows:

    (!(A && B && (C || (D && E))))

We will start out by ignoring the initial negation, shown by the exclamation mark at the beginning. We don't know about you, but inversion logic always gives us a headache. Since we are inverting the entire expression (check out the parentheses to see that), we can just look at the inner expression. First, let's figure out how to make sure that each atomic condition can affect the outcome independently.

Table 4–41

         A  B  C  D  E  Expression Eval
    1.   F  T  T  T  T  FALSE
    2.   T  F  T  T  T  FALSE
    3.   T  T  F  F  T  FALSE
    4.   T  T  F  T  F  FALSE
    5.   T  T  F  T  T  TRUE

Test 1: With these values, A controls the output independently. If it went TRUE, the output would toggle.
Test 2: With these values, B controls the output independently. If it went TRUE, the output would toggle.
Test 3: With these values, C controls the output independently. If it goes TRUE, the output would toggle.
Test 4: With these values, E controls the output independently. If it goes TRUE, the output would toggle.
Test 5: With these values, D controls the output independently. If it goes FALSE, the output would toggle.

With these five tests, we can meet all three objectives (for the first if() statement). Among the five tests, each atomic condition is tested both ways. Decision coverage is achieved by test 5 and any other single test. And in each case, as shown, each atomic condition can affect the output independently. The little program below demonstrates the toggling.
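To make the independence argument tangible, here is a small program of our own that takes each row of table 4-41, flips only the controlling condition, and shows that the expression's outcome toggles while everything else stays fixed.

    #include <stdio.h>

    /* The inner expression from line 3, before negation. */
    static int expr(int A, int B, int C, int D, int E)
    {
        return A && B && (C || (D && E));
    }

    int main(void)
    {
        /* Rows of table 4-41: values for A..E plus the index (0=A..4=E)
           of the condition that should control the outcome independently. */
        struct { int v[5]; int controls; } rows[] = {
            { { 0, 1, 1, 1, 1 }, 0 },  /* test 1: A controls */
            { { 1, 0, 1, 1, 1 }, 1 },  /* test 2: B controls */
            { { 1, 1, 0, 0, 1 }, 2 },  /* test 3: C controls */
            { { 1, 1, 0, 1, 0 }, 4 },  /* test 4: E controls */
            { { 1, 1, 0, 1, 1 }, 3 },  /* test 5: D controls */
        };

        for (int i = 0; i < 5; i++) {
            int *v = rows[i].v;
            int before = expr(v[0], v[1], v[2], v[3], v[4]);
            v[rows[i].controls] ^= 1;   /* flip only the controlling condition */
            int after = expr(v[0], v[1], v[2], v[3], v[4]);
            printf("test %d: %s -> %s (%s)\n", i + 1,
                   before ? "TRUE" : "FALSE", after ? "TRUE" : "FALSE",
                   before != after ? "toggles" : "does NOT toggle");
        }
        return 0;
    }

All five rows print "toggles," which is precisely the MC/DC independence requirement.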
4. To achieve only statement coverage, how many test cases would be needed?

Okay, so this one is relatively simple! You could trace through the code and determine the number. Or you could notice that, the way it is written, statement coverage is going to be the same as decision coverage. How do we know that?

The first if() (line 3) needs to be tested both ways to get statement coverage. When it's TRUE, it does a quick return, which must be executed. When it's FALSE, we get to go further into the code.

The second if() (line 7) and the third (line 14), fourth (line 16), and sixth (line 25) each have a statement in both the TRUE and FALSE directions. The else if() (line 23) is the else for the if() (line 14) and has an else of its own, both of which have code that needs to execute for statement coverage.

The only confusing piece to this answer is the while() (line 12). The way the code is written, it should not allow the loop condition to ever evaluate to FALSE. That means we would have to use a debugger to change the value of branch to test it. Since we have already gone through getting decision coverage in answer 3, we will accept that answer.

4.4 Defect- and Experience-Based

Learning objectives

(K2) Describe the principle and reasons for defect-based techniques and differentiate their use from specification- and structure-based techniques.
(K2) Explain by examples defect taxonomies and their use.
(K2) Understand the principle of and reasons for experience-based techniques and when to use them.
(K3) Specify, execute, and report tests using exploratory testing.
(K3) Specify tests using the different types of software fault attacks according to the defects they target.
(K4) Analyze a system in order to determine which specification-based, defect-based, or experience-based techniques to apply for specific goals.

Now we're going to move from systematic techniques for test design into less-structured but nonetheless useful techniques. We start with defect-based and defect-taxonomy-based techniques.

Conceptually, we are doing defect-based testing anytime the type of the defect sought is the basis for the test. Usually, the underlying model is some list of defects seen in the past. If this list is organized as a hierarchical taxonomy, then the testing is defect taxonomy based.

ISTQB Glossary

defect-based technique: A procedure to derive and/or select test cases targeted at one or more defect categories, with tests being developed from what is known about the specific defect category.

experience-based technique: A procedure to derive and/or select test cases based on the tester's experience, knowledge, and intuition.

To derive tests from the defect list or the defect taxonomy, we create tests designed to reveal the presence of the defects in the list.

Now, for defect-based tests, we tend to be more relaxed about the concept of coverage. The general criterion is that we will create a test for each defect type, but it is often the case that the question of whether to create a test at all is risk weighted. In other words, if the likelihood or impact of the defect doesn't justify the effort, don't do it. However, if the likelihood or impact of the defect were high, then you would create not just one test but perhaps many. This should be starting to sound familiar to you, yes?

The underlying bug hypothesis is that programmers tend to repeatedly make the same mistakes. In other words, a team of programmers will introduce roughly the same types of bugs in roughly the same proportion from one project to the next. This allows us to allocate test design and execution effort based on the likelihood and impact of the bugs.

4.4.1 Defect Taxonomies

Table 4-42 shows an example of a defect taxonomy. It occurs in Rex's book Managing the Testing Process but was originally derived from Boris Beizer's taxonomy in Software Testing Techniques.
ISTQB Glossary

defect taxonomy: A system of (hierarchical) categories designed to be a useful aid for reproducibly classifying defects.

Table 4–42

    • Functional                    • Data
      – Specification                 – Type
      – Function                      – Structure
      – Test                          – Initial Value
    • System                          – Other
      – Internal Interface          • Code
      – Hardware Devices            • Documentation
      – Operating System            • Standards
      – Software Architecture       • Other
      – Resource Management         • Duplicate
    • Process                       • Not a Problem
      – Arithmetic                  • Bad Unit
      – Initialization              • Root Cause Needed
      – Control or Sequence         • Unknown
      – Static Logic
      – Other

There are eight main categories along with five "bookkeeping" categories that are useful when classifying bugs in a bug tracking system. Let's go through these in detail.

In the category of Functional defects, there are three subcategories:

■ Specification: The functional specification—perhaps in the requirements document or in some other document—is wrong.
■ Function: The specification is right, but the implementation of it is wrong.
■ Test: Upon close research, we found a problem in test data, test designs, test specifications, or somewhere else.

In the category of System defects, there are five subcategories:

■ Internal Interface: The internal system communication fails; in other words, there is an integration problem of some sort internal to the test object.
■ Hardware Devices: The hardware that is part of the system or that hosts the system fails.
■ Operating System: The operating system—which presumably is external to the test object—fails.
■ Software Architecture: Some fundamental design assumption proves invalid, such as an assumption that data could be moved from one table to another or across some network in some constant period.
■ Resource Management: The design assumptions were okay, but some implementation of the assumption was wrong; for example, the design of the data tables introduces delays.

In the category of Process defects, there are five subcategories:

■ Arithmetic: The software does math wrong. This doesn't mean just basic math; it can also include sophisticated accounting or numerical analysis functions, including problems that occur due to rounding and precision issues.
■ Initialization: An operation fails on its first use, when there are no data in a list, and so forth.
■ Control or Sequence: An action occurs at the wrong time or for the wrong reason—say, seeing screens or fields in the wrong order.
■ Static Logic: Boundaries are misdefined, equivalence classes don't include the right members and exclude the wrong members, and so forth.
■ Other: A control-flow or processing error that doesn't fit in the preceding categories has occurred.

In the category of Data defects, there are four subcategories:

■ Type: The wrong data type—whether a built-in or user-defined data type—is used.
■ Structure: A complex data structure or type is invalid or inappropriately used.
■ Initial Value: A data element's initialized value is incorrect, like a list of quantities to purchase that defaults to zero rather than one.
■ Other: A data-related error occurs that doesn't fit in the preceding buckets.

The category of Code applies to some simple typo, misspelling, stylistic error, or other coding error occurring and resulting in a failure. Theoretically, these couldn't get past a compiler, but in these days of scripting on browsers, this stuff does happen.

The category of Documentation applies to situations where the documentation says the system does X on condition Y, but the system does Z—a valid and correct action—instead.

The category of Standards applies to situations where the system fails to meet industry, governmental, or vendor standards or regulations, to follow coding or user interface standards, or to adhere to naming conventions, and so forth.

Other: The root cause is known but fits none of the preceding categories, which should be rare if this is a useful taxonomy.

The five housekeeping categories are as follows:

■ Duplicate: You find that two bug reports describe the same bug.
■ Not a Problem: The behavior noted is correct. The report arose from a misunderstanding on the part of the tester about correct behavior. This situation is different from a test failure because it occurs during test execution and is an issue of interpretation of an actual result.
■ Bad Unit: The bug is a real problem, but it arises from a random hardware failure that is unlikely in the production environment.
■ Root Cause Needed: Applies when the bug is confirmed as closed by test but no one in development has supplied a root cause.
■ Unknown: No one knows what is broken. Ideally, this applies to a small number of reports, generally when an intermittent bug doesn't appear for quite a while, leading to a conclusion that some other change fixed the bug as a side effect.

Notice that this taxonomy is focused on root cause, at least in the sense of what ultimately proved to be wrong in the code that caused the failure. You could also have a "process root cause" taxonomy that showed at which phase in the development process the bug was introduced. Notice that such a taxonomy would be useless to us for designing tests. Even the root-cause-focused taxonomy can be a bit hard to use for purposes of test design.

One concern that we always have when dealing with taxonomies: the categories that say other. Like a backyard swimming pool without a fence, these are trouble waiting to happen. We have seen cases where an analysis of a taxonomy showed more than 80 percent of the issues to be set to other.

For a taxonomy to work, the users must be educated to think through each issue thoroughly before assigning it to the other bucket. When Jamie wrote a test management tool, he would allow the other category to be chosen only when an associated field was filled with an explanation of why the issue did not fit other categories. If there is no training, education, and mentoring on the use and importance of categorizing issues correctly in a taxonomy, a taxonomy can degrade into a simple waste of time. A sketch of how such a guard might look in code follows the next table.

Table 4-43 shows an example of a bug taxonomy—a rather coarse-grained one—gathered from the Internet appliance case study we've looked at a few times in this book. This one is focused on symptoms. Notice that this makes it easier to think about the tests we might design. However, its coarse-grained nature means that we'll need additional information—lists of desired functional areas, usability and user interface standards and requirements, reliability specifications, and the like—to use this to design tests.

We've also shown the percentage of bugs we actually found in each category during system test.

Table 4–43

    Description                        Count    %
    Failed functionality               425      46%
    Missing functionality              179      19%
    Poor usability                     106      11%
    Build failed                       62       7%
    Bad system design/architecture     51       5%
    Reliability problem                49       5%
    Data loss                          18       2%
    Slow performance                   16       2%
    Code obsolete                      16       2%
    Deviation from specification       8        1%
    Bad user documentation             2        0%
    Total                              932      100%
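Here is a minimal sketch of the kind of guard on the other category just described. The enum values mirror the main categories of table 4-42; the function and its signature are our own invention, not taken from any real tool.

    #include <stdio.h>
    #include <string.h>

    /* Main categories from table 4-42. */
    enum RootCause {
        FUNCTIONAL, SYSTEM, PROCESS, DATA, CODE,
        DOCUMENTATION, STANDARDS, OTHER
    };

    /* Reject a classification of OTHER unless the tester explains why
       none of the specific categories fit. Returns 1 if accepted. */
    int classify_defect(enum RootCause cause, const char *explanation)
    {
        if (cause == OTHER &&
            (explanation == NULL || strlen(explanation) < 20)) {
            printf("Rejected: explain why no specific category fits.\n");
            return 0;
        }
        return 1;
    }

    int main(void)
    {
        classify_defect(OTHER, "");          /* rejected: no explanation */
        classify_defect(OTHER,
            "Fails only when the OS locale changes mid-run; "
            "no existing category covers environment churn.");  /* accepted */
        classify_defect(PROCESS, NULL);      /* accepted: specific category */
        return 0;
    }

The design point is simply that the tool, not a policy document, enforces the discipline; a 20-character minimum is an arbitrary stand-in for whatever review rule a team actually adopts.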
ISTQB Glossary

error guessing: A test design technique where the experience of the tester is used to anticipate what defects might be present in the component or system under test as a result of errors made and to design tests specifically to expose them.

4.4.2 Error Guessing

Error guessing is a term introduced by Glenford Myers.14 Conceptually, error guessing involves the tester making guesses about mistakes that a programmer might make and then developing tests for them.

Notice that this is what might be called a "gray-box" test since it requires the tester to have some idea about typical programming mistakes, how those mistakes become bugs, how those bugs manifest themselves as failures, and how we can force failures to happen.

Now, if error guessing follows an organized hierarchical taxonomy—like defect-taxonomy-based tests—then the taxonomy is the model. The taxonomy also provides the coverage criterion, if it is used, because we again test to the extent appropriate for the various elements of the taxonomy. Usually, though, error guessing follows mostly from tester inspiration.

As you can tell so far, the derivation of tests is based on the tester's intuition and knowledge about how errors (the programmer's or designer's mistakes) become defects that manifest themselves as failures if you poke and prod the system in the right way.

Now, error guessing tests can be part of an analytical, predesigned, scripted test procedure. However, they often aren't, but are added to the scripts (ideally) or used instead of scripts (less ideal) at execution time.

The underlying bug hypothesis is very similar to that of defect-taxonomy-based tests. Here, though, we not only count on the programmer to make the same mistakes, we also count on the tester to have seen bugs like the ones in this system before and to remember how to find them.

At this point, you're probably thinking back to our earlier discussion about quality risk analysis. Some of these concepts sound very familiar, don't they?

14. You can find this technique and others we have discussed in the first book on software testing, The Art of Software Testing by Glenford Myers.

Certainly, error guessing is something that we do during quality risk analysis. It's an important part of determining the likelihood. Perhaps we as test analysts don't do it, but rather we rely on the developers to do it for us. Or maybe we participate with them. It depends on our experience, which is why error guessing is an experience-based test technique.

Maybe a bug taxonomy is a form of risk analysis? It has elements of a quality risk analysis, in that it has a list of potential failures and frequency ratings. However, it doesn't always include impact.

Even if a bug taxonomy does include impact ratings for potential bugs, it's still not quite right. Remember that the way we described it, a quality risk analysis is organized around quality risk categories or quality characteristics. A bug taxonomy is organized around categories of failures or root causes rather than quality risk categories or quality characteristics.

4.4.3 Checklist Testing

With checklist testing, we now get into an area that is often very much like quality risk analysis in its structure. Conceptually, the tester takes a high-level list of items to be noted, checked, or remembered. What is important for the system to do? What is important for it not to do?

The checklist is the model for testing, and the checklist is usually organized around a theme. The theme can be quality characteristics, user interface standards, key operations, or any other theme you'd like to pick. To derive tests from a checklist, either during test execution time or during test design, you create tests to evaluate the system per the test objectives of the checklist.

Now, we can specify a coverage criterion, which is that there be at least one test per checklist item. However, the checklist items are often very high-level items. By very high-level, we don't mean a high-level test case. A high-level or logical test case gives rules about how to generate the specific inputs and expected results but doesn't necessarily give the exact values as a low-level or concrete test case would. The level of a test checklist is even higher than that. It will give you a list of areas to test, characteristics to evaluate, general rules for determining test pass or failure, and the like. You can develop and execute one set of tests, while another competent tester might cover the same checklist differently. Notice that this is not true for many of the specification-based techniques we covered, where there was one right answer.

The underlying bug hypothesis in checklist testing is that bugs in the areas of the checklist are either likely, important, or both. Notice again that much of this harkens back to risk-based testing. But note that, in some cases, the checklist is predetermined rather than developed by an analysis of the system. As such, checklist testing is often the dominant technique when the testing follows a methodical test strategy. When testing follows an analytical test strategy, the list of test conditions is not a static checklist but rather is generated at the beginning of the project and periodically refreshed during the project through some sort of analysis, such as quality risk analysis.

Table 4-44 is an example of a checklist, this one for reviewing code.15 We'll see more of these types of checklists in chapter 6 when we discuss reviews.

Table 4–44

    • Declarations
      – Are literal constants correct?
      – Are all variables always initialized, especially in changed code?
      – Are unsigned integers used when signed should be?
    • Data Items
      – Are all strings null-terminated—especially after manipulation?
      – Are buffer size checks always done when manipulating them?
      – Are bitfield manipulations portable to other architectures?
      – Is sizeof() called with a pointer rather than the object pointed to?

Under declarations, we ask the following questions:

■ Are all the literal constants correct? Are we using magic numbers that might change in the future? Are we using them in multiple places? Should we change those to declared constant values?
■ Have we ensured that every single variable has been initialized, especially when we are in maintenance mode and have added more variables or changed existing ones?
■ Are our data types correct? Are our variables big enough? Might there be a good reason to use signed over unsigned (or vice versa)?

The short example after this list shows the kinds of bugs these questions are written to catch.

15. This particular checklist comes from Brian Marick's book The Craft of Software Testing.
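To illustrate why the declaration questions earn their place on the checklist, here is a small, deliberately buggy example of our own containing exactly the kinds of problems they catch: a reused magic number, an uninitialized variable, and a signed/unsigned mix-up.

    #include <stdio.h>

    int main(void)
    {
        /* Magic number: 26 appears in two places; if the column count
           ever changes, one of them is easily missed. A named constant
           would avoid that. */
        int totals[26];
        for (int i = 0; i < 26; i++)
            totals[i] = 0;

        /* Uninitialized variable: 'sum' starts with garbage, so the loop
           computes garbage. Easy to introduce when changing existing code. */
        int sum;                       /* BUG: never initialized */
        for (int i = 0; i < 26; i++)
            sum += totals[i];
        printf("sum = %d (undefined behavior)\n", sum);

        /* Signed/unsigned mix-up: unsigned arithmetic wraps around, so
           the comparison below is TRUE even though count is zero. */
        unsigned int count = 0;
        if (count - 1 > 0)             /* BUG: 0u - 1 wraps to a huge value */
            printf("count - 1 > 0 evaluated TRUE with count == 0\n");

        return 0;
    }

A compiler with warnings enabled flags some of these, but by no means all, which is why they remain review-checklist staples.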
Under data items, we might ask these questions:

■ Are all strings handled correctly? If explicit buffers are used for them, are they big enough? If null-terminated, do we manipulate them directly such that the null value is misplaced or lost?
■ Is every buffer usage checked for overrun? The hackers and crackers will be checking your buffers. Did you?
■ When dealing with bitfields, is the code going to be portable? Does word size matter when manipulating the bits? Will they work in both big-endian and little-endian systems?
■ Have you used sizeof() correctly? If a pointer is passed in rather than a reference, sizeof() is going to return the size of the pointer.

Okay, so how do we use this for testing? Well, imagine going through the code, methodically, checking off each of these main areas. If you see problems, you report a bug against this heuristic for the module where you saw the problem. Notice the subjectivity involved in the evaluation of test pass or fail results. The sketch below shows two of these data-item pitfalls in action.
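As with the declarations checklist, a couple of the data-item questions are easiest to appreciate with a contrived example of our own. This one shows a lost null terminator and the classic sizeof()-on-a-pointer mistake.

    #include <stdio.h>
    #include <string.h>

    static void report(const char *buffer)
    {
        /* BUG: buffer is a pointer here, so sizeof(buffer) is the size of
           the pointer (typically 4 or 8), not the size of the array the
           caller declared. The checklist question targets exactly this. */
        printf("sizeof inside the function: %zu\n", sizeof(buffer));
    }

    int main(void)
    {
        char name[16];

        /* BUG: strncpy does NOT null-terminate when the source fills the
           buffer completely, so 'name' is left without a terminator and
           a later printf("%s", name) would read past the end of it. */
        strncpy(name, "ABCDEFGHIJKLMNOP", sizeof(name));

        /* The fix: always re-terminate after manipulation. */
        name[sizeof(name) - 1] = '\0';

        printf("sizeof at the declaration: %zu\n", sizeof(name)); /* 16  */
        report(name);                          /* 4 or 8, not 16         */
        return 0;
    }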
ISTQB Glossary

exploratory testing: An informal test design technique where the tester actively controls the design of the tests as those tests are performed and uses information gained while testing to design new and better tests.

4.4.4 Exploratory Testing

Exploratory testing, like checklist testing, often follows some basic guidelines, like a checklist, and often relies on tester judgment and experience to evaluate the test results. However, exploratory testing is inherently more reactive, more dynamic, in that most of the action with exploratory testing has to happen during test execution.

Conceptually, exploratory testing is happening when a tester is simultaneously learning the system, designing tests, and then executing those tests. The results of each test largely determine what we test next.

Now, that is not to say that this testing is random or driven entirely by impulse or instinct. For example, we could use a list of usability heuristics not only as a preplanned checklist, but also as a heuristic to guide us to important or problematic software areas. The best kinds of exploratory testing usually do have some type of model, either written or mental.

According to this model, we derive tests by thinking how best to explore some area of testing interest. In some cases, to keep focus and provide some amount of structure, the test areas to be covered are constrained by a test charter. As with checklist testing, we can specify a coverage criterion, which is that there be at least one test per charter (if we have them). However, the charters, like checklist items, are often very high level.

The underlying bug hypothesis is that the system will reveal buggy areas during test execution that would be hidden during test basis analysis. In other words, you can only learn so much about how a system behaves and misbehaves by reading about it. This is, of course, true. It also suggests an important point, which is that exploratory testing, and indeed experience-based tests in general, make a good blend with scripted tests because they offset each other's weak spots.

The focus of testing in the ISTQB Foundation and Advanced syllabi is primarily analytical; that is, following an analytical test strategy. Analytical strategies are good at defect prevention, risk mitigation, and structured coverage. Experience-based tests usually follow a dynamic or reactive test strategy.

One way that analytical and dynamic strategies differ is in the process of testing. The exploratory testing process, unlike the ISTQB Fundamental testing process, is very much focused on taking the system as we find it and going from there. The tester simultaneously learns about the product and its defects, plans the testing work to be done (or, if she has charters, adjusts the plan), designs and executes the tests, and reports the results.

Now, when doing exploratory testing, it's important not to degrade into frenzied, aimless keyboard-pounding. Good exploratory tests are planned, interactive, and creative. As we mentioned, the test strategy is dynamic, and one manifestation of this is that the tester dynamically adjusts test goals during execution.

ISTQB Glossary

test charter: A statement of test objectives, and possibly test ideas about how to test. Test charters are used in exploratory testing.

4.4.4.1 Test Charters

Because there is so much activity happening during exploratory test execution—activity that, in an analytical strategy, happens before execution—as you can imagine, some trade-offs must occur. One of those is in the extent of documentation of what was tested, how that related to the test basis, and what the results were.

The inability to say what was tested during exploratory testing, to any degree of accuracy, has long been seen as its Achilles' heel. The whole confidence-building objective of testing is undermined if we can't say what we've tested and how much we've tested it, as well as what we haven't yet tested. Notice that analytical test strategies do a good job of this, at least when traceability is present.

Another issue that comes up anytime we have multiple test analysts working concurrently on the same test object is redundancy; that is, to what extent multiple analysts are testing the exact same test items.

One way people have come up with to reduce these problems with exploratory testing is to use test charters. A test charter specifies the tasks, objectives, and deliverables, but in very few words. The charters can be developed well in advance—even, in fact, based on analysis of risks, as we have done for some clients—or they can be developed just immediately before test execution starts and then continually adjusted based on the test results.

Right before test execution starts for a given period of testing, exploratory testing sessions are planned around the charters. These plans are not formally written following IEEE 829 or anything like that. In fact, they might not be written at all. However, consensus should exist between test analyst and test manager about the following:

■ What the test session is to achieve
■ Where the test session will focus
■ What is in and out of scope for the session
■ What resources should be used, including how long it should last (often called a timebox)

Now, given how lightweight the charters are—as you'll see in a moment—you might expect that the tester would get more information. And, indeed, the charters can be augmented with defect taxonomies, checklists, quality risk analyses, requirements specifications, user manuals, and whatever else might be of use. However, these augmentations are to be used as reference during test execution rather than as objects of analysis—i.e., as a test basis—prior to the start of test execution.

    Exploratory Testing Session Log

    Tester name: ___________________    Date: ____________________
    Time on-task: __________________    Charter completed: _______

    Charter:
      Test the security of the login page. See if it is possible to
      log in without a password.

    Bugs reported:
      937 - Login form vulnerable to SQL injection.
      939 - System identifies a valid user name when the password is wrong.

    Follow-up issues:
      * Lockout feature on three unsuccessful login attempts does not
        seem to work.

Figure 4–48 Exploratory testing session log

In figure 4-48 we see an example of how this works. This replicates an actual exploratory testing session log from a project we did for a client recently. We were in charge of running an acceptance test for the development organization on behalf of the customers. If that sounds like a weird arrangement, by the way, yes, it was.

At the top of the log, we have captured the tester's name and when the test was run. It also includes a log of how much time was spent and whether the tester believes that the charter was completely explored—remember, this is subjective, as with the checklists.
Below the heading information you see the charter. Now, it's not unusual to have a test procedure called "test login security," or even a sentence like this in the description of a test procedure or listed as a condition in a test design specification. However, understand that this is all there is. There are no further details specified in the test for the tester. You see how we are relying very heavily on the tester's knowledge and experience?

Under the charter section are two sections that indicate results. The first lists the bugs that were found. The numbers correspond to numbers in a bug tracking system. The second lists issues that need follow-up. These can be—as they are here—things that might be bugs but that we didn't have time to finish researching. Or it could indicate situations where testing was incomplete due to blockages. In that case, of course, we'd expect that the charter would not be complete.

4.4.4.2 Exploratory Testing Exercise

Consider the use of exploratory testing on the HELLOCARMS project. The exercise consists of four parts:

1. Identify a list of test charters for testing an area.
2. Document your assumption about the testers who will use the charters.
3. Assign a priority to each charter and explain why.
4. Explain how you would report results.

As always, check your work on the preceding part before proceeding to the next part. The solutions are shown in the next section.

4.4.4.3 Exploratory Testing Exercise Debrief

We selected requirements element 010-010-170, which has to do with allowing applications over the Internet:

    Support the submission of applications via the Internet, which includes the capability of untrained users to properly enter applications.

We read that to mean we should use a mix of PC configurations, security settings, connection speeds, customer personas, and existing customer relationships to test applications over the Internet. Let's see what that might look like.

First, we would use pairwise techniques16 to generate a set of target PC configurations, including browser brand and version, operating system, and connection speed. We would build these configurations prior to test execution, storing drive images for a quick restore during testing.

Next, we would create a list of customer personas. A persona refers to the habits that a customer exhibits and the experience that a customer has:

■ Nervous customer: Uses Back button a lot, revises entries, has long think time
■ Novice Internet user: Makes a lot of data input mistakes, has long think time
■ Power user: Types quickly, makes few mistakes, uses copy-and-paste from other PC applications (e.g., account numbers), has very short think time
■ Impatient customer: Types quickly, makes many mistakes, has very short think time, hits Next button multiple times

Now we would create a list of existing customer banking relationship types:

■ Limited accounts, none with Globobank
■ Limited accounts, some with Globobank
■ Limited accounts, all with Globobank
■ Extensive accounts, none with Globobank
■ Extensive accounts, some with Globobank
■ Extensive accounts, all with Globobank

Notice that these two lists allow a lot of tester latitude and discretion. Notice also that for the existing customer banking relationships, as with the PC configurations, it would again make a lot of sense for the tester to create this customer data before test execution started.

We would create some semiformal rules to cover testing with the test charters, as shown in table 4-45.

16. These are covered in Advanced Software Testing: Vol. 1 and are required for the ISTQB Advanced Test Analyst certification, but not for the ISTQB Advanced Technical Test Analyst certification.

Table 4–45

    General rules for test charters:
    • For each of the following charters, restore your test PC to a previously untested PC configuration prior to starting the charter. Make sure each configuration is tested at least once.
    • For each of the following charters, select a persona. Make sure each persona is tested at least once.
    • For each of the following charters, select an existing customer banking relationship type. Make sure each customer banking relationship type is tested at least once.
    • Allocate 30–45 minutes for each application; thus, each charter is 30–120 minutes long.

We would then write our charters as shown in table 4-46.

Table 4–46

    Charters:
    1. Test successful applications with both limited and extensive banking relationships, where customer declines insurance.
    2. Test a successful application where customer accepts insurance.
    3. Test a successful application where the system declines insurance.
    4. Test a successful application where property value escalates application.
    5. Test a successful application where loan amount escalates application.
    6. Test an application that was unsuccessful due to credit history.
    7. Test an application that was unsuccessful due to insufficient income.
    8. Test an unsuccessful application due to excessive debt.
    9. Test an unsuccessful application due to insufficient equity.
    10. Test cancellation of an application from all possible screens (120 minutes).
    11. Test a fraudulent application, where material information provided by customer does not match decisioning mainframe's data.
    12. Test a fraudulent application, where material information provided by customer does not match LoDoPS data.

Yes, these charters might revisit some areas covered by our other, specification-based tests. However, because the testers will be taking side trips that wouldn't be in the scripts, the coverage of the scenarios will be broader.

Now, what is our assumption about the testers who will use these charters? Obviously, they have to be experienced testers, because the test specification provided is very limited and allows a lot of discretion. They also have to understand the application well, because we are not giving them any instructions on how to carry out the charters. They also need to understand PC technology at least well enough to restore a PC configuration from a drive image, though it would be possible to give unambiguous directions to someone on how to do that.

In terms of the priority of each charter, we have listed them in priority order. Notice that we start with the simplest case, a successful application with no insurance, and then add complexity from there. Our objective is to run the exploratory tests during the scripted tests, in parallel. As the test coverage under the scripts gets greater and greater, so too does the complexity of the exploratory scenarios grow.

As for results reporting, we would have each charter tracked as a test case in our test management system. For bug reports, though, it would be very important that the tester perform adequate isolation to determine whether the configuration, the persona, the banking relationship, or the functionality itself was behind the failure.

ISTQB Glossary

software attacks: A directed and focused attempt to evaluate the quality, especially reliability, of a test object by attempting to force specific failures to occur.

4.4.5 Software Attacks

Finally, we come to a technique that you can think of as an interesting synthesis of all the defect- and experience-based techniques we've covered in this section: software attacks. It combines elements of defect taxonomies, checklist-based tests, error guessing, and exploratory tests.

Conceptually, you can think of a software attack as a directed and focused form of testing that attempts to force specific failures to occur. As you can see, this is like a defect taxonomy in that way. However, it's more structured than a defect taxonomy because it is built on a fault model. The fault model talks about how bugs come to be and how and why bugs manifest themselves as failures. We'll look at this question of how bugs come to be in a second when we talk about the bug hypothesis.

This question of manifestation is very important. It's not enough to suspect or believe a bug is there. In dynamic testing, since we aren't looking at the code, we have to look at behaviors.
Now, here comes the similarity with checklists. Based on this fault model, James Whittaker and his students at Florida Tech (who originated this idea) developed a simple list of attacks that go after these faults. This hierarchical list of attacks—organized around the ways in which bugs come to be—provides ways in which we can force bugs to manifest themselves as failures.

To derive tests, you analyze how each specific attack might apply to the system you're testing. You then design specific tests for each applicable attack. The analysis, the design, and the assessment of adequate coverage are discretionary, depending on the skills, intuition, and experience of the tester. The technique provides ideas on when to apply the attack, what bugs make the attack successful, how to determine whether the attack has forced a bug into the open as a failure, and how to conduct the attack. However, two reasonable and experienced testers might apply the same attack differently against the same system and obtain different results.

So, where does the fault model say bugs come from? The underlying bug hypothesis is that bugs arise from interactions between the software and its environment during operation and from the capabilities the software possesses. The software's operating environment consists of the human user, the file system, the operating system, and other cohabitating and interoperating software in the same environment. The software's capabilities consist of accepting inputs, producing outputs, storing data, and performing computations.

Figure 4–49 Application in its operating environment
In figure 4-49, you see a picture of the application under test in its operating environment. The application receives inputs from and sends outputs to its user interface. It interacts in various direct ways with interoperating applications, e.g., through copy-and-paste from one application to another or by sending data to and retrieving data from an application that in turn manages that data with a database management system. The application can also interact indirectly, by sharing data in a file with another application. Yet other applications, with which it does not interact, can potentially affect the application—and vice versa—because they cohabit the same system, sharing memory, disk, CPU, and network resources.

The application sends data to and from the file system when it creates, updates, reads, and deletes files. It also relies on the operating system, both its libraries and the kernel, to provide various services and to intermediate interaction with the hardware.

So, how can we attack these interfaces? To start with the file system, the technique provides the following attacks:

■ Fill the file system to capacity. In fact, you can test while you fill and watch the bugs start to pop up.
■ Related to this is the attack of forcing storage to be busy or unavailable. This is particularly effective against devices that do one thing at a time, like a DVD writer.
■ You can damage the storage media, either temporarily or permanently. Soil or even scratch a CD.
■ Use invalid file names, especially file names with special characters.
■ Change a file's access permissions, especially while it's being used or between uses.
■ And, one of Rex's favorites—for reasons you'll find out in a moment—vary or corrupt file contents (see the sketch following these lists).

For interoperating and cohabiting software interfaces, along with the operating system interfaces, the suggestions are to try the following attacks:

■ Force all possible incoming errors from the software/OS interfaces to the application.
■ Exhaust resources such as memory, CPU, and network bandwidth.
■ Corrupt network flows and memory stores.
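As a simple illustration of a homegrown tool for the file-contents attack, here is a minimal sketch in Python. It is offered as an assumption of how such a tool might look, not as the program Rex describes later in this section, and the file name in the usage comment is a placeholder.

import os
import random

def corrupt_file(path, bits_to_flip=2):
    """Flip a few randomly chosen bits in a (non-empty) file to carry out
    the 'vary or corrupt file contents' attack. Run it on a copy of a data
    file, then open the copy with the application under test and watch how
    gracefully (or not) the application handles the damage."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(bits_to_flip):
            offset = random.randrange(size)  # pick a random byte position
            f.seek(offset)
            byte = f.read(1)[0]
            f.seek(offset)
            f.write(bytes([byte ^ (1 << random.randrange(8))]))  # flip one bit

# Placeholder usage; always attack a copy, never the original file:
# corrupt_file("presentation copy.pptx", bits_to_flip=1)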
All of these attacks can involve the use of tools, whether homegrown, freeware, or commercial.

You'll notice that much of the technique focuses on the file system. We think this is a shame, personally. We would like to see the technique extended to be more useful for system integration testing, particularly for test types like interoperability tests, end-to-end tests, data quality/data integrity tests with shared databases, and the like. As it is, the technique's interface attacks seem very much focused on PC-based, stand-alone applications.

Figure 4–50 Capabilities to attack

In figure 4-50, you see a picture of the application under test in terms of its capabilities. The application accepts inputs. The application performs computations. Data go into and come out of data storage (perhaps being persisted to the file system or a database manager in the process, which we've not shown here). The application produces outputs.

So, how can we attack these capabilities? In terms of inputs, we can do the following:

■ We can apply various inputs that will force all the error messages to occur. Notice that we'll need a list of error messages from somewhere: the user's guide, the requirements specification, the programmers, or wherever.
■ We can force the software to use, establish, or revert to default values.
■ We can explore the various allowed inputs. You'll notice that equivalence partitioning and boundary value analysis are techniques we can draw upon here.
■ We can overflow buffers by putting in really long inputs (see the sketch following these lists).
■ We can look for inputs that are supposed to interact and test various combinations of their values, perhaps using techniques like equivalence partitioning and decision tables (when we understand and can analyze the interactions) or pairwise testing and classification trees (when we do not understand and cannot analyze the interactions).17
■ We can repeat the same or similar inputs over and over again.

In terms of outputs, we can do the following:

■ We can try to force outputs to be different, including for the same inputs.
■ We can try to force invalid outputs to occur.
■ We can force properties of outputs to change.
■ We can force screen refreshes.

The problem here is that these attacks are mostly focused on what we can do, see, or make happen from the user interface. Now, that's fertile ground for bugs, but there are certainly other types of inputs, outputs, data, and computations we're interested in as testers. Some of that involves access to system interfaces—which is definitely the province of the technical test analyst.

17. Pairwise testing and classification trees are both covered in Rex's book Advanced Software Testing, Vol. 1.
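To make the input attacks concrete, here is a minimal sketch of the long-input (buffer overflow) attack in Python. The submit callable is a hypothetical stand-in for whatever mechanism delivers input to the field or interface under test; it is an assumption for illustration, not part of any real API.

def attack_long_inputs(submit, max_power=20):
    """Feed inputs of exponentially increasing length to a single input,
    watching for crashes, hangs, truncation, or garbled output. 'submit'
    is a hypothetical callable that delivers the string to the application
    under test and returns its response."""
    for power in range(1, max_power + 1):
        payload = "A" * (2 ** power)  # 2, 4, 8, ... characters
        try:
            result = submit(payload)
            print(f"length {len(payload)}: accepted ({result!r})")
        except Exception as error:
            # Any unhandled failure here is a candidate bug report.
            print(f"length {len(payload)}: FAILED with {error!r}")

# Placeholder usage against a hypothetical interface:
# attack_long_inputs(lambda s: my_app.set_customer_name(s))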
4.4.5.1 An Example of Effective Attacks

As you might guess, Rex has to use PowerPoint a lot. It's an occupational hazard. If this sounds like less than a ringing endorsement, it's because PowerPoint interoperates poorly with other Office applications and suffers from serious reliability and data integrity problems. While creating part of a presentation for a live course, he accidentally perpetrated two attacks on PowerPoint that caused him some serious grief.

The first attack occurred when he tried to use features in Word, Excel, and PowerPoint that share data. Specifically, he copied tables to and from PowerPoint, Word, and Excel. Word and Excel shared the tables well. PowerPoint, however, did not. In fact, he had to resort to things like copying and pasting from Notepad (stripping off formatting) and copying and pasting column by column.

The second attack occurred because he was copying formatted text from Word into PowerPoint, particularly putting that text into text boxes. Doing so uncovered another old nemesis: it can cause unrecoverable errors in PowerPoint files. All of a sudden, in the middle of your work, you get a message that says your file is corrupted. You have to exit. All data since the last save are lost. And, worse yet, sometimes the data that were saved are actually the data causing the problem!

To get a better sense of this, you could deliberately run the attack of varying or corrupting file contents. Rex has a program that allows him to randomly change as few as 1 or 2 bits in a file. Using it on a PowerPoint presentation will often render the file unreadable, though special recovery software (sold at an extra charge, naturally) can recover all of the text.

However, you don't need his special program to corrupt your PowerPoint file. Just assume that you can share data across Office and other PC applications and try to do exactly that. Do it long enough, and PowerPoint will clobber your file for you.

This situation, where PowerPoint creates its own unrecoverable error and file corruption, is particularly absurd. You get an error message that says, "PowerPoint has found an error from which it cannot recover," or something like that. Sorry, but how can that be? It is its own file! No one else is managing the file. No one else is writing the file. If an application does not understand its own file format and cannot get back to a known good state when it damages one of its own files, that application suffers from serious data quality issues.
4.4.5.2 Other Attacks

So far, we've been describing the specific attack technique referenced in the Advanced syllabus, which is Whittaker's technique. However, you should be aware that if you are testing something other than the reliability, interoperability, and functionality of stand-alone, typically PC-based applications, there are other types of attacks you can and should try.

For example, here is a list of basic security attacks:

■ Denial of service, which involves trying to tie up a server with so much traffic that it becomes unavailable.
■ Distributed denial of service, which is similar but uses a network of attacking systems to accomplish the denial of service.
■ Trying to find a back door—an unsecured port or access point into the system—or trying to install a back door that provides us with access (often accomplished through a Trojan horse like an electronic greeting card, an e-mail attachment, or a pornographic website).
■ Sniffing network traffic—watching it as it flows past a port on the system acting as a sniffer—to capture sensitive information like passwords.
■ Spoofing traffic: sending out IP packets that appear to come from somewhere else, impersonating another server, and the like.
■ Spoofing e-mail: sending out e-mail messages that appear to come from somewhere else, often as part of a phishing attack.
■ Replay attacks, where interactions between some user and a system, or between two systems, are captured and played back later, e.g., to gain access to an account.
■ TCP/IP hijacking, where an existing session between a client and a server is taken over, generally after the client has authenticated.
■ Weak key detection in encryption systems.
■ Password guessing, either by logic or by a brute-force attack using a dictionary of common or known passwords (see the sketch after this list).
■ A virus: an infected payload that is attached to or embedded in some file and, when run, replicates itself and perhaps damages the system that ran the payload.
■ A worm, which is similar to a virus but can penetrate the system under attack by itself—generally through some security lapse—and then replicate itself and perhaps cause damage.
■ War-dialing, which is finding an insecure modem connected to a system (rare now), and war-driving, which is finding unsecured wireless access points (amazingly common).
■ Finally, social engineering, which is an attack not on the system but on the person using it: an attempt to get the user to, say, change her password to something else, e-mail a file, and so forth.

Security testing, particularly what's called penetration testing, often follows an attack list like this. The interesting thing about such tests is that in security testing, as in some user acceptance testing, the tester attempts to emulate the "user"—but in this case the user is a hacker.
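As one concrete illustration, here is a minimal sketch of the dictionary-based password-guessing attack, again in Python. The try_login callable and the sample word list are hypothetical placeholders, and, as with any penetration test, such a test should be run only against systems you are authorized to attack.

def dictionary_attack(try_login, username, wordlist):
    """Attempt to authenticate with each password in a dictionary of common
    or known passwords. 'try_login' is a hypothetical callable returning
    True on success. A hit indicates a weak-password vulnerability; also
    note whether the system ever locks the account or throttles attempts."""
    attempts = 0
    for password in wordlist:
        attempts += 1
        if try_login(username, password):
            print(f"Cracked after {attempts} attempts: {password!r}")
            return password
    print(f"No match in {attempts} attempts; check lockout behavior too.")
    return None

# Placeholder usage with a tiny sample dictionary:
# dictionary_attack(my_system.login, "jsmith", ["password", "123456", "letmein"])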
4.4.5.3 Software Attack Exercise

Considering the user interface for the Telephone Banker:

1. Outline a set of attacks that could be made.
2. For each attack, describe how you might carry out that attack.
3. For each attack, list the type of defect you are trying to force.

4.4.5.4 Software Attack Exercise Debrief

We would contemplate the following attacks:

■ Trigger all error messages. We would get a list of the error messages defined by the developers, analyze each message, and try to trigger each one. The defect we are trying to isolate is where the developer tried